
Stats 2 GA

Uploaded by

divijaiwanth

Statistics for Data Science - 2

Practice Assignment 0.2.1 Solution


Events and probabilities

1. A customer will purchase a shirt with probability 0.5. The customer will purchase a
pant with probability 0.4 and will purchase both a shirt and a pant with probability 0.2.
What is the probability that the customer will purchase neither a shirt nor a pant?
Solution:
Let A be the event that the customer will purchase a shirt and B be the event that the
customer will purchase a pant.
Given that $P(A) = 0.5$ and $P(B) = 0.4$.
It is also given that the customer will purchase both a shirt and a pant with probability 0.2, i.e. $P(A \cap B) = 0.2$.
We have to find the probability that the customer will purchase neither a shirt nor a pant, i.e. $P(A^C \cap B^C)$.
We know that $P(A^C \cap B^C) = P((A \cup B)^C) = 1 - P(A \cup B)$
and $P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.5 + 0.4 - 0.2 = 0.7$
$\Rightarrow P(A^C \cap B^C) = 1 - P(A \cup B) = 1 - 0.7 = 0.3$

2. Suppose that we roll a pair of fair dice, so each of the 36 possible outcomes is equally
likely. Let A denote the event that the first die shows 5, B be the event such that the
sum of the outcomes of rolling the pair of dice is 10, and C be the event such that the
sum of the outcomes of rolling the pair of dice is 7. Then

a) Event A and event B are independent.


b) Event A and event B are not independent.
c) Event A and event C are independent.
d) Event A and event C are not independent.

Solution:
We roll a pair of fair dice and all 36 outcomes are equally likely, so each outcome has probability 1/36.
A is the event that the first die shows 5.
⇒ A = {(5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6)}
B is the event that the sum of the outcomes of rolling the pair of dice is 10.
⇒ B = {(4, 6), (5, 5), (6, 4)}
C is the event that the sum of the outcomes of rolling the pair of dice is 7.
⇒ C = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}
Also, A ∩ B = {(5, 5)} and A ∩ C = {(5, 2)}
Since each outcome is equally likely,

$P(A) = \frac{6}{36}$, $P(B) = \frac{3}{36}$, $P(C) = \frac{6}{36}$, $P(A \cap B) = \frac{1}{36}$ and $P(A \cap C) = \frac{1}{36}$.

Since $P(A \cap B) = \frac{1}{36} \neq \frac{1}{6} \times \frac{1}{12} = P(A)P(B)$, events A and B are not independent.

Also, $P(A \cap C) = \frac{1}{36} = \frac{1}{6} \times \frac{1}{6} = P(A)P(C)$, so events A and C are independent.
Hence, options (b) and (c) are correct.
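The independence checks above can also be confirmed by brute-force enumeration of the 36 outcomes. The following is a minimal sketch; the helper names `prob` and `independent` are illustrative and not part of the assignment.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (first die, second die).
OUTCOMES = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event given as a predicate on an outcome."""
    favourable = sum(1 for outcome in OUTCOMES if event(outcome))
    return Fraction(favourable, len(OUTCOMES))

def independent(event1, event2):
    """True when P(E1 and E2) equals P(E1) * P(E2)."""
    joint = prob(lambda o: event1(o) and event2(o))
    return joint == prob(event1) * prob(event2)

def first_is_5(o):
    return o[0] == 5

def sum_is_10(o):
    return o[0] + o[1] == 10

def sum_is_7(o):
    return o[0] + o[1] == 7

print(independent(first_is_5, sum_is_10))  # A and B
print(independent(first_is_5, sum_is_7))   # A and C
```

Using `Fraction` keeps the comparison exact, so no floating-point tolerance is needed.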

3. Let A and B be two independent events of a random experiment. Then, which of the
following is/are always true?

a) $P(A \cup B) = P(A)P(B) + P(B)$
b) $P(A \cup B) = P(A)P(B^C) + P(B)$
c) $P(A \cup B) = P(A) + P(B)$
d) $P((A \cap B) \mid A) = P(B)$

Solution:
Given that A and B are two independent events ⇒ P (A ∩ B) = P (A)P (B).

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
$= P(A) + P(B) - P(A)P(B)$
$= P(A)[1 - P(B)] + P(B)$
$= P(A)P(B^C) + P(B)$

Therefore, option (b) is correct.


Consider
$P((A \cap B) \mid A) = \frac{P((A \cap B) \cap A)}{P(A)} = \frac{P(A \cap B)}{P(A)} = \frac{P(A)P(B)}{P(A)} = P(B)$

This implies that option (d) is also correct.


Hence, option (b) and (d) are correct.

4. The probability that a student registered for IITM online degree program will pass the
qualifier exam is 0.6 independent of all other students. Find the probability that out of
10,000 registered students, 7,000 students will pass the qualifier exam.

a) $(0.6)^{3000} (0.4)^{7000}$
b) $(0.6)^{7000} (0.4)^{3000}$
c) ${}^{10000}C_{7000} (0.6)^{3000} (0.4)^{7000}$
d) ${}^{10000}C_{7000} (0.6)^{7000} (0.4)^{3000}$

Solution:
Probability(p) that the student registered for IITM online degree program will pass the
qualifier exam is 0.6.
We have to find the probability that out of 10,000 registered students, 7,000 students will
pass the qualifier exam and passing qualifier exam for any student will be independent
of the other.
Since each student passes independently with the same probability, the number X of students who pass the exam follows a binomial distribution with p = 0.6 and n = 10,000, and we need k = 7,000.
For the binomial distribution, $P(X = k) = {}^{n}C_{k}\, p^{k} (1 - p)^{n-k}$

$\Rightarrow P(X = 7000) = {}^{10000}C_{7000} (0.6)^{7000} (1 - 0.6)^{10000 - 7000}$

$\Rightarrow P(X = 7000) = {}^{10000}C_{7000} (0.6)^{7000} (0.4)^{3000}$

Hence, the probability that out of 10,000 registered students, 7,000 students will pass the qualifier exam is ${}^{10000}C_{7000} (0.6)^{7000} (0.4)^{3000}$, i.e. option (d).
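The closed form above is hard to evaluate directly, because factors such as $(0.6)^{7000}$ underflow to zero in double-precision arithmetic. One common workaround, sketched here under the assumption that only the standard library is available, is to work with logarithms; `binomial_log_pmf` is an illustrative helper name.

```python
import math

def binomial_log_pmf(n, k, p):
    """Natural log of nCk * p^k * (1 - p)^(n - k), computed in log space."""
    # lgamma(m + 1) equals log(m!), which avoids forming huge factorials.
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1 - p)

# Log-probability that exactly 7,000 of 10,000 students pass when p = 0.6.
log_prob = binomial_log_pmf(10_000, 7_000, 0.6)
print(log_prob)
```

The result is the natural logarithm of the required probability; exponentiating it is only meaningful when the value is within floating-point range.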

5. Assume that the probability of a defective computer component is 0.05. Components are randomly selected for testing (assume that the testing is 100% accurate). Find the probability that the first defect is observed when the sixth component is tested.

a) $(0.05)^{6} \times 0.95$
b) $(0.95)^{6} \times 0.05$
c) $(0.95)^{5} \times 0.05$
d) $(0.05)^{5} \times 0.95$

Solution:
We have to find the probability that the first defect is observed when the sixth component is tested.
The probability of a defective computer component is 0.05.
Treat getting a defective component as a success. We then need the probability that the first success occurs on the 6th trial, with p = 0.05.
So we can use the geometric distribution, with X representing the number of components tested up to and including the first defective one, and p = 0.05.
For the geometric distribution, $P(X = k) = (1 - p)^{k-1} p$.

$\Rightarrow P(X = 6) = (1 - 0.05)^{6-1} \times 0.05$

$\Rightarrow P(X = 6) = (0.95)^{5} \times 0.05$

Hence the probability that the first defect is observed when the sixth component is tested is $(0.95)^{5} \times 0.05$, i.e. option (c).

6. If Aarushi and Ansh play a game of chess, Aarushi wins with probability 0.5, Ansh wins with probability 0.4, and the game ends in a draw with probability 0.1, independent of all other games. They agree to play a match consisting of 5 games. Find the probability that Aarushi wins 4-1 (a win gives 1 point to the winner and a draw gives 0.5 points to both).
Enter your answer correct to 3 decimals accuracy.
Solution:
Let $A_i$ be the event that Aarushi wins the $i$th game and $B_j$ be the event that Ansh wins the $j$th game.
From the given information, $P(A_i) = 0.5$ and $P(B_j) = 0.4$, and each game is drawn with probability 0.1.
For a 4-1 score, Ansh must total exactly 1 point over the 5 games, either from one win or from two draws. So there are two disjoint ways that Aarushi wins 4-1.
i) Aarushi wins 4 games and Ansh wins one game.
The probability of this is ${}^{5}C_{4} (0.5)^{4} \times 0.4 = 0.125$
ii) Aarushi wins 3 games and 2 games are drawn.
The probability of this is ${}^{5}C_{3} (0.5)^{3} \times (0.1)^{2} = 0.0125$
So, the probability that Aarushi wins 4-1 is 0.125 + 0.0125 = 0.1375
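As a cross-check on the case analysis, one can enumerate all $3^5$ possible match results and add up the probabilities of those in which Aarushi scores 4 points. This is only a sketch; the labels "W", "L", "D" (from Aarushi's point of view) and the function name are illustrative.

```python
from itertools import product

# Per-game result from Aarushi's point of view: win, loss, draw.
RESULT_PROB = {"W": 0.5, "L": 0.4, "D": 0.1}
AARUSHI_POINTS = {"W": 1.0, "L": 0.0, "D": 0.5}

def prob_aarushi_scores(target, games=5):
    """Probability that Aarushi's total over `games` games equals `target`."""
    total = 0.0
    for match in product(RESULT_PROB, repeat=games):
        # Sums of 0, 0.5 and 1 are exact in binary floating point.
        if sum(AARUSHI_POINTS[r] for r in match) == target:
            p = 1.0
            for r in match:
                p *= RESULT_PROB[r]
            total += p
    return total

print(prob_aarushi_scores(4.0))
```

Enumeration scales poorly with the number of games, but for 5 games it visits only 243 sequences.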

7. The probability of someone catching flu in a particular winter when they have been
given the flu vaccine is 0.2. Without the vaccine, the probability of catching flu is 0.5. If
40% of the population has been given the vaccine, what is the probability that a person
chosen at random from the population will catch flu over that winter? Enter the answer
correct to 2 decimals accuracy.
Solution:
Let A be the event that the person will catch flu and V be the event that the person
has been given the vaccine.
Given that $P(A \mid V) = 0.2$, $P(A \mid V^C) = 0.5$ and $P(V) = 0.4$.
We have to find the probability that a person chosen at random from the population will catch flu over that winter, i.e. $P(A)$.
By the law of total probability, $P(A) = P(A \mid V)P(V) + P(A \mid V^C)P(V^C)$
$\Rightarrow P(A) = 0.2 \times 0.4 + 0.5 \times (1 - 0.4)$
$\Rightarrow P(A) = 0.38$

8. Suppose you are playing a game of cards with your friend. Your friend is supposed to give you 13 cards one by one. With a well-shuffled pack of 52 cards, what is the probability that you are dealt a perfect hand (13 of one suit)?
a) $\frac{13!}{52!}$
b) $\frac{12! \times 39!}{51!}$
c) $\frac{13! \times 39!}{51!}$
d) $\frac{13! \times 39!}{52!}$
Solution:
Your friend gives you 13 cards one by one. We need the probability that you are dealt a perfect hand, i.e. all 13 cards are of one suit.
The first card can be any of the 52 cards, so its probability is 1.
Once the first card is dealt, it fixes the suit. Of the remaining 51 cards, 12 are of that suit, so the probability that the second card is of the same suit is $\frac{12}{51}$.
Similarly, given that the first two cards are of the same suit, the probability for the third card is $\frac{11}{50}$.
Continuing like this, the probability that you are dealt a perfect hand is

$1 \times \frac{12}{51} \times \frac{11}{50} \times \frac{10}{49} \times \frac{9}{48} \times \frac{8}{47} \times \frac{7}{46} \times \frac{6}{45} \times \frac{5}{44} \times \frac{4}{43} \times \frac{3}{42} \times \frac{2}{41} \times \frac{1}{40}$
$= \frac{12! \times 39!}{51!}$

Hence, option (b) is correct.
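The sequential product and the factorial form can be compared exactly with rational arithmetic. The sketch below also checks the equivalent counting argument (4 perfect hands out of all 13-card hands), which is an added observation rather than part of the original solution.

```python
from fractions import Fraction
from math import comb, factorial

# Multiply the conditional probabilities for cards 2 through 13:
# after i same-suit cards, 13 - i of the remaining 52 - i cards match.
sequential = Fraction(1)
for i in range(1, 13):
    sequential *= Fraction(13 - i, 52 - i)

# Closed form from the solution.
closed_form = Fraction(factorial(12) * factorial(39), factorial(51))

# Counting view: 4 perfect hands among all C(52, 13) possible hands.
counting = Fraction(4, comb(52, 13))

print(sequential == closed_form == counting)
```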

9. A person has bought a bed from an online furniture store. The seller delivers the
disassembled bed parts along with some screws to assemble it. The probability of a
screw being defective is 0.1 independent of all other screws. To compensate for the
manufacturing error, the seller sends two extra screws in the package where the bed
needs exactly 8 screws to assemble. What is the probability that the buyer will be able
to assemble the bed? (Enter the answer correct to 4 decimal accuracy)
Solution:
The bed needs exactly 8 screws and the seller sends two extra, i.e. ten screws in total.
Let X be the number of non-defective screws among the ten screws sent.
Since each screw is defective independently of the others, X ∼ Binomial(10, p), where p is the probability of a screw being non-defective, i.e. p = 1 − 0.1 = 0.9.
The buyer will be able to assemble the bed if at least 8 of the ten screws are non-defective.
So, the probability that the buyer will be able to assemble the bed is $P(X \geq 8)$.

$P(X \geq 8) = P(X = 8) + P(X = 9) + P(X = 10)$
$= {}^{10}C_{8} (0.9)^{8} (0.1)^{2} + {}^{10}C_{9} (0.9)^{9} (0.1)^{1} + {}^{10}C_{10} (0.9)^{10} (0.1)^{0}$
$= (0.9)^{8} [45 \times (0.1)^{2} + 10 \times 0.9 \times 0.1 + 0.81]$
$= (0.9)^{8} \times 2.16$
$\approx 0.9298$
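The same tail probability can be computed numerically. A minimal sketch, with `binomial_tail` as an illustrative helper name:

```python
from math import comb

def binomial_tail(n, p, k_min):
    """P(X >= k_min) for X ~ Binomial(n, p), by summing the PMF."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))

# Ten screws, each non-defective with probability 0.9; at least 8 must be good.
print(round(binomial_tail(10, 0.9, 8), 4))
```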

10. In a pizza shop, 40% of the customers order medium size pizza, 50% order small size pizza, and 10% order large size pizza. Of those ordering medium size pizza, $\frac{2}{3}$ also ask to add extra toppings. Of those ordering small size pizza, $\frac{1}{5}$ also ask to add extra toppings, and of those ordering large size pizza, $\frac{4}{5}$ also ask to add extra toppings. Given that a customer asked to add extra toppings, find the conditional probability that the customer ordered a medium pizza.
a) $\frac{15}{67}$
b) $\frac{40}{67}$
c) $\frac{12}{67}$
d) $\frac{52}{67}$

Solution:
Let S, M and L denote the events that a customer orders a small, medium and large size pizza, respectively.
Given that $P(S) = 0.50$, $P(M) = 0.40$ and $P(L) = 0.10$.
Also, let T be the event that a customer asks to add extra toppings.
This implies that $P(T \mid S) = \frac{1}{5}$, $P(T \mid M) = \frac{2}{3}$ and $P(T \mid L) = \frac{4}{5}$.
We need to find $P(M \mid T)$.

$P(M \mid T) = \frac{P(M \cap T)}{P(T)}$
$= \frac{P(T \mid M)P(M)}{P(T \mid S)P(S) + P(T \mid M)P(M) + P(T \mid L)P(L)}$
$= \frac{\frac{2}{3} \times 0.40}{\frac{1}{5} \times 0.50 + \frac{2}{3} \times 0.40 + \frac{4}{5} \times 0.10}$
$= \frac{0.80}{3} \times \frac{15}{6.7}$
$= \frac{40}{67}$
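Bayes' rule with several hypotheses is easy to mechanise. The sketch below uses exact fractions; the function name `posterior` and the dictionary layout are assumptions made for illustration.

```python
from fractions import Fraction

def posterior(priors, likelihoods, target):
    """P(target | evidence) via Bayes' rule over a partition of hypotheses."""
    evidence = sum(priors[h] * likelihoods[h] for h in priors)
    return priors[target] * likelihoods[target] / evidence

# Priors: pizza size ordered. Likelihoods: P(extra toppings | size).
priors = {"S": Fraction(1, 2), "M": Fraction(2, 5), "L": Fraction(1, 10)}
likelihoods = {"S": Fraction(1, 5), "M": Fraction(2, 3), "L": Fraction(4, 5)}

print(posterior(priors, likelihoods, "M"))
```

The same helper applies to any problem in which the hypotheses are mutually exclusive and exhaustive.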

Statistics for Data Science - 2
Practice Assignment 0.2.2 Solution
Events and Probabilities

1. The probability that an electrical machine will work more than 5 years but less than
8 years is 0.6 and the probability that it will work at least 8 years is 0.1. What is the
probability that the machine will work for more than 5 years? [1 mark]
Solution:
Define events A and B as follows:
A = Event that the electrical machine will work more than 5 years.
B = Event that the electrical machine will work at least 8 years.
From the given information,

$P(A \setminus B) = 0.6$
$P(B) = 0.1$
Now,

$A = (A \setminus B) \cup (A \cap B)$
Note that $B \subseteq A$, so $A \cap B = B$
$\Rightarrow A = (A \setminus B) \cup B$
$\Rightarrow P(A) = P((A \setminus B) \cup B)$
$\Rightarrow P(A) = P(A \setminus B) + P(B)$ (since $A \setminus B$ and $B$ are disjoint events)
$\Rightarrow P(A) = 0.6 + 0.1 = 0.7$

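2. Five cards are drawn from a well-shuffled pack of playing cards with replacement. Find the probability that there will be at least two aces. [1 mark]
(a) $\left(\frac{1}{13}\right)^{5}$
(b) $\left(\frac{12}{13}\right)^{5}$
(c) $1 - \left(\frac{12}{13}\right)^{5} - 5\left(\frac{12^{4}}{13^{5}}\right)$
(d) $1 - \left(\frac{1}{13}\right)^{5} - 5\left(\frac{12}{13^{5}}\right)$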
Solution:
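Since the cards are drawn with replacement, the probability of drawing an ace is the same in every draw and equals $\frac{4}{52} = \frac{1}{13}$.

P(There will be at least two aces) = 1 − P(There will be no ace) − P(There will be one ace)
$= 1 - {}^{5}C_{0} \left(\frac{1}{13}\right)^{0} \left(\frac{12}{13}\right)^{5} - {}^{5}C_{1} \left(\frac{1}{13}\right)^{1} \left(\frac{12}{13}\right)^{4}$
$= 1 - \left(\frac{12}{13}\right)^{5} - 5\left(\frac{12^{4}}{13^{5}}\right)$

Hence, option (c) is correct.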

3. Choose the correct statements for any two non empty events A and B. [2 mark]

(a) P (A \ B) = P (A) − P (B)


(b) P (A \ B) = P (A) − P (A ∩ B)
(c) P (A \ B) = P (A ∪ B) − P (B)
(d) If B ⊂ A, then P (A \ B) = P (A) − P (B)
(e) If A and B are disjoint events, then P (A \ B) = P (A)

Solution:
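We know that $A \cap B \subseteq A$.
Since $A \cap B$ and $A \setminus (A \cap B)$ are disjoint and their union is A, we have

$P(A) = P(A \cap B) + P(A \setminus (A \cap B))$
$\Rightarrow P(A) = P(A \cap B) + P(A \setminus B)$ (since $A \setminus (A \cap B) = A \setminus B$)
$\Rightarrow P(A \setminus B) = P(A) - P(A \cap B)$ ....(1)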

Therefore, option (a) is not necessarily true while option (b) is correct.

From equation (1),

P (A \ B) = P (A) − P (A ∩ B)
= P (A) − [P (A) + P (B) − P (A ∪ B)] (By addition rule)
= P (A ∪ B) − P (B)

Therefore, option (c) is correct.

B ⊂ A ⇒ A ∩ B = B. ...(2)
From equation (1) and (2), we have
If B ⊂ A, then P (A \ B) = P (A) − P (B)
Therefore, option (d) is correct.

If A and B are disjoint events, then P (A ∩ B) = 0 .. (3)
From equation (1) and (3), we have
If A and B are disjoint events, then P (A \ B) = P (A)
Therefore, option (e) is correct.

4. Let A, B, and C be three events of a random experiment such that A ∪ B ∪ C = S, where S is the sample space. The probability that at least one of the events A or B will occur is $\frac{1}{2}$. What is the value of $P(C \setminus (A \cup B))$? [2 mark]
Solution:
Given that A, B, and C are three events of a random experiment such that

$A \cup B \cup C = S$ ...(1)

and

$P(A \cup B) = \frac{1}{2}$ ...(2)

We know (proved in the previous question) that $P(A \setminus B) = P(A \cup B) - P(B)$ for any two events A and B. Applying this with C in place of A and $A \cup B$ in place of B, we have

$P(C \setminus (A \cup B)) = P(A \cup B \cup C) - P(A \cup B)$
$= P(S) - P(A \cup B)$
$= 1 - \frac{1}{2} = \frac{1}{2}$

5. Two friends Ravi and Sonali are playing a game in which they are hitting a target in
rounds. In each round, both hit the target independent of each other with a probability
of 0.5. The first one who hits the target three times wins the game. What is the
probability that in the fifth round Sonali wins the game? [2 marks]

1. $6 \times (0.5)^{5}$
2. $30 \times (0.5)^{10}$
3. $96 \times (0.5)^{5}$
4. $96 \times (0.5)^{10}$

Solution:
Define events A and B as follows:
A = Ravi hits the target.
B = Sonali hits the target.
Given that
P(A) = P(B) = 0.5 ...(1)

Sonali will win in the fifth round if she hits the target for the third time in the fifth round and Ravi hits the target 0, 1 or 2 times in the five rounds.
For Sonali, this means exactly 2 hits in the first four rounds followed by a hit in the fifth round.
Probability that Sonali hits the target for the third time in the fifth round $= {}^{4}C_{2} (0.5)^{2} (0.5)^{2} \times 0.5$

$= 6 \times (0.5)^{5}$

Probability that Ravi hits the target 0, 1 or 2 times in five rounds $= ({}^{5}C_{0} + {}^{5}C_{1} + {}^{5}C_{2})(0.5)^{5}$

$= 16 \times (0.5)^{5}$

Since the two players hit independently of each other, the probability that Sonali wins in the fifth round $= 6 \times (0.5)^{5} \times 16 \times (0.5)^{5}$

$= 96 \times (0.5)^{10}$

6. A family has three children, each of whom is equally likely to be a boy or a girl independently of the others. Let A be the event that at most one child is a boy, B be the event that the family has at least one girl and one boy, and C be the event that all three children are of the same sex. Choose the correct options. [2 mark]

(a) A and B are independent events.


(b) A and C are independent events.
(c) B and C are independent events.
(d) B and C are disjoint events.

Solution:
Since the family has three children, each equally likely to be a boy or a girl independently of the others, the sample space of the genders of the three children (in order from eldest to youngest) is

S = {bbb, bbg, bgb, gbb, bgg, gbg, ggb, ggg}

where b and g stand for boy and girl, respectively.

Events A, B, and C are defined as:

A = At most one child is a boy
A = {bgg, gbg, ggb, ggg}

Since each outcome in S is equally likely, we have

$P(A) = \frac{\text{Number of outcomes in } A}{\text{Number of outcomes in } S} = \frac{4}{8} = \frac{1}{2}$ ...(1)

B = Family has at least one girl and one boy
B = {bbg, bgb, gbb, bgg, gbg, ggb}

Since each outcome in S is equally likely, we have

$P(B) = \frac{\text{Number of outcomes in } B}{\text{Number of outcomes in } S} = \frac{6}{8} = \frac{3}{4}$ ...(2)

C = All three children are of the same sex
C = {bbb, ggg}

Since each outcome in S is equally likely, we have

$P(C) = \frac{\text{Number of outcomes in } C}{\text{Number of outcomes in } S} = \frac{2}{8} = \frac{1}{4}$ ...(3)
Now,
A ∩ B = Event that the family has at least one boy and one girl and at most one child is a boy
⇒ A ∩ B = Event that the family has one boy and two girls
A ∩ B = {bgg, gbg, ggb}

Since each outcome in S is equally likely, we have

$P(A \cap B) = \frac{\text{Number of outcomes in } A \cap B}{\text{Number of outcomes in } S} = \frac{3}{8}$ ...(4)

A ∩ C = Event that at most one child is a boy and all three children are of the same sex.
⇒ A ∩ C = Event that all three children are girls
A ∩ C = {ggg}

Since each outcome in S is equally likely, we have

$P(A \cap C) = \frac{\text{Number of outcomes in } A \cap C}{\text{Number of outcomes in } S} = \frac{1}{8}$ ...(5)

B ∩ C = Event that the family has at least one boy and one girl and all three children are of the same sex.
⇒ B ∩ C = Empty event

$P(B \cap C) = 0$ ...(6)
⇒ B and C are disjoint events.
Also, $P(B) \cdot P(C) = \frac{3}{4} \cdot \frac{1}{4} = \frac{3}{16} \neq 0 = P(B \cap C)$, so B and C are not independent.

Option (c) is wrong and (d) is right.

From equations (1), (2) and (4), we have

$P(A \cap B) = \frac{3}{8} = \frac{1}{2} \cdot \frac{3}{4} = P(A) \cdot P(B)$
⇒ A and B are independent events

Option (a) is right.

From equations (1), (3) and (5), we have

$P(A \cap C) = \frac{1}{8} = \frac{1}{2} \cdot \frac{1}{4} = P(A) \cdot P(C)$
⇒ A and C are independent events

Option (b) is right.

7. In a town, 60% of the residents are eligible for voting in an election, but only 80% of the eligible residents voted in the election. A person is randomly selected from the town. What is the conditional probability that the person is eligible for voting given that he or she did not vote? [2 mark]
1. $\frac{2}{13}$
2. $\frac{3}{13}$
3. $\frac{4}{13}$
4. $\frac{6}{13}$
Solution:
Define events A and B as follows:
A = randomly selected person is eligible for voting.
B = randomly selected person has voted.
Given that
$P(A) = 0.6$
$P(B \mid A) = 0.8$
$\Rightarrow P(B^C \mid A) = 0.2$
Note that a person who is not eligible cannot vote, so
$P(B^C \mid A^C) = 1$
To find: $P(A \mid B^C)$

$P(A \mid B^C) = \frac{P(B^C \mid A) \cdot P(A)}{P(B^C \mid A) \cdot P(A) + P(B^C \mid A^C) \cdot P(A^C)}$
$= \frac{(0.2)(0.6)}{(0.2)(0.6) + (1)(0.4)}$
$= \frac{0.12}{0.52} = \frac{3}{13}$

8. Urn A contains 3 red and 2 blue marbles while urn B contains 2 red and 8 blue marbles.
A fair coin is tossed. If the coin turns up head, a marble is chosen from urn A. If it
turns up tail, a marble is chosen from urn B. Suppose Shreya who tosses the coin gets
a red color marble. What is the conditional probability that the marble is drawn from
the urn A? (Answer the question correctly up to two decimal points.) [2 marks]

Solution:
Define the events as follows:
H = Coin turns up head.
T = Coin turns up tail.
R = Red marble is drawn.
B = Blue marble is drawn.

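From the given information, we have

$P(R \mid H) = \frac{3}{5}$
$P(B \mid H) = \frac{2}{5}$
$P(R \mid T) = \frac{2}{10} = \frac{1}{5}$
$P(B \mid T) = \frac{8}{10} = \frac{4}{5}$

The marble is drawn from urn A exactly when the coin turns up head, so we need $P(H \mid R)$. Since the coin is fair, $P(H) = P(T) = \frac{1}{2}$.

$P(H \mid R) = \frac{P(R \mid H) \cdot P(H)}{P(R \mid H) \cdot P(H) + P(R \mid T) \cdot P(T)}$
$= \frac{\frac{3}{5} \cdot \frac{1}{2}}{\frac{3}{5} \cdot \frac{1}{2} + \frac{1}{5} \cdot \frac{1}{2}}$
$= \frac{3}{4} = 0.75$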

9. Three different tasks were assigned to three persons A, B, and C. Previous records show that A, B, and C will complete their tasks independent of each other with probabilities of $\frac{1}{2}$, $\frac{2}{3}$, and $\frac{3}{4}$, respectively. If it is known that exactly two of them have completed their tasks, then what is the conditional probability that A has not completed his task? [3 marks]
(a) $\frac{3}{4}$
(b) $\frac{3}{11}$
(c) $\frac{6}{11}$
(d) $\frac{9}{11}$

Solution:
Define events A, B, and C as follows:
A = A has completed his task.
B = B has completed his task.
C = C has completed his task.
Given that
$P(A) = \frac{1}{2}$, $P(B) = \frac{2}{3}$, $P(C) = \frac{3}{4}$
Let D be the event that exactly two of them have completed their tasks. Then

$D = (A \cap B \cap C^C) \cup (A \cap B^C \cap C) \cup (A^C \cap B \cap C)$
$P(D) = P((A \cap B \cap C^C) \cup (A \cap B^C \cap C) \cup (A^C \cap B \cap C))$
Since $A \cap B \cap C^C$, $A \cap B^C \cap C$, and $A^C \cap B \cap C$ are disjoint events, we have
$P(D) = P(A \cap B \cap C^C) + P(A \cap B^C \cap C) + P(A^C \cap B \cap C)$
Since A, B, and C are independent events, we have
$P(D) = P(A)P(B)P(C^C) + P(A)P(B^C)P(C) + P(A^C)P(B)P(C)$

Now,

$P(A^C \mid D) = \frac{P(A^C \cap D)}{P(D)}$
$= \frac{P(A^C \cap B \cap C)}{P(A)P(B)P(C^C) + P(A)P(B^C)P(C) + P(A^C)P(B)P(C)}$
$= \frac{P(A^C)P(B)P(C)}{P(A)P(B)P(C^C) + P(A)P(B^C)P(C) + P(A^C)P(B)P(C)}$
$= \frac{\frac{1}{2} \cdot \frac{2}{3} \cdot \frac{3}{4}}{\frac{1}{2} \cdot \frac{2}{3} \cdot \frac{1}{4} + \frac{1}{2} \cdot \frac{1}{3} \cdot \frac{3}{4} + \frac{1}{2} \cdot \frac{2}{3} \cdot \frac{3}{4}}$
$= \frac{6}{11}$
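The conditional probability can be verified by listing all eight completion patterns and weighting each by its probability. This is a sketch with illustrative variable names.

```python
from fractions import Fraction
from itertools import product

# Probability that each person completes the assigned task.
P_DONE = {"A": Fraction(1, 2), "B": Fraction(2, 3), "C": Fraction(3, 4)}

def pattern_prob(done):
    """Probability of one completion pattern, using independence."""
    prob = Fraction(1)
    for person, finished in done.items():
        prob *= P_DONE[person] if finished else 1 - P_DONE[person]
    return prob

p_exactly_two = Fraction(0)          # P(D)
p_not_a_and_exactly_two = Fraction(0)  # P(A^C and D)
for flags in product([True, False], repeat=3):
    done = dict(zip("ABC", flags))
    if sum(flags) == 2:
        p = pattern_prob(done)
        p_exactly_two += p
        if not done["A"]:
            p_not_a_and_exactly_two += p

print(p_not_a_and_exactly_two / p_exactly_two)
```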

10. There are twenty boxes, of which exactly fifteen contain gifts and five are empty. Five boxes are removed randomly. Now a person selects one box from the remaining boxes. What is the probability that the person selects an empty box? [3 marks]
(Hint: Consider all the cases of removing empty boxes and apply the law of total probability.)
(a) $\frac{1}{4}$
(b) $\frac{2}{3}$
(c) $\frac{3}{4}$
(d) $\frac{1}{3}$

Solution:
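Define the events A, B, C, D, E, and F as follows:
A = Removed boxes contain no empty box.
B = Removed boxes contain one empty box.
C = Removed boxes contain two empty boxes.
D = Removed boxes contain three empty boxes.
E = Removed boxes contain four empty boxes.
F = Removed boxes contain five empty boxes.
Let X be the event that the person selects an empty box.
After five boxes are removed, 15 boxes remain. If k of the removed boxes are empty, then 5 − k of the remaining 15 boxes are empty.

$P(X) = P(A)P(X \mid A) + P(B)P(X \mid B) + P(C)P(X \mid C) + P(D)P(X \mid D) + P(E)P(X \mid E) + P(F)P(X \mid F)$
$= \frac{{}^{15}C_{5}\,{}^{5}C_{0}}{{}^{20}C_{5}} \cdot \frac{5}{15} + \frac{{}^{15}C_{4}\,{}^{5}C_{1}}{{}^{20}C_{5}} \cdot \frac{4}{15} + \frac{{}^{15}C_{3}\,{}^{5}C_{2}}{{}^{20}C_{5}} \cdot \frac{3}{15} + \frac{{}^{15}C_{2}\,{}^{5}C_{3}}{{}^{20}C_{5}} \cdot \frac{2}{15} + \frac{{}^{15}C_{1}\,{}^{5}C_{4}}{{}^{20}C_{5}} \cdot \frac{1}{15} + \frac{{}^{15}C_{0}\,{}^{5}C_{5}}{{}^{20}C_{5}} \cdot \frac{0}{15}$
$= \frac{1}{15504 \times 15} (15015 + 27300 + 13650 + 2100 + 75)$
$= \frac{1}{4}$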
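The law-of-total-probability sum can be reproduced exactly with a short loop over the number of empty boxes removed. The function name and its default arguments are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def prob_pick_empty(gift=15, empty=5, removed=5):
    """P(selected box is empty) after `removed` boxes are taken away at random."""
    total = gift + empty
    remaining = total - removed
    answer = Fraction(0)
    for k in range(0, min(empty, removed) + 1):
        # Hypergeometric probability that exactly k removed boxes are empty.
        p_k = Fraction(comb(empty, k) * comb(gift, removed - k), comb(total, removed))
        # Given k empties removed, empty - k of the remaining boxes are empty.
        answer += p_k * Fraction(empty - k, remaining)
    return answer

print(prob_pick_empty())
```

The result equals the original proportion of empty boxes, 5/20, which is what a symmetry argument would also suggest.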

Statistics for Data Science - 2
Practice Assignment 0.3.1 Solution
Discrete random variable

1. A random variable X is defined as the length of the hypotenuse of the right-angled triangle whose other two sides are determined by the roll of two 6-sided dice. How many values does X take? [1 mark]
Solution:
When two dice are rolled, there are a total of 36 outcomes.
The outcomes are:
{(1, 1), (1, 2), ..., (1, 6),
(2, 1), (2, 2), ..., (2, 6),
...
...
(6, 1), (6, 2), ..., (6, 6)}

For an outcome (a, b), $X = \sqrt{a^2 + b^2}$. Outcomes such as (1, 2) and (2, 1) give the same length of the hypotenuse, so only the 6 + 15 = 21 unordered pairs matter. These 21 pairs all give different values of $a^2 + b^2$, hence a total of 21 values are possible for the random variable X.
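The count relies on the 21 unordered pairs giving distinct hypotenuse lengths, which is quick to confirm by collecting the squared lengths in a set. A minimal sketch:

```python
from itertools import product

# Squared hypotenuse a^2 + b^2 for every ordered roll (a, b);
# the set keeps each distinct value once.
squared_lengths = {a * a + b * b for a, b in product(range(1, 7), repeat=2)}

num_values = len(squared_lengths)
print(num_values)
```

Comparing squared lengths avoids any floating-point issues from taking square roots.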

2. Two cards are drawn from a well-shuffled pack of 52 cards one after the other without replacement. A random variable is defined as:

$X = \begin{cases} 0 & \text{if both cards are of the same color} \\ 1 & \text{if the cards are of different colors} \end{cases}$

Find the probability mass function of X. [1 mark]

(a)
x        0       1
f_X(x)   1/13    12/13

(b)
x        0       1
f_X(x)   1/2     1/2

(c)
x        0       1
f_X(x)   25/51   26/51

(d)
x        0       1
f_X(x)   12/25   13/25

Solution:
P(X = 0) = P(Both cards are of the same color)
= P(First card is any one of the 52 cards) · P(2nd card is of the same color as the 1st card)
$= 1 \cdot \frac{25}{51}$ (25 of the remaining 51 cards have the same color as the first card)

P(X = 1) = P(The cards are of different colors)
= P(First card is any one of the 52 cards) · P(2nd card is of a different color from the 1st card)
$= 1 \cdot \frac{26}{51}$

Hence, option (c) is right.

3. In a group of fifteen people, 8 people have blood group type O, 4 people have blood group
type A, and 3 people have blood group type B. If five people are selected randomly from
these fifteen people, then what is the probability that out of these five people 2 people
have blood group type O, 2 have blood group type A and one has blood group type B?
(Answer the question correct up to two decimal places.) [2 mark]

Solution:
Number of ways of selecting five people out of 15 $= {}^{15}C_{5}$
Number of ways of selecting 2 people with blood group type O out of the 8 people with type O $= {}^{8}C_{2}$
Number of ways of selecting 2 people with blood group type A out of the 4 people with type A $= {}^{4}C_{2}$
Number of ways of selecting 1 person with blood group type B out of the 3 people with type B $= {}^{3}C_{1}$
Therefore, required probability $= \frac{{}^{8}C_{2}\,{}^{4}C_{2}\,{}^{3}C_{1}}{{}^{15}C_{5}} = \frac{28 \times 6 \times 3}{3003} \approx 0.168$, i.e. 0.17 correct to two decimal places.
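This is a multivariate hypergeometric probability. A small sketch of the counting formula, with `multivariate_hypergeom` as an illustrative helper name:

```python
from fractions import Fraction
from math import comb

def multivariate_hypergeom(group_sizes, picks):
    """P(drawing picks[i] members from group i) when sampling without replacement."""
    favourable = 1
    for size, pick in zip(group_sizes, picks):
        favourable *= comb(size, pick)
    return Fraction(favourable, comb(sum(group_sizes), sum(picks)))

# Groups O, A, B of sizes 8, 4, 3; we want 2, 2 and 1 people respectively.
p = multivariate_hypergeom([8, 4, 3], [2, 2, 1])
print(p, round(float(p), 3))
```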

4. Probability mass function of a discrete random variable X is given as:

x        -2    -1    0    1     2
f_X(x)   a     0.2   b    0.1   0.2

Table: PMF of X

If $P(X \leq 1 \mid X \geq -1) = \frac{3}{4}$, then find the value of $P(X = -2)$. [2 marks]
Solution:

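We know that

$\sum_{x \in T_X} f_X(x) = 1$

$\Rightarrow a + 0.2 + b + 0.1 + 0.2 = 1$

$\Rightarrow a + b = 0.5$ ...(1)

From the given condition, we have

$P(X \leq 1 \mid X \geq -1) = \frac{3}{4}$
$\Rightarrow \frac{P(X \leq 1, X \geq -1)}{P(X \geq -1)} = \frac{3}{4}$
$\Rightarrow \frac{P(X \in \{-1, 0, 1\})}{P(X \in \{-1, 0, 1, 2\})} = \frac{3}{4}$
$\Rightarrow \frac{b + 0.3}{b + 0.5} = \frac{3}{4}$
$\Rightarrow 4b + 1.2 = 3b + 1.5$
$\Rightarrow b = 0.3$ ...(2)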

From equations (1) and (2), we have

a = 0.2
b = 0.3

P (X = −2) = a
⇒P (X = −2) = 0.2

5. Siberian seagulls migrate to Ganga river to escape harsh winter weather in the months
of October to March. It is seen that the number of Siberian seagulls reaching Ganga
river on one day in January is Poisson distributed with an average of 1000. What is the
probability that 650 seagulls will arrive on a given day of January? [2 marks]
(a) $\frac{e^{-650} (650)^{1000}}{650!}$
(b) $\frac{e^{-650} (650)^{1000}}{1000!}$
(c) $\frac{e^{-1000} (650)^{1000}}{650!}$
(d) $\frac{e^{-1000} (1000)^{650}}{650!}$
Solution:
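Let X be the number of Siberian seagulls reaching the Ganga river on a given day in January.
By the given condition, we have

$X \sim \text{Poisson}(1000)$

$P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}$
$\Rightarrow P(X = 650) = \frac{e^{-1000} (1000)^{650}}{650!}$

Hence, option (d) is correct.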

6. Probability mass function of a discrete random variable X is given as:

x -1 0 1 2 3
fX (x) 0.1 0.3 0.2 0.1 0.3

Table: PMF of X

If another random variable Y is defined as $Y = X(X - 1)$, then find the smallest value of y in the range of Y such that $P(Y \leq y) > \frac{1}{2}$ and $P(Y \geq y) \leq \frac{1}{2}$. [2 marks]
Solution:
Y is defined as Y = X(X − 1)

At X = −1, Y = −1(−2) = 2
At X = 0, Y = 0(−1) = 0
At X = 1, Y = 1(0) = 0
At X = 2, Y = 2(1) = 2
At X = 3, Y = 3(2) = 6

Therefore, TY = {0, 2, 6}

P (Y = 0) = P (X ∈ {0, 1}) = 0.3 + 0.2 = 0.5


P (Y = 2) = P (X ∈ {−1, 2}) = 0.1 + 0.1 = 0.2
P (Y = 6) = P (X = 3) = 0.3
Now,
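$P(Y \leq 0) = P(Y = 0) = 0.5$

The first required condition, $P(Y \leq y) > \frac{1}{2}$, is not satisfied at y = 0.

$P(Y \leq 2) = P(Y = 0) + P(Y = 2) = 0.5 + 0.2 = 0.7$
$P(Y \geq 2) = P(Y = 2) + P(Y = 6) = 0.2 + 0.3 = 0.5$

Both the required conditions are satisfied at y = 2, so the smallest such value is y = 2.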
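The two steps used here, pushing the PMF of X through the function and then scanning the range of Y, can be sketched as follows. The helper name `pushforward` is an illustrative choice.

```python
from collections import defaultdict
from fractions import Fraction

# PMF of X from the table, as exact fractions.
pmf_x = {-1: Fraction(1, 10), 0: Fraction(3, 10), 1: Fraction(2, 10),
         2: Fraction(1, 10), 3: Fraction(3, 10)}

def pushforward(pmf, g):
    """PMF of g(X): add up the probabilities of all x mapping to the same value."""
    out = defaultdict(Fraction)
    for x, p in pmf.items():
        out[g(x)] += p
    return dict(out)

pmf_y = pushforward(pmf_x, lambda x: x * (x - 1))

half = Fraction(1, 2)
candidates = [
    y for y in sorted(pmf_y)
    if sum(p for v, p in pmf_y.items() if v <= y) > half
    and sum(p for v, p in pmf_y.items() if v >= y) <= half
]
smallest = candidates[0]
print(pmf_y, smallest)
```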

7. Three friends toss three fair coins to decide who is going to pay for the dinner. The person getting an outcome different from the other two outcomes will pay for the dinner. If all three coins result in the same outcome, they will toss the coins again. If X denotes the number of trials needed to decide who is going to pay, then what is the probability that X is at most 3? (Answer the question correct up to two decimal places.) [2 marks]
Solution:
Let X be the number of trials needed to decide who is going to pay.
The sample space on tossing three coins is:
{HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}
P(They will decide who is going to pay) = P({HHT, HTH, THH, HTT, THT, TTH}) $= \frac{6}{8} = \frac{3}{4}$
P(They will not decide who is going to pay) = P({HHH, TTT}) $= \frac{2}{8} = \frac{1}{4}$
X takes values 1, 2, 3, 4, ... and $X \sim \text{Geometric}\left(\frac{3}{4}\right)$

$P(X \leq 3) = P(X = 1) + P(X = 2) + P(X = 3)$
$= \frac{3}{4} + \frac{1}{4} \cdot \frac{3}{4} + \left(\frac{1}{4}\right)^{2} \cdot \frac{3}{4}$
$= \frac{63}{64} \approx 0.98$
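The geometric sum can be checked against the shortcut $P(X \leq k) = 1 - (1 - p)^{k}$, which is a standard identity not stated in the solution above. A small sketch with exact fractions:

```python
from fractions import Fraction

def geometric_cdf(p, k):
    """P(X <= k) for X ~ Geometric(p) on {1, 2, ...}, by summing the PMF."""
    return sum((1 - p) ** (i - 1) * p for i in range(1, k + 1))

p = Fraction(3, 4)
cdf_3 = geometric_cdf(p, 3)
print(cdf_3, float(cdf_3))
```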

8. Let X ∼ Uniform({1, 2, 3, ..., n}). If the probability that X is an odd number is $\frac{6}{11}$, then what can be the value of n? [2 marks]

(a) 11 only
(b) 12 only
(c) Any multiple of 11.
(d) Any odd multiple of 11.

Solution:
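Since X ∼ Uniform({1, 2, 3, ..., n}), every value in S = {1, 2, 3, ..., n} is equally likely.
Let A be the event that X takes an odd value.
Therefore,
$P(A) = \frac{\text{number of outcomes in } A}{\text{number of outcomes in } S}$ ...(1)
It is given that
$P(A) = \frac{6}{11}$ ...(2)
If n is even, exactly half of the values are odd, so $P(A) = \frac{1}{2} \neq \frac{6}{11}$.
If n is odd, there are $\frac{n+1}{2}$ odd values, so by equations (1) and (2), $\frac{n+1}{2n} = \frac{6}{11} \Rightarrow 11n + 11 = 12n \Rightarrow n = 11$.
This is possible only for n = 11, so option (a) is correct.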
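A brute-force search over a range of n illustrates the conclusion. The upper bound of 1000 is an arbitrary choice for the sketch and does not by itself prove uniqueness for all n.

```python
from fractions import Fraction

def prob_odd(n):
    """P(X is odd) for X uniform on {1, ..., n}; there are (n + 1) // 2 odd values."""
    return Fraction((n + 1) // 2, n)

solutions = [n for n in range(1, 1001) if prob_odd(n) == Fraction(6, 11)]
print(solutions)
```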

9. The number of customers arriving per day at a certain automobile service facility is assumed to follow a Poisson distribution with an average of 50 customers arriving each day. Assume that the numbers of customers on different days are independent. What is the probability that exactly 40 customers will come on at least 5 days over a 30-day period? [3 marks]
(a) $1 - \sum_{x=0}^{4} {}^{30}C_{x} \left(\frac{e^{-50}(50)^{40}}{40!}\right)^{x} \left(1 - \frac{e^{-50}(50)^{40}}{40!}\right)^{30-x}$
(b) $\sum_{x=0}^{4} {}^{30}C_{x} \left(\frac{e^{-50}(50)^{40}}{40!}\right)^{x} \left(1 - \frac{e^{-50}(50)^{40}}{40!}\right)^{30-x}$
(c) ${}^{30}C_{5} \left(\frac{e^{-50}(50)^{40}}{40!}\right)^{5} \left(1 - \frac{e^{-50}(50)^{40}}{40!}\right)^{25}$
(d) ${}^{30}C_{5} \left(1 - \frac{e^{-50}(50)^{40}}{40!}\right)^{5} \left(\frac{e^{-50}(50)^{40}}{40!}\right)^{25}$

Solution:
Let X be the number of customers arriving per day at the automobile service facility.
$X \sim \text{Poisson}(50)$
$P(X = 40) = \frac{e^{-50} 50^{40}}{40!}$
Let Y be the number of days in the 30-day period on which exactly 40 customers arrive.
Since the days are independent, $Y \sim \text{Binomial}\left(30, \frac{e^{-50} 50^{40}}{40!}\right)$
Now,

$P(Y \geq 5) = 1 - P(Y < 5)$
$= 1 - \sum_{x=0}^{4} {}^{30}C_{x} \left(\frac{e^{-50}(50)^{40}}{40!}\right)^{x} \left(1 - \frac{e^{-50}(50)^{40}}{40!}\right)^{30-x}$

Hence, option (a) is correct.
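The two-level structure (a Poisson probability used as the success probability of a binomial) can be evaluated numerically. The sketch below uses only the standard library; the helper names are illustrative.

```python
from math import comb, exp, factorial

def poisson_pmf(lam, x):
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**x / factorial(x)

def binomial_pmf(n, k, p):
    """P(Y = k) for Y ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Success probability for one day: exactly 40 customers arrive.
p_day = poisson_pmf(50, 40)

# Probability that this happens on at least 5 of the 30 days.
p_at_least_5 = 1 - sum(binomial_pmf(30, x, p_day) for x in range(5))
print(p_day, p_at_least_5)
```

For much larger means the Poisson term would need log-space evaluation, but at this scale direct floating-point evaluation stays within range.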

10. A biased coin with the probability of 0.4 of showing head is tossed until it shows either
two consecutive heads or two consecutive tails. If X denotes the number of tosses
required, what is the value of P (X = 5)? [3 marks]

(a) 0.03456
(b) 0.02304
(c) 0.01675
(d) 0.0576

Solution:
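The tossing stops at the first pair of consecutive equal outcomes. For X = 5, the first four tosses must alternate and the fifth toss must repeat the fourth, which leaves only the sequences HTHTT and THTHH. With P(H) = 0.4 and P(T) = 0.6,

$P(X = 5) = P(\text{HTHTT}) + P(\text{THTHH})$
$= (0.4)^{2} (0.6)^{3} + (0.4)^{3} (0.6)^{2}$
$= 0.0576$

Hence, option (d) is correct.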
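The claim that only two sequences contribute can be checked by enumerating all toss sequences of length 5 and keeping those whose first repeated pair ends exactly at the fifth toss. A sketch with illustrative function names:

```python
from itertools import product

def stop_time(seq):
    """Number of tosses used when stopping at the first two equal consecutive tosses."""
    for i in range(1, len(seq)):
        if seq[i] == seq[i - 1]:
            return i + 1
    return None  # no repeat within this sequence

def prob_stop_at(n, p_head=0.4):
    """P(X = n): total probability of length-n sequences that stop exactly at toss n."""
    total = 0.0
    for seq in product("HT", repeat=n):
        if stop_time(seq) == n:
            prob = 1.0
            for toss in seq:
                prob *= p_head if toss == "H" else 1 - p_head
            total += prob
    return total

print(prob_stop_at(5))
```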

Statistics for Data Science - 2

Practice Assignment 0.3.2 Solution


Discrete random variables

1. Toss a coin 50 times. Let the random variable X be defined as the number of tails
observed. Find the average of the values in the range of the random variable.
Solution:
Random variable X is defined as the number of tails observed while tossing the coin 50
times.
So the possible values taken by X are 0, 1, 2, 3, ..., 48, 49, 50.
⇒ Range of X = {0, 1, 2, 3, ..., 48, 49, 50}
Average of range values = sum of all values in the range / total number of values

⇒ Average of range values $= \frac{0 + 1 + 2 + 3 + \dots + 48 + 49 + 50}{51} = \frac{1275}{51} = 25$

2. Suppose that 5 fruits are randomly chosen from a basket containing 20 fruits, of which
16 are good and 4 are rotten. Let Y denote the number of rotten fruits chosen. Find
the possible values taken by Y .

a) {1, 2, 3, 4, 5}
b) {0, 1, 2, 3, 4, 5}
c) {1, 2, 3, 4}
d) {0, 1, 2, 3, 4}

Solution:
Random variable Y is defined as the number of rotten fruits chosen from the basket
while drawing 5 fruits. Since there are only 4 rotten fruits, so Y cannot take values more
than 4. Also there are 16 good fruits, so while drawing fruits there can be 0 rotten fruit
or 1 rotten fruit or 2 rotten fruits or 3 rotten fruits or 4 rotten fruits.
Hence, the possible values taken by Y i.e Range = {0, 1, 2, 3, 4}.

3. Let X be the number of candies present in a box. We have the following information:
There are at most four candies in the box.
The probability of having 2 candies in the box is the same as the probability of having
one candy.
The probability of having no candy in the box is the same as the probability of having
3 candies.
The probability of having four candies is twice of the probability of having three candies
and four times of having two candies.
What will be the PMF of X?

a)
X          0      1      2      3      4
P(X = x)   1/10   1/10   2/10   2/10   4/10

b)
X          0      1      2      3      4
P(X = x)   2/10   1/10   1/10   2/10   4/10

c)
X          0      1      2      3      4
P(X = x)   1/10   2/10   1/10   2/10   4/10

d)
X          0      1      2      3      4
P(X = x)   4/10   2/10   1/10   1/10   2/10

Solution:
Given that there are at most four candies in the box, so X cannot take values more than
4.
Also given that
P (X = 2) = P (X = 1), P (X = 0) = P (X = 3), P (X = 4) = 2P (X = 3) and
P (X = 4) = 4P (X = 2).
Let P (X = 2) = p and P (X = 0) = q
⇒ 2q = 4p
⇒ q = 2p
And we know that P (X = 0) + P (X = 1) + P (X = 2) + P (X = 3) + P (X = 4) = 1
⇒ q + p + p + q + 2q = 1
⇒ 4q + 2p = 1
Using the above relation, we will get 4 × 2p + 2p = 1
⇒ p = 1/10 and hence q = 2/10.
So, P (X = 0) = 2/10, P (X = 1) = 1/10, P (X = 2) = 1/10, P (X = 3) = 2/10, and
P (X = 4) = 4/10.
Therefore, option b is the correct answer.

4. Let X be a discrete random variable with following probability mass function

X          0   1   2    3    4    5         6
P(X = x)   0   k   4k   6k   4k   $10k^2$   $6k^2$

Table: PMF of X

Find the value of P (X ≤ 4). Enter your answer correct up to 4 decimals accuracy.
Solution:
We know that $\sum_{x=0}^{6} P(X = x) = 1$
⇒ P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5) + P(X = 6) = 1
$\Rightarrow 0 + k + 4k + 6k + 4k + 10k^{2} + 6k^{2} = 1$
$\Rightarrow 16k^{2} + 15k - 1 = 0$
$\Rightarrow (16k - 1)(k + 1) = 0$
$\Rightarrow k = -1$ or $k = \frac{1}{16}$
Since probabilities cannot be negative, k must be 1/16.
Now,

P(X ≤ 4) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4)
= 0 + k + 4k + 6k + 4k
= 15k
$= 15 \times \frac{1}{16}$
= 0.9375

5. I roll two fair six sided dice and observe the two outcomes. Let the random variables
Y and Z denote the outcomes observed on the two dice and let X = Y + Z. Find
P (Y = 3|X = 6).
Solution:
Y and Z denote the outcomes observed on the two dice.
Given X = Y + Z, the favourable outcomes for X = 6 are {(1,5), (2,4), (3,3), (4,2), (5,1)}.
In this reduced sample space, the only outcome favourable to (Y = 3 | X = 6) is (3,3).
Hence, P (Y = 3|X = 6) = 1/5 = 0.2

6. Let X be a discrete random variable with the following probability mass function

\[
P(X = k) =
\begin{cases}
0.2 & \text{for } k = 0 \\
0.3 & \text{for } k = 1 \\
0.4 & \text{for } k = 2 \\
0.1 & \text{for } k = 3 \\
0 & \text{otherwise.}
\end{cases}
\]

Define Y = (X − 1)(X + 1)(X + 3). Find P (Y ≤ 32).


Solution:
Given that X is taking values 0, 1, 2 and 3 and Y = (X − 1)(X + 1)(X + 3).
Now we will calculate the values taken by Y corresponding to every value of X.
At X = 0
Y = (0 − 1)(0 + 1)(0 + 3) = −3
At X = 1
Y = (1 − 1)(1 + 1)(1 + 3) = 0
At X = 2
Y = (2 − 1)(2 + 1)(2 + 3) = 15

At X = 3
Y = (3 − 1)(3 + 1)(3 + 3) = 48
This implies that Y is taking values -3, 0, 15, and 48.
So,

P (Y ≤ 32) = P (Y = −3) + P (Y = 0) + P (Y = 15)


= P (X = 0) + P (X = 1) + P (X = 2)
= 0.2 + 0.3 + 0.4
= 0.9

7. A shopkeeper sells mobile phones. The demand for mobile phone follows a Poisson dis-
tribution with mean 4.6 per week. The shopkeeper has 5 mobile phones in his shop at
the beginning of a week. Find the probability that this will not be enough to satisfy the
demand for mobile phones in that week. Enter your answer correct up to two decimals
accuracy.
Solution:
The shopkeeper has 5 mobile phones in his shop at the beginning of a week. The shop-
keeper will not be able to satisfy the demand for mobile phones in that week only if
the demand of mobile phone is more than 5 phones. So, we need to find the value of
P (X > 5).
Also given that demand for mobile phone follows a Poisson distribution with mean 4.6
per week. i.e. λ = 4.6

\[
\begin{aligned}
P(X > 5) &= 1 - P(X \le 5) \\
&= 1 - [P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5)] \\
&= 1 - \left[ \frac{e^{-4.6}(4.6)^0}{0!} + \frac{e^{-4.6}(4.6)^1}{1!} + \frac{e^{-4.6}(4.6)^2}{2!} + \frac{e^{-4.6}(4.6)^3}{3!} + \frac{e^{-4.6}(4.6)^4}{4!} + \frac{e^{-4.6}(4.6)^5}{5!} \right] \\
&= 1 - e^{-4.6}\,[1 + 4.6 + 10.58 + 16.22 + 18.66 + 17.16] \\
&= 1 - 0.686 \\
&\approx 0.31
\end{aligned}
\]
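
As a quick numerical check of the Poisson tail computed above, here is a small sketch using only the standard library. The helper `poisson_pmf` and the variable names are illustrative assumptions, not part of the original solution.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 4.6      # mean weekly demand
stock = 5      # phones available at the start of the week

# Demand exceeds stock when X > 5, i.e. 1 - P(X <= 5).
p_short = 1 - sum(poisson_pmf(k, lam) for k in range(stock + 1))
print(round(p_short, 4))
```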

8. Suppose that in the end semester paper of Statistics there are 18 multiple-choice ques-
tions (only one option is correct for each question). Each question has 4 possible options.
You know the answer to 8 questions, but you have no idea about the other 10 questions
and choose answers randomly and independently. Your score X of the exam is the total
number of correct answers. Find the value of P (X ≥ 12). Enter your answer correct up
to 2 decimals accuracy.
Solution:
Your score is the total number of correct answers, and you already know the answers to 8
questions.
So, instead of finding P (X ≥ 12) directly, define a new random variable Y as the number
of correct answers among the 10 questions you guess. Then X = 8 + Y, and
P (X ≥ 12) = P (Y ≥ 4).
There are four options to each question and only one is correct, so the probability of
guessing an answer correctly is 1/4, and the questions are answered independently.
So Y follows a binomial distribution with n = 10 and p = 0.25.
Now,

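\[
\begin{aligned}
P(Y \ge 4) &= 1 - P(Y < 4) \\
&= 1 - [P(Y = 0) + P(Y = 1) + P(Y = 2) + P(Y = 3)] \\
&= 1 - \left[ {}^{10}C_0 \left(\tfrac14\right)^0 \left(\tfrac34\right)^{10} + {}^{10}C_1 \left(\tfrac14\right)^1 \left(\tfrac34\right)^{9} + {}^{10}C_2 \left(\tfrac14\right)^2 \left(\tfrac34\right)^{8} + {}^{10}C_3 \left(\tfrac14\right)^3 \left(\tfrac34\right)^{7} \right] \\
&= 1 - \left(\tfrac34\right)^7 \left[ \left(\tfrac34\right)^3 + 10 \left(\tfrac14\right)^1 \left(\tfrac34\right)^2 + 45 \left(\tfrac14\right)^2 \left(\tfrac34\right)^1 + 120 \left(\tfrac14\right)^3 \right] \\
&= 1 - \left(\tfrac34\right)^7 \left[ \frac{372}{64} \right] \\
&= 1 - 0.78 \\
&= 0.22
\end{aligned}
\]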

This implies that P (X ≥ 12) = 0.22.
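
The reduction from X to the guessed questions can be sketched in code as follows. This is only an illustrative check; `binom_pmf`, `known`, and `p_at_least_12` are names chosen here, not taken from the assignment.

```python
import math

def binom_pmf(k, n, p):
    """P(Y = k) for Y ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

known = 8            # questions answered correctly for sure
n_guessed = 10       # questions answered by guessing
p_correct = 0.25     # one correct option out of four

# X = known + Y, so P(X >= 12) = P(Y >= 12 - known) = P(Y >= 4).
p_at_least_12 = sum(
    binom_pmf(y, n_guessed, p_correct)
    for y in range(12 - known, n_guessed + 1)
)
print(round(p_at_least_12, 4))
```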

9. A fruit owner sells fruit in a lot that contains 50 fruits. A customer selects 5 fruits at
random from a lot and rejects the lot (will not purchase) if one of the 5 selected fruits
is rotten. What is the probability that the customer will purchase the lot if there are 4
rotten fruits in the lot? Enter your answer correct up to 2 decimals accuracy.
Solution:
Given that there are 4 rotten fruits in the lot that contains 50 fruits.
The customer will purchase the lot only if none of the 5 selected fruits is rotten.
The probability that there is no rotten fruit among the 5 selected fruits is
\[ \frac{{}^{4}C_{0}\; {}^{46}C_{5}}{{}^{50}C_{5}} = \frac{1370754}{2118760} \approx 0.647 \]
Accepted range: 0.61 - 0.67
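
The hypergeometric count above is easy to reproduce with `math.comb`. The sketch below is illustrative only; the variable names are assumptions made for this example.

```python
import math

lot_size, rotten, sample = 50, 4, 5
good = lot_size - rotten

# Choose 0 of the rotten fruits and all 5 from the good ones.
favourable = math.comb(rotten, 0) * math.comb(good, sample)
possible = math.comb(lot_size, sample)

p_accept = favourable / possible
print(favourable, possible, round(p_accept, 4))
```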

10. Suppose the probability that any given person will independently believe a tale about
the existence of a parallel universe is 0.6. What is the probability that the eighth person
to hear this tale about existence of a parallel universe is the fifth one to believe it?

a) ${}^{8}C_{5}\,(0.6)^5 (0.4)^3$
b) ${}^{7}C_{4}\,(0.6)^5 (0.4)^3$
c) ${}^{8}C_{5}\,(0.6)^3 (0.4)^5$
d) ${}^{7}C_{4}\,(0.6)^3 (0.4)^5$

Solution:
Given that the probability that any given person will believe a tale about the existence
of parallel universe is 0.6.
We need to find the probability that the eighth person to hear this tale about existence
of parallel universe is the fifth one to believe it.
In other words, we need exactly 4 successes in the first 7 trials, and the 8th trial must
also be a success. (Here a success means that the person believes the tale about the
existence of a parallel universe.)
The probability of getting 4 successes out of 7 is ${}^{7}C_{4}\,(0.6)^4 (0.4)^3$.
Including the 8th trial, which must be a success, gives ${}^{7}C_{4}\,(0.6)^4 (0.4)^3 \times 0.6$.
This implies that the probability that the eighth person to hear this tale about existence
of parallel universe is the fifth one to believe it is ${}^{7}C_{4}\,(0.6)^5 (0.4)^3$, which is option b.

11. Suppose the number of visitors arriving at a zoo can be modeled to be Poisson dis-
tributed. On an average 20 visitors arrive per hour. Let X be the number of visitors
arriving from 2pm to 4pm. Then the probability that at least 35 visitors will arrive in
the given duration is
a) $\sum_{k=35}^{\infty} \frac{e^{-20} (20)^k}{k!}$
b) $1 - \sum_{k=0}^{34} \frac{e^{-20} (20)^k}{k!}$
c) $\sum_{k=35}^{\infty} \frac{e^{-40} (40)^k}{k!}$
d) $1 - \sum_{k=0}^{34} \frac{e^{-40} (40)^k}{k!}$

Solution:
Given that on an average 20 visitors arrive per hour and X is the number of visitors
arriving from 2pm to 4pm. So, here λ = 20 × 2 = 40
Now we have to find the probability that at least 35 visitors will arrive in the given
duration, that is from 2pm to 4pm.

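\[
\begin{aligned}
P(X \ge 35) &= P(X = 35) + P(X = 36) + P(X = 37) + \dots \\
&= \sum_{k=35}^{\infty} \frac{e^{-\lambda} \lambda^k}{k!} \\
&= \sum_{k=35}^{\infty} \frac{e^{-40} (40)^k}{k!}
\end{aligned}
\]

Also we can write

\[
\begin{aligned}
P(X \ge 35) &= 1 - P(X < 35) \\
&= 1 - [P(X = 0) + P(X = 1) + P(X = 2) + \dots + P(X = 34)] \\
&= 1 - \sum_{k=0}^{34} \frac{e^{-40} (40)^k}{k!}
\end{aligned}
\]
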
Statistics for Data Science - 2
Week 1 Graded Assignment
Multiple random variables

1. Joint distribution of two random variables X and Y is given as:

         X = 0    X = 1
Y = 1    1/4      1/8
Y = 2    1/4      k
Y = 3    0        1/8

Table 1.1.G: Joint distribution of X and Y .

Find the value of fY |X=1 (2).


Solution:
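We know that
\[ \sum_{x \in T_X,\, y \in T_Y} f_{XY}(x, y) = 1 \]
\[ \Rightarrow \frac14 + \frac18 + \frac14 + k + 0 + \frac18 = 1 \]
\[ \Rightarrow k = 1 - \frac34 = \frac14 \]
Now,
\[
\begin{aligned}
f_{Y|X=1}(2) &= \frac{f_{XY}(1, 2)}{f_X(1)} \\
&= \frac{f_{XY}(1, 2)}{f_{XY}(1, 1) + f_{XY}(1, 2) + f_{XY}(1, 3)} \\
&= \frac{\frac14}{\frac18 + \frac14 + \frac18} \\
&= \frac{\frac14}{\frac12} = \frac12
\end{aligned}
\]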
2. Customers at a fast-food restaurant buy both sandwiches and drinks. The following
joint distribution summarizes the numbers of sandwiches (X) and drinks (Y ) purchased
by customers.

X
1 2
Y

1 0.4 0.2

2 0.1 0.25

3 0 0.05

Table 1.2.G: Joint distribution of X and Y .

Find the probability that a customer will buy two sandwiches given that he has bought
three drinks.
Solution:
X denotes the number of sandwiches purchased by a customer and Y denotes the num-
ber of drinks purchased by a customer.
To find: $f_{X|Y=3}(2)$

Now,
\[
\begin{aligned}
f_{X|Y=3}(2) &= \frac{f_{XY}(2, 3)}{f_Y(3)} \\
&= \frac{f_{XY}(2, 3)}{f_{XY}(1, 3) + f_{XY}(2, 3)} \\
&= \frac{0.05}{0 + 0.05} \\
&= 1
\end{aligned}
\]

3. A fair coin is tossed 4 times. Let X be the total number of heads and Y be the number
of heads before the first tail (If there is no tail in all the four tosses, then Y = 4). What
is the value of fY |X=2 (0)? [2 marks]
(a) 5/16
(b) 1/8
(c) 9/16
(d) 1/2
Solution:
A fair coin is tossed four times. X denotes the number of heads and Y denotes the
number of heads before first tail (If there is no tail in all the four tosses, then Y = 4).
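Clearly, X ∼ Binomial(4, 1/2).

Now,
\[ f_{Y|X=2}(0) = \frac{f_{XY}(2, 0)}{f_X(2)} = \frac{f_{X|Y=0}(2)\, f_Y(0)}{f_X(2)} \qquad \ldots (1) \]

The event Y = 0 means that there is no head before the first tail, that is, the first
outcome is a tail.
It implies that $f_Y(0) = \frac12$.

\[ f_{X|Y=0}(2) = P(\text{two heads in the next three tosses}) = {}^{3}C_{2} \left(\tfrac12\right)^3 \]

and
\[ f_X(2) = {}^{4}C_{2} \left(\tfrac12\right)^4 \]
Putting these values in equation (1), we get
\[ f_{Y|X=2}(0) = \frac{{}^{3}C_{2} \left(\tfrac12\right)^3 \cdot \tfrac12}{{}^{4}C_{2} \left(\tfrac12\right)^4} = \frac36 = \frac12 \]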

4. Which of the following options is/are always correct?

(a) fXY Z (x, y, z) = fX|(Y =y,Z=z) (x).fY Z (y, z)


(b) fXY Z (x, y, z) = fX|(Y =y,Z=z) (x).fX (x)
(c) $f_X(x) = \sum_{y \in R_Y} f_{XY}(x, y)$ where $R_Y$ is the range of Y.

(d) fXY (x, y) = fX (x).fY (y)

Solution:
We know that
\[ f_{X|(Y=y,Z=z)}(x) = \frac{f_{XYZ}(x, y, z)}{f_{YZ}(y, z)} \]
\[ \Rightarrow f_{XYZ}(x, y, z) = f_{X|(Y=y,Z=z)}(x)\, f_{YZ}(y, z) \]
Hence, option (a) is correct and option (b) is incorrect.

We know by the definition of the marginal pmf that
\[ f_X(x) = \sum_{y \in R_Y} f_{XY}(x, y) \quad \text{where } R_Y \text{ is the range of } Y. \]

Hence, option (c) is correct.

$f_{XY}(x, y) = f_X(x)\, f_Y(y)$ is true only when X and Y are independent. Therefore, option
(d) need not always be true.

5. Two random variables X and Y are jointly distributed with joint pmf

fXY (x, y) = a(bx + y),

where x and y are integers in 0 ≤ x ≤ 2 and 0 ≤ y ≤ 3 such that P (X ≥ 1, Y ≤ 2) = 4/7.

Find the value of fXY (2, 1). [2 marks]
1. 1/21
2. 5/42
3. 1/42
4. 9/42

Solution: We know that
\[ \sum_{x \in T_X,\, y \in T_Y} f_{XY}(x, y) = 1 \]

⇒ fXY (0, 0) + fXY (0, 1) + fXY (0, 2) + fXY (0, 3) + fXY (1, 0) + fXY (1, 1) + fXY (1, 2)
+ fXY (1, 3) + fXY (2, 0) + fXY (2, 1) + fXY (2, 2) + fXY (2, 3) = 1

⇒ a + 2a + 3a + ab + (ab + a) + (ab + 2a) + (ab + 3a)
+ (2ab) + (2ab + a) + (2ab + 2a) + (2ab + 3a) = 1
⇒ 18a + 12ab = 1 ...(1)

Now, using the given condition,
P (X ≥ 1, Y ≤ 2) = 4/7
⇒ P (X = 1, Y = 0) + P (X = 1, Y = 1) + P (X = 1, Y = 2) + P (X = 2, Y = 0)
+ P (X = 2, Y = 1) + P (X = 2, Y = 2) = 4/7
⇒ ab + (ab + a) + (ab + 2a) + 2ab + (2ab + a) + (2ab + 2a) = 4/7
⇒ 6a + 9ab = 4/7 ...(2)

Solving equations (1) and (2), we get
ab = 1/21 and a = 1/42
It implies that
a = 1/42 and b = 2
Therefore, the joint pmf of X and Y will be
\[ f_{XY}(x, y) = \frac{1}{42}(2x + y) \]
Now, $f_{XY}(2, 1) = \frac{1}{42}(4 + 1) = \frac{5}{42}$.

6. Akshat draws a card randomly from a well-shuffled pack of 52 cards. If the drawn card
is a face card, then he draws two balls randomly from bag A which contains 5 Red, 6
Black and 4 Green balls. If the drawn card is not a face card, then he draws three balls
randomly from bag B which contains 7 Red, 8 Black and 5 Green balls. Let two random
variables X and Y be defined as:
\[
X =
\begin{cases}
0 & \text{if the drawn card is a face card} \\
1 & \text{if the drawn card is not a face card}
\end{cases}
\]

and Y be the number of Red balls drawn. Find the value of fY (1). Write your answer
correct up to two decimal places.
Solution:
Akshat draws a card randomly from a well-shuffled pack of 52 cards. The random variable
X is defined as
\[
X =
\begin{cases}
0 & \text{if the drawn card is a face card} \\
1 & \text{if the drawn card is not a face card}
\end{cases}
\]
If the drawn card is a face card, then he draws two balls randomly from bag A which
contains 5 Red, 6 Black and 4 Green balls. If the drawn card is not a face card, then
he draws three balls randomly from bag B which contains 7 Red, 8 Black and 5 Green
balls. The random variable Y is the number of Red balls drawn.
Since a pack has 12 face cards, fX (0) = 12/52 and fX (1) = 40/52.
To find: fY (1)
We know that

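\[
\begin{aligned}
f_Y(1) &= f_{XY}(0, 1) + f_{XY}(1, 1) \\
&= f_{Y|X=0}(1)\, f_X(0) + f_{Y|X=1}(1)\, f_X(1) \\
&= \frac{{}^{5}C_{1}\; {}^{10}C_{1}}{{}^{15}C_{2}} \cdot \frac{12}{52} + \frac{{}^{7}C_{1}\; {}^{13}C_{2}}{{}^{20}C_{3}} \cdot \frac{40}{52} \\
&= 0.110 + 0.368 \approx 0.48
\end{aligned}
\]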
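
The total-probability computation above can be reproduced with exact fractions. This is a minimal sketch; the helper `p_red` and the other names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction
from math import comb

def p_red(k, reds, total, draws):
    """P(exactly k red balls) when `draws` balls are taken from a bag
    holding `total` balls of which `reds` are red."""
    return Fraction(comb(reds, k) * comb(total - reds, draws - k),
                    comb(total, draws))

p_face = Fraction(12, 52)          # 12 face cards in a 52-card pack

# Bag A: 15 balls, 5 red, 2 drawn.  Bag B: 20 balls, 7 red, 3 drawn.
f_y1 = p_red(1, 5, 15, 2) * p_face + p_red(1, 7, 20, 3) * (1 - p_face)
print(f_y1, round(float(f_y1), 4))
```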

7. Three fair coins are tossed. If the first head occurs on the first toss, you score 1 point.
If the first head occurs on toss 2 or on toss 3, you score 2 or 3 points, respectively. If
no heads appear, you lose 1 point (that is score −1 point). Let X denote the number
of heads and Y denote the points scored. What is the probability that fewer than three
heads will occur and you will score 1 or less? Write your answer correct to two decimal
places.
Solution:
Given that X denotes the number of heads and Y denotes the point scored.
Clearly, TX = {0, 1, 2, 3} and TY = {−1, 1, 2, 3}.
To find: P (X < 3, Y ≤ 1).

Outcome X Y

HHH 3 1

HHT 2 1

HTH 2 1

THH 2 2

HTT 1 1

THT 1 2

TTH 1 3

TTT 0 −1

The outcomes HHT, HTH, HTT, TTT correspond to the event (X < 3, Y ≤ 1).

Therefore,
\[ P(X < 3, Y \le 1) = P(\{HHT, HTH, HTT, TTT\}) = \frac48 = \frac12 \]

8. Contracts for two construction jobs are each assigned uniformly at random to one or
more of three firms, A, B, and C. Let X denote the number of contracts assigned to firm
A and Y the number of contracts assigned to firm B. Find the value of fX|Y =0 (2). Write
your answer correct to two decimal places.
Solution:
Given that X denotes the number of contracts assigned to firm A and Y denotes the
number of contracts assigned to firm B.
Since each job is assigned at random to one of the three firms, the probability of
assigning a given job to any particular firm is 1/3. (Notice that one firm can be assigned
either 0 or 1 or 2 jobs.)
Clearly, TX = TY = {0, 1, 2}. Therefore,

\[ P(X = 2, Y = 0) = P(\text{both jobs are assigned to firm A}) = \frac13 \cdot \frac13 = \frac19 \]

Similarly,

\[
\begin{aligned}
P(X = 1, Y = 0) &= P(\text{one job is assigned to firm A and no job is assigned to firm B}) \\
&= P(\text{one job is assigned to firm A and the other job is assigned to firm C}) \\
&= 2 \left( \frac13 \cdot \frac13 \right) = \frac29
\end{aligned}
\]

and

\[ P(X = 0, Y = 0) = P(\text{both jobs are assigned to firm C}) = \frac13 \cdot \frac13 = \frac19 \]

Therefore,

\[
\begin{aligned}
P(Y = 0) &= P(X = 0, Y = 0) + P(X = 1, Y = 0) + P(X = 2, Y = 0) \\
&= \frac19 + \frac29 + \frac19 = \frac49
\end{aligned}
\]

Now,

\[ f_{X|Y=0}(2) = \frac{P(X = 2, Y = 0)}{P(Y = 0)} = \frac{1/9}{4/9} = \frac14 \]

9. A fair coin is tossed five times, and the number of heads, N , is counted. The coin is
then tossed N more times. Find the probability that heads will appear for a total of
four times in this process. Write your answer correct to two decimal places.
Solution:
Given that N denotes the number of heads in five tosses of a coin.
Clearly, N ∼ Binomial(5, 1/2).

Let X denote the number of heads in the additional N tosses.

Then, X|(N = n) ∼ Binomial(n, 1/2)

Heads will appear a total of four times if (N = 2, X = 2), (N = 3, X = 1) or (N = 4, X = 0).

It implies that

\[
\begin{aligned}
P(\text{total of four heads}) &= P(N = 2, X = 2) + P(N = 3, X = 1) + P(N = 4, X = 0) \\
&= P(N = 2)\,P(X = 2 | N = 2) + P(N = 3)\,P(X = 1 | N = 3) + P(N = 4)\,P(X = 0 | N = 4) \\
&= {}^{5}C_{2} \frac{1}{2^5}\; {}^{2}C_{2} \frac{1}{2^2} + {}^{5}C_{3} \frac{1}{2^5}\; {}^{3}C_{1} \frac{1}{2^3} + {}^{5}C_{4} \frac{1}{2^5}\; {}^{4}C_{0} \frac{1}{2^4} \\
&= \frac{10}{2^7} + \frac{30}{2^8} + \frac{5}{2^9} \\
&= \frac{1}{2^7}\,[10 + 15 + 1.25] \\
&= \frac{105}{512} \approx 0.21
\end{aligned}
\]
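
The two-stage experiment above can be checked by summing over the possible values of N. The sketch below uses exact fractions; `binom_half` and `target` are illustrative names assumed for this example.

```python
from fractions import Fraction
from math import comb

def binom_half(k, n):
    """P(K = k) for K ~ Binomial(n, 1/2)."""
    return Fraction(comb(n, k), 2**n)

target = 4   # total number of heads wanted across both stages

# N heads in the first 5 tosses, then target - N heads in N further tosses.
p_total = sum(
    binom_half(n, 5) * binom_half(target - n, n)
    for n in range(6)
    if 0 <= target - n <= n
)
print(p_total, float(p_total))
```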

10. From a group of three members of party A, two members of party B, and one member
of party C, a committee of two people is to be selected uniformly at random. Let X
denote the number of party A members and Y denote the number of party B members
on the committee. Find the value of fXY (1, 1).
Solution:

Given that X denotes the number of party A members on the selected two-member
committee and Y denotes the number of party B members on it.

To find: fXY (1, 1)

\[
\begin{aligned}
f_{XY}(1, 1) &= P(X = 1, Y = 1) \\
&= \frac{{}^{3}C_{1} \cdot {}^{2}C_{1}}{{}^{6}C_{2}} \\
&= \frac{3 \times 2}{15} \\
&= 0.4
\end{aligned}
\]

Statistics for Data Science - 2

Week 1 Practice Assignment Solution


Multiple random variables

1. Let X and Y be two random variables with joint distribution given in Table 3.1.P, where
a and b are two unknown values.

         X = 0    X = 1    X = 2
Y = 0    1/12     3/12     a
Y = 1    2/12     b        1/12
Y = 2    3/12     1/12     1/12

Table 3.1.P: Joint distribution of X and Y .

i) Find P (Y = 1).
a) 4/12
b) 3/12
c) 5/12
d) 1/12
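Solution:
We know that
\[ \sum_{x \in T_X,\, y \in T_Y} f_{XY}(x, y) = 1 \]
\[ \Rightarrow \frac{1}{12} + \frac{3}{12} + a + \frac{2}{12} + b + \frac{1}{12} + \frac{3}{12} + \frac{1}{12} + \frac{1}{12} = 1 \]
⇒ a + b = 0
Since a and b cannot take negative values ⇒ a = b = 0.

Now,
\[
\begin{aligned}
P(Y = 1) &= \sum_{x \in T_X} f_{XY}(x, 1) \\
&= \frac{2}{12} + b + \frac{1}{12} \\
&= \frac{3}{12} + 0 \\
&= \frac{3}{12}
\end{aligned}
\]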

ii) Find P (Y = 1 | X = 2).


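a) 1/12
b) 1/4
c) 1/3
d) 1/2
Solution:

\[
\begin{aligned}
P(Y = 1 | X = 2) &= \frac{P(Y = 1, X = 2)}{P(X = 2)} \\
&= \frac{\frac{1}{12}}{a + \frac{1}{12} + \frac{1}{12}} \\
&= \frac12 \qquad (\text{since } a = 0)
\end{aligned}
\]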

iii) Find P (X = 0, Y ≥ 1).


a) 4/12
b) 3/12
c) 5/12
d) 1/12

Solution:

\[
\begin{aligned}
P(X = 0, Y \ge 1) &= P(X = 0, Y = 1) + P(X = 0, Y = 2) \\
&= \frac{2}{12} + \frac{3}{12} \\
&= \frac{5}{12}
\end{aligned}
\]
2. Let X ∼ Uniform({1, 2, 3, 4, 5, 6}) and let Y be the number of times 2 occurs in X throws
of a fair die. Choose the incorrect option(s) among the following.
a) $P(Y = 2 | X = 2) = \frac{1}{6}$
b) $P(Y = 2 | X = 4) = \frac{5^2}{6^3}$
c) $P(Y = 5 | X = 6) = \frac{5}{6^5}$
d) $P(Y = 6 | X = 5) = \frac{5}{6^6}$
Solution:

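Given X = n, the number of 2s in n throws satisfies Y | (X = n) ∼ Bin(n, 1/6), so Y takes
values in {0, 1, . . . , n}.

\[ P(Y = 2 | X = 2) = {}^{2}C_{2} \left(\tfrac16\right)^2 \left(\tfrac56\right)^0 = \frac{1}{36} \]

\[ P(Y = 2 | X = 4) = {}^{4}C_{2} \left(\tfrac16\right)^2 \left(\tfrac56\right)^2 = \frac{5^2}{6^3} \]

\[ P(Y = 5 | X = 6) = {}^{6}C_{5} \left(\tfrac16\right)^5 \left(\tfrac56\right)^1 = \frac{5}{6^5} \]

P (Y = 6 | X = 5) = 0, since 6 cannot occur as a count in only 5 throws.

Hence, options a and d are incorrect.
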
3. Let the random variables X and Y each have range {1, 2, 3}. The following formula
gives the joint PMF
\[ P(X = i, Y = j) = \frac{i + 2j}{c}, \]
where c is an unknown value. Find P (1 ≤ X ≤ 3, 1 < Y ≤ 3).
a) 5/9
b) 7/9
c) 2/9
d) 4/9

Solution:
We know that
\[ \sum_{x \in T_X,\, y \in T_Y} P(X = x, Y = y) = 1 \]
⇒ P (X = 1, Y = 1) + P (X = 1, Y = 2) + P (X = 1, Y = 3) + P (X = 2, Y = 1) + P (X = 2, Y = 2)
+ P (X = 2, Y = 3) + P (X = 3, Y = 1) + P (X = 3, Y = 2) + P (X = 3, Y = 3) = 1
\[ \Rightarrow \frac3c + \frac5c + \frac7c + \frac4c + \frac6c + \frac8c + \frac5c + \frac7c + \frac9c = 1 \]
⇒ c = 54
Now,
P (1 ≤ X ≤ 3, 1 < Y ≤ 3) = P (X = 1, Y = 2) + P (X = 1, Y = 3) + P (X = 2, Y = 2)
+ P (X = 2, Y = 3) + P (X = 3, Y = 2) + P (X = 3, Y = 3)

\[
\begin{aligned}
\Rightarrow P(1 \le X \le 3,\ 1 < Y \le 3) &= \frac1c\,[5 + 7 + 6 + 8 + 7 + 9] \\
&= \frac{42}{54} \\
&= \frac79
\end{aligned}
\]
4. The joint PMF of the random variables X and Y is given in Table 1.2.P.

X
1 2 3
Y

1 k k 2k

2 2k 0 4k

3 3k k 6k

Table 1.2.P: Joint distribution of X and Y .

Consider the random variable $Z = X^2 Y$.
i) Find the range of Z | Y = 2.

a) {1, 4, 9}
b) {4, 8, 18}
c) {1, 9}
d) {2, 18}
e) {2, 8, 18}

Solution:
We know that
\[ \sum_{x \in T_X,\, y \in T_Y} P(X = x, Y = y) = 1 \]
⇒ k + k + 2k + 2k + 0 + 4k + 3k + k + 6k = 1
⇒ k = 1/20
When Y = 2, P (X = 2, Y = 2) = 0. So for the range we will not consider the pair (2, 2).
Since $Z = X^2 Y$, the range of Z | Y = 2 will be $\{1^2 \times 2,\ 3^2 \times 2\}$, which is equal to {2, 18}.
ii) Find the value of P (Z = 18 | Y = 2).
a) 1/3
b) 2/3
c) 3/4
d) 1/4

Solution:

\[
\begin{aligned}
P(Z = 18 | Y = 2) &= \frac{P(Z = 18, Y = 2)}{P(Y = 2)} \\
&= \frac{P(X = 3, Y = 2)}{P(X = 1, Y = 2) + P(X = 3, Y = 2)} \\
&= \frac{4k}{2k + 4k} \\
&= \frac23
\end{aligned}
\]

5. From a sack of fruits containing 3 mangoes, 2 kiwis, and 3 guavas, a random sample of
4 pieces of fruit is selected. If X is the number of mangoes and Y is the number of kiwis
in the sample, then find the joint probability distribution of X and Y .

X
0 1 2 3
Y
3 9 3
0 0
70 70 70
2 18 2 18
1
70 70 70 70
3 9 3
2 0
70 70 70

a)

X
0 1 2 3
Y
3 9 3
0 0
70 70 70
2 18 18 2
1
70 70 70 70
3 9 3
2 0
70 70 70

b)

X
0 1 2 3
Y
3 9 3
0 0
70 70 70
2 18 18 2
1
70 70 70 70
9 3 3
2 0
70 70 70

c)

X
0 1 2 3
Y
3 3 9
0 0
70 70 70
2 18 18 2
1
70 70 70 70
3 9 3
2 0
70 70 70

d)

Solution:
X is the number of mangoes and Y is the number of kiwis in the sample. The number
of mangoes and kiwis in the sack is 3 and 2,respectively.
So X will take values in {0, 1, 2, 3} and Y will take values in {0, 1, 2} when the random
sample of 4 pieces is selected.
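P (X = 0, Y = 0) = P (no mango and no kiwi) = 0 (not possible since there are only 3
guavas)
\[ P(X = 0, Y = 1) = P(\text{no mango and one kiwi}) = \frac{{}^{2}C_{1}\; {}^{3}C_{3}}{{}^{8}C_{4}} = \frac{2}{70} \]
\[ P(X = 0, Y = 2) = P(\text{no mango and two kiwis}) = \frac{{}^{2}C_{2}\; {}^{3}C_{2}}{{}^{8}C_{4}} = \frac{3}{70} \]
\[ P(X = 1, Y = 0) = P(\text{one mango and no kiwi}) = \frac{{}^{3}C_{1}\; {}^{3}C_{3}}{{}^{8}C_{4}} = \frac{3}{70} \]
\[ P(X = 1, Y = 1) = P(\text{one mango and one kiwi}) = \frac{{}^{3}C_{1}\; {}^{2}C_{1}\; {}^{3}C_{2}}{{}^{8}C_{4}} = \frac{18}{70} \]
\[ P(X = 1, Y = 2) = P(\text{one mango and two kiwis}) = \frac{{}^{3}C_{1}\; {}^{2}C_{2}\; {}^{3}C_{1}}{{}^{8}C_{4}} = \frac{9}{70} \]
\[ P(X = 2, Y = 0) = P(\text{two mangoes and no kiwi}) = \frac{{}^{3}C_{2}\; {}^{3}C_{2}}{{}^{8}C_{4}} = \frac{9}{70} \]
\[ P(X = 2, Y = 1) = P(\text{two mangoes and one kiwi}) = \frac{{}^{3}C_{2}\; {}^{2}C_{1}\; {}^{3}C_{1}}{{}^{8}C_{4}} = \frac{18}{70} \]
\[ P(X = 2, Y = 2) = P(\text{two mangoes and two kiwis}) = \frac{{}^{3}C_{2}\; {}^{2}C_{2}}{{}^{8}C_{4}} = \frac{3}{70} \]
Similarly, you can check the remaining values.
Answer: b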

6. Suppose you flip a fair coin. If the coin lands heads, you roll a fair six-sided die 50 times.
If the coin lands tails, you roll the die 51 times. Let X be 1 if the coin lands heads and
0 if the coin lands tails. Let Y be the total number of times you get the number 5 while
throwing the dice. Find P (X = 1|Y = 10).
a) 85/157
b) 82/167
c) 72/157
d) 85/167
Solution:

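\[
\begin{aligned}
\frac{P(X = 1 | Y = 10)}{P(X = 0 | Y = 10)} &= \frac{P(Y = 10 | X = 1)\, P(X = 1)}{P(Y = 10 | X = 0)\, P(X = 0)} \\
&= \frac{P(Y = 10 | X = 1)}{P(Y = 10 | X = 0)} \qquad [\text{since } P(X = 1) = P(X = 0)] \\
&= \frac{{}^{50}C_{10} \left(\tfrac16\right)^{10} \left(\tfrac56\right)^{40}}{{}^{51}C_{10} \left(\tfrac16\right)^{10} \left(\tfrac56\right)^{41}} \\
&= \frac{{}^{50}C_{10}}{{}^{51}C_{10}} \times \frac65 \\
&= \frac{41}{51} \times \frac65 \\
&= \frac{246}{255}
\end{aligned}
\]

\[ \Rightarrow P(X = 1 | Y = 10) = \frac{246}{255} \times P(X = 0 | Y = 10) \]

Also P (X = 1|Y = 10) + P (X = 0|Y = 10) = 1

\[ \Rightarrow P(X = 1 | Y = 10) + \frac{255}{246}\, P(X = 1 | Y = 10) = 1 \]
\[ \Rightarrow P(X = 1 | Y = 10) = \frac{246}{501} = \frac{82}{167} \]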
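
The same posterior can be obtained directly from Bayes' rule. The following sketch uses exact fractions so the result can be compared with the answer options; `binom_pmf`, `like_heads`, and `like_tails` are illustrative names, not part of the original solution.

```python
from fractions import Fraction
from math import comb

def binom_pmf(k, n, p):
    """P(Y = k) for Y ~ Binomial(n, p), computed exactly when p is a Fraction."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

p_five = Fraction(1, 6)            # chance of rolling a 5 on one throw
prior = Fraction(1, 2)             # fair coin: P(X = 1) = P(X = 0)

like_heads = binom_pmf(10, 50, p_five)   # P(Y = 10 | X = 1), 50 rolls
like_tails = binom_pmf(10, 51, p_five)   # P(Y = 10 | X = 0), 51 rolls

posterior = like_heads * prior / (like_heads * prior + like_tails * prior)
print(posterior)
```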
7. Three balls are selected at random from a box containing five red, four blue, three yellow
and six green coloured balls. If X, Y and Z are the number of red balls, blue balls and
green balls respectively, choose the correct option(s) among the following.
a) P (X = 1, Y = 0, Z = 2) = 25/272
b) P (X = 1, Y = 1, Z = 1) = 5/34
c) P (X = 1, Y = 0 | Z = 2) = 1/4
d) P (X = 0, Y = 0, Z = 3) = 5/204

Solution:
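\[ P(X = 1, Y = 0, Z = 2) = P(\text{one red ball and 2 green balls}) = \frac{{}^{5}C_{1}\; {}^{6}C_{2}}{{}^{18}C_{3}} = \frac{25}{272} \]
\[ P(X = 1, Y = 1, Z = 1) = P(\text{one red ball, one blue ball and 1 green ball}) = \frac{{}^{5}C_{1}\; {}^{4}C_{1}\; {}^{6}C_{1}}{{}^{18}C_{3}} = \frac{5}{34} \]
\[ P(X = 0, Y = 0, Z = 3) = P(\text{3 green balls}) = \frac{{}^{6}C_{3}}{{}^{18}C_{3}} = \frac{5}{204} \]

And, since conditioning on Z = 2 means the remaining ball is one of the 12 non-green balls,
\[ P(X = 1, Y = 0 | Z = 2) = \frac{P(X = 1, Y = 0, Z = 2)}{P(Z = 2)} = \frac{{}^{5}C_{1}\; {}^{6}C_{2}}{{}^{12}C_{1}\; {}^{6}C_{2}} = \frac{5}{12} \]
Hence, options a, b and d are correct.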
8. A computer system receives messages over three communications lines. Let Xi be the
number of messages received on line i in one hour. Suppose that the joint pmf of X1 , X2 ,
and X3 is given by
\[ f_{X_1 X_2 X_3}(x_1, x_2, x_3) = (1 - p_1)(1 - p_2)(1 - p_3)\, p_1^{x_1} p_2^{x_2} p_3^{x_3} \]
for $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$ and $0 < p_i < 1$.
i) Find $f_{X_1 X_2}(x_1, x_2)$.
a) $(1 - p_1)(1 - p_2)$
b) $(1 - p_1)(1 - p_2)(1 - p_3)\, p_1^{x_1} p_2^{x_2}$
c) $(1 - p_1)(1 - p_2)\, p_1^{x_1} p_2^{x_2}$
d) $p_1^{x_1} p_2^{x_2}$
Solution:


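\[
\begin{aligned}
f_{X_1 X_2}(x_1, x_2) &= \sum_{x_3 = 0}^{\infty} f_{X_1 X_2 X_3}(x_1, x_2, x_3) \\
&= \sum_{x_3 = 0}^{\infty} (1 - p_1)(1 - p_2)(1 - p_3)\, p_1^{x_1} p_2^{x_2} p_3^{x_3} \\
&= (1 - p_1)(1 - p_2)(1 - p_3)\, p_1^{x_1} p_2^{x_2}\, [1 + p_3 + p_3^2 + \dots] \\
&= (1 - p_1)(1 - p_2)(1 - p_3)\, p_1^{x_1} p_2^{x_2} \left( \frac{1}{1 - p_3} \right) \\
&= (1 - p_1)(1 - p_2)\, p_1^{x_1} p_2^{x_2}
\end{aligned}
\]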

ii) Find fX2 (x2 ).

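a) $(1 - p_1)$
b) $(1 - p_1)(1 - p_2)\, p_2^{x_2}$
c) $(1 - p_1)\, p_1^{x_1}$
d) $(1 - p_2)\, p_2^{x_2}$
Solution:

\[
\begin{aligned}
f_{X_2}(x_2) &= \sum_{x_1 = 0}^{\infty} f_{X_1 X_2}(x_1, x_2) \\
&= \sum_{x_1 = 0}^{\infty} (1 - p_1)(1 - p_2)\, p_1^{x_1} p_2^{x_2} \\
&= (1 - p_1)(1 - p_2)\, p_2^{x_2}\, [1 + p_1 + p_1^2 + \dots] \\
&= (1 - p_1)(1 - p_2)\, p_2^{x_2} \left( \frac{1}{1 - p_1} \right) \\
&= (1 - p_2)\, p_2^{x_2}
\end{aligned}
\]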

iii) Find P (X1 = 2, X3 = 5).


a) $p_1^2 p_3^5$
b) $(1 - p_1)(1 - p_3)\, p_1^2 p_3^5$
c) $(1 - p_1)(1 - p_2)\, p_1^2 p_2^5$
d) $(1 - p_1)(1 - p_3)$
Solution:
We have $f_{X_1 X_2}(x_1, x_2) = (1 - p_1)(1 - p_2)\, p_1^{x_1} p_2^{x_2}$.
Similarly, summing the joint pmf over $x_2$ gives
$f_{X_1 X_3}(x_1, x_3) = (1 - p_1)(1 - p_3)\, p_1^{x_1} p_3^{x_3}$.
Therefore $P(X_1 = 2, X_3 = 5) = f_{X_1 X_3}(2, 5) = (1 - p_1)(1 - p_3)\, p_1^2 p_3^5$.
9. A coin is tossed twice. Let X denote the number of heads on the first toss and Y denote
the total number of heads on the 2 tosses. If the coin is biased and a head has a 40%
chance of occurring,
i) Find P (Y = 2). Enter your answer correct to two decimals accuracy.
Solution:
X denotes the number of heads on the first toss and Y denotes the total number of heads
on the 2 tosses. This means X will take values in {0, 1} and Y will take values in {0, 1, 2}.

P (Y = 2) = P (Y = 2 | X = 0)P (X = 0) + P (Y = 2 | X = 1)P (X = 1)
= 0 + 0.4 × 0.4
= 0.16

ii) Find P (X = 1). Enter your answer correct to two decimals accuracy.

P (X = 1) = 0.40

iii) Find P (X = 1, Y = 1). Enter your answer correct to two decimals accuracy.

P (X = 1, Y = 1) = P (Y = 1 | X = 1)P (X = 1)
= 0.6 × 0.4
= 0.24

10. Let $X_1, X_2, X_3 \sim f_{X_1 X_2 X_3}$ where $X_i \in \{-1, 1\}$ for each i. If
$f_{X_1 X_2 X_3}(-1, -1, 1) = \frac18$, $f_{X_2}(1) = \frac16$ and $f_{X_3|X_2=-1}(1) = \frac15$,
find $f_{X_1|X_2=-1,X_3=1}(-1)$. Enter your answer correct to two decimals accuracy.
Solution:

\[
\begin{aligned}
f_{X_1|X_2=-1,X_3=1}(-1) &= \frac{f_{X_1 X_2 X_3}(-1, -1, 1)}{f_{X_2 X_3}(-1, 1)} \\
&= \frac{f_{X_1 X_2 X_3}(-1, -1, 1)}{f_{X_3|X_2=-1}(1)\, f_{X_2}(-1)} \\
&= \frac{f_{X_1 X_2 X_3}(-1, -1, 1)}{f_{X_3|X_2=-1}(1)\, (1 - f_{X_2}(1))} \\
&= \frac{\frac18}{\frac15 \left(1 - \frac16\right)} \\
&= \frac68 \\
&= 0.75
\end{aligned}
\]

11. Suppose that the number of people who visit a yoga academy each day is a Poisson
random variable with mean 30. Suppose further that each person who visits is, indepen-
dently, a girl with probability 0.5 or a boy with probability 0.5. Find the joint probability
that exactly 10 boys and 15 girls visit the yoga academy on any given day.
a) $\frac{e^{-30}\, 30^{25}}{15!\,10!}$
b) $\frac{e^{-15}\, 30^{25}}{15!\,10!}$
c) $\frac{e^{-8}\, 15^{25}}{15!\,10!}$
d) $\frac{e^{-30}\, 15^{25}}{15!\,10!}$

Solution:
Let X denote the number of boys who visit on a particular day.
Let Y denote the number of girls who visit on a particular day.
Let Z = X + Y be the total number of people who visit, so Z ∼ Poisson(30).
Observe that if x + y ≠ z, then P (X = x, Y = y | Z = z) = 0.
Therefore

\[
\begin{aligned}
P(X = 10, Y = 15) &= P(X = 10, Y = 15 | Z = 25)\, P(Z = 25) \\
&= P(X = 10, Y = 15 | Z = 25)\, \frac{e^{-30}\, 30^{25}}{25!}
\end{aligned}
\]
Given Z = 25, the number of boys X follows Binomial(25, p), where p = P(success) = 0.5
and a success means that a visitor is a boy.
Therefore
\[
\begin{aligned}
P(X = 10, Y = 15) &= {}^{25}C_{10}\, (0.5)^{10} (0.5)^{15}\, \frac{e^{-30}\, 30^{25}}{25!} \\
&= \frac{25!}{10!\,15!}\, (0.5)^{25}\, \frac{e^{-30}\, 30^{25}}{25!} \\
&= \frac{e^{-30}\, 15^{25}}{15!\,10!}
\end{aligned}
\]

Statistics for Data Science - 2
Week 2 Graded Assignment
Multiple random variables

1. Let X and Y be two independent discrete random variables with CDFs FX and FY ,
respectively. Define another random variable Z = min(X, Y ), then the CDF of Z is

a) min(FX , FY )
b) FX FY
c) FX + FY + FX FY
d) FX + FY − FX FY

Solution:

FZ (z) = P (Z ≤ z) = P (min(X, Y ) ≤ z)
= 1 − P (min(X, Y ) > z)
= 1 − P (X > z, Y > z)

Since X and Y are two independent discrete random variables,


P (X > z, Y > z) = P (X > z)P (Y > z)

FZ (z) = 1 − P (X > z)P (Y > z)


= 1 − [(1 − P (X ≤ z))(1 − P (Y ≤ z))]
= 1 − [(1 − FX (z))(1 − FY (z))]
= FX (z) + FY (z) − FX (z)FY (z)

2. Let X and Y be two independent random variables with PMFs


\[
f_X(k) = f_Y(k) =
\begin{cases}
\frac16 & \text{for } k = 1, 2, 3, 4, 5, 6 \\
0 & \text{otherwise}
\end{cases}
\]

Define Z = X − Y . Find the value of fZ (3).


a) 4/12
b) 3/12
c) 5/12
d) 1/12
Solution:

fZ (3) = P (Z = 3) = P (X − Y = 3)
= P (X = 4, Y = 1) + P (X = 5, Y = 2) + P (X = 6, Y = 3)

Given that X and Y are two independent random variables.


⇒ P (X = x, Y = y) = P (X = x)P (Y = y) for all (x, y).

fZ (3) = P (X = 4, Y = 1) + P (X = 5, Y = 2) + P (X = 6, Y = 3)
= P (X = 4)P (Y = 1) + P (X = 5)P (Y = 2) + P (X = 6)P (Y = 3)
= (1/6 × 1/6) + (1/6 × 1/6) + (1/6 × 1/6)
= 3/36
= 1/12

3. Let X ∼ Geometric(p) and Y ∼ Geometric(p) be independent and let Z = X + Y .


Determine the values of p for which P (Z = 26) > P (Z = 25).

a) p > 0.02
b) p < 0.04
c) p > 0.15
d) p < 0.30
e) p = 0.05

Solution:
If X ∼ Geometric(p) and Y ∼ Geometric(p) are two independent random variables and
Z = X + Y , then

\[ P(Z = n) = (n - 1)\, p^2 (1 - p)^{n-2} \]
(Try the derivation yourself.)
We have to find the values of p for which P (Z = 26) > P (Z = 25).
\[ P(Z = 26) = (26 - 1)\, p^2 (1 - p)^{26-2} \]
and
\[ P(Z = 25) = (25 - 1)\, p^2 (1 - p)^{25-2} \]

Comparing both, we will get

\[ 25\, p^2 (1 - p)^{24} > 24\, p^2 (1 - p)^{23} \]
⇒ 25(1 − p) > 24
⇒ 1 − p > 24/25
⇒ p < 0.04
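
The closed form for the sum of two independent geometric variables can be checked against a direct convolution. This sketch assumes the Geometric(p) convention with support {1, 2, ...} (number of trials up to and including the first success), which is the convention the formula above implies; the function names are illustrative.

```python
def geom_pmf(k, p):
    """P(X = k) for X ~ Geometric(p) on {1, 2, ...}."""
    return p * (1 - p) ** (k - 1) if k >= 1 else 0.0

def sum_pmf(n, p):
    """P(X + Y = n) by direct convolution of two independent Geometric(p)."""
    return sum(geom_pmf(i, p) * geom_pmf(n - i, p) for i in range(1, n))

def sum_pmf_closed(n, p):
    """Closed form (n - 1) p^2 (1 - p)^(n - 2)."""
    return (n - 1) * p**2 * (1 - p) ** (n - 2)

for p in (0.02, 0.03, 0.05, 0.15):
    # The comparison flips at p = 1/25 = 0.04.
    print(p, sum_pmf(26, p) > sum_pmf(25, p))
```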

4. The following options give the joint PMF of the random variables X and Y . If the
random variables X and Y are independent, then which of the following option(s) can
be the joint PMF of X and Y ?

Y
0 1 2
X

0 0.01 0 0

1 0.09 0.09 0

2 0 0 0.81

a)

Y
0 1 2
X

0 0.06 0.18 0.12

1 0.04 0.12 0.48

b)

         Y = 0    Y = 1    Y = 2
X = 0    1/12     1/24     1/24
X = 1    1/6      1/12     1/8
X = 2    1/4      1/8      1/12

c)

         Y = 0    Y = 1    Y = 2
X = 0    1/10     1/5      1/5
X = 1    1/10     1/10     3/10

d)

Y
0 1
X

0 0.10 0.15

1 0.20 0.30

2 0.10 0.15

e)

Answer: e

Solution:
In option a)
P (X = 0, Y = 1) = 0 but P (X = 0) = 0.01 + 0 + 0 = 0.01 and P (Y = 1) = 0 + 0.09 + 0 = 0.09
⇒ P (X = 0, Y = 1) ≠ P (X = 0)P (Y = 1)
Therefore, option (a) cannot be the joint PMF of X and Y.

In option b)
P (X = 0, Y = 0) = 0.06 but P (X = 0) = 0.06 + 0.18 + 0.12 = 0.36 and P (Y = 0) =
0.06 + 0.04 = 0.10
⇒ P (X = 0, Y = 0) = 0.06 ≠ 0.036 = P (X = 0)P (Y = 0)
Therefore, option (b) cannot be the joint PMF of X and Y.

In option c)
P (X = 1, Y = 0) = 1/6 but P (X = 1) = 1/6 + 1/12 + 1/8 = 3/8 and P (Y = 0) =
1/12 + 1/6 + 1/4 = 1/2
⇒ P (X = 1, Y = 0) = 1/6 ≠ 3/16 = P (X = 1)P (Y = 0)
Therefore, option (c) cannot be the joint PMF of X and Y.

In option d)
P (X = 0, Y = 1) = 1/5 but P (X = 0) = 1/10 + 1/5 + 1/5 = 1/2 and P (Y = 1) =
1/5 + 1/10 = 3/10
⇒ P (X = 0, Y = 1) = 1/5 ≠ 3/20 = P (X = 0)P (Y = 1)
Therefore, option (d) cannot be the joint PMF of X and Y.

In option e)
For every (x, y), P (X = x, Y = y) = P (X = x)P (Y = y) (check yourself)
Hence option (e) is the joint PMF of X and Y.

5. Let X and Y be two independent random variables such that X ∼ Bernoulli(0.2) and
Y ∼ Bernoulli(0.4). Let another random variable Z be defined as Z = X + Y . Find
the value of fX|Z=1 (1). Enter the answer correct to two decimal places.
Answer: [0.25, 0.29]

Solution:
X ∼ Bernoulli(0.2)
Y ∼ Bernoulli(0.4)
Z =X +Y

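\[
\begin{aligned}
f_{X|Z=1}(1) &= \frac{P(X = 1, Z = 1)}{P(Z = 1)} \\
&= \frac{P(X = 1, Y = 0)}{P(X = 1, Y = 0) + P(X = 0, Y = 1)} \\
&= \frac{P(X = 1)P(Y = 0)}{P(X = 1)P(Y = 0) + P(X = 0)P(Y = 1)} \\
&= \frac{0.2 \times 0.6}{(0.2 \times 0.6) + (0.8 \times 0.4)} \\
&= \frac{0.12}{0.44} \approx 0.27
\end{aligned}
\]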

6. Let $X_1$, $X_2$ and $X_3$ be three independent and identically distributed Poisson random
variables with $\lambda_i = 3$ for all i. Find the probability that exactly one of the $X_i$ equals 0
and exactly one of the $X_i$ equals 1.

(a) $18e^{-6}(1 - 4e^{-3})$
(b) $9e^{-6}(1 - 4e^{-3})$
(c) $3e^{-6}(1 - 4e^{-3})$
(d) $12e^{-6}(1 - 4e^{-3})$

Solution:
First we will find the probability that $X_1 = 0$, $X_2 = 1$ and $X_3$ takes neither the value 0
nor the value 1.
Since all three random variables are independent, we have
$P(X_1 = 0, X_2 = 1, X_3 \notin \{0, 1\}) = P(X_1 = 0)\, P(X_2 = 1)\, P(X_3 \notin \{0, 1\})$
Now, $P(X_1 = 0) = e^{-3}$, $P(X_2 = 1) = 3e^{-3}$
$P(X_3 \notin \{0, 1\}) = 1 - P(X_3 \in \{0, 1\})$
$= 1 - [P(X_3 = 0) + P(X_3 = 1)]$
$= 1 - 4e^{-3}$
Now, $P(X_1 = 0, X_2 = 1, X_3 \notin \{0, 1\}) = e^{-3} \cdot 3e^{-3}\,(1 - 4e^{-3}) = 3e^{-6}(1 - 4e^{-3})$
The roles (equals 0, equals 1, neither) can be assigned to the three $X_i$ in 3! = 6 ways.
Therefore, the probability that exactly one $X_i$ equals 0 and exactly one $X_i$ equals 1 is
$6 \times 3e^{-6}(1 - 4e^{-3}) = 18e^{-6}(1 - 4e^{-3})$.
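
The counting argument above can be cross-checked by brute-force enumeration over a truncated support. This is only a sketch: the truncation point `K = 40` is an assumption chosen so that the neglected Poisson(3) tail is negligible, and the names are illustrative.

```python
import math
from itertools import product

lam = 3

def pois(k):
    """P(X = k) for X ~ Poisson(3)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Closed form from the solution: 18 e^{-6} (1 - 4 e^{-3}).
closed = 18 * math.exp(-6) * (1 - 4 * math.exp(-3))

# Enumerate all triples with values below K and keep those with exactly
# one 0 and exactly one 1 among the three coordinates.
K = 40
brute = sum(
    pois(a) * pois(b) * pois(c)
    for a, b, c in product(range(K), repeat=3)
    if (a, b, c).count(0) == 1 and (a, b, c).count(1) == 1
)
print(closed, brute)
```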
7. Let X ∼ Bernoulli(0.3) and Y ∼ Bernoulli(0.5) be independent. Define Z = X+Y −XY ,
find the distribution of Z.
(a) Z ∼ Bernoulli(0.35)
(b) Z ∼ Bernoulli(0.65)
(c) Z ∼ Bernoulli(0.15)
(d) Z ∼ Bernoulli(0.85)
Solution:
Given X ∼ Bernoulli(0.3) and Y ∼ Bernoulli(0.5) are independent.
Z = X + Y − XY
Since X ∼ Bernoulli(0.3), it will take the values in {0, 1}.
Similarly Y will take values in {0, 1}.
Also X and Y are given to be independent, therefore fXY (x, y) = fX (x)fY (y) for all
(x, y).
Consider the following joint distribution table of X and Y .

X Y X + Y − XY fXY (x, y)
1 1 1 (0.3)(0.5) = 0.15
1 0 1 (0.3)(0.5) = 0.15
0 1 1 (0.7)(0.5) = 0.35
0 0 0 (0.7)(0.5) = 0.35

The range of Z is {0, 1}.


P (Z = 0) = 0.35, P (Z = 1) = 0.65
Therefore, Z ∼ Bernoulli(0.65).

8. Let X and Y be independent and identically distributed Geometric random variables
with parameter 0.6.

(a) Find P (X = 2 | X + Y = 11). Enter the answer correct to one decimal place.
Answer: 0.1

Solution:
Given X, Y ∼ i.i.d. Geometric(0.6).

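\[
\begin{aligned}
P(X = 2 | X + Y = 11) &= \frac{P(X = 2, X + Y = 11)}{P(X + Y = 11)} \\
&= \frac{P(X = 2)\, P(Y = 9)}{\sum_{i=1}^{10} P(X = i, Y = 11 - i)} \\
&= \frac{(0.4)(0.6) \times (0.4)^8 (0.6)}{\sum_{i=1}^{10} P(X = i)\, P(Y = 11 - i)} \\
&= \frac{(0.4)^9 (0.6)^2}{10\, (0.4)^9 (0.6)^2} = 0.1
\end{aligned}
\]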

(b) Find P (X = 7 | X + Y = 11). Enter the answer correct to one decimal place.
Answer: 0.1

Solution:
This can also be solved using the above steps.
Remark: For any i ∈ {1, . . . , n − 1}, P (X = i | X + Y = n) = 1/(n − 1).
9. The joint distribution of X and Y is given by
\[ f_{XY}(x, y) = \frac{9}{16 \times 4^{x+y}}, \]
where x, y ∈ {0, 1, 2, . . .}.

(i) Find the probability mass function of X + Y .

(a) $k \cdot \frac{9}{16 \cdot 4^{k}}$
(b) $(k + 1) \cdot \frac{9}{16 \cdot 4^{k}}$
(c) $(k + 1) \cdot \frac{9}{16 \cdot 4^{k+1}}$
(d) $k \cdot \frac{9}{16 \cdot 4^{k+1}}$

Solution:
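X and Y each take values in {0, 1, 2, . . .}.
The range of X + Y is {0, 1, 2, . . .}. For X + Y = k there are k + 1 pairs (x, y), each
with probability $\frac{9}{16 \cdot 4^{k}}$.

X + Y    f_{X+Y}(k)
0        $\frac{9}{16}$
1        $2 \cdot \frac{9}{16 \times 4}$
2        $3 \cdot \frac{9}{16 \times 4^2}$
...      ...
k        $(k + 1) \cdot \frac{9}{16 \cdot 4^{k}}$

Therefore, $P(X + Y = k) = (k + 1) \cdot \frac{9}{16 \cdot 4^{k}}$, which is option (b).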
(ii) Find the probability mass function of Z = max{X, Y }.
(a) $f_Z(k) = \frac{3(4^k - 1)}{2 \cdot 4^{2k}}$ for k = 1, 2, . . .
(b) $f_Z(k) = \frac{3(4^k - 1)}{2 \cdot 4^{2k}}$ for k = 0, 1, . . .
(c) $f_Z(k) = \frac{9}{16 \cdot 4^{2k}} + \frac{3(4^k - 1)}{2 \cdot 4^{2k}}$ for k = 0, 1, . . .
(d) $f_Z(k) = \frac{9}{16 \cdot 4^{2k}} + \frac{3(4^k - 1)}{2 \cdot 4^{2k}}$ for k = 1, 2, . . .
Solution:
The joint distribution of X and Y is given by
9
fXY (x, y) =
16 × 4x+y

Let Z = max{X, Y }.
Clearly, range of Z will be TZ = {0, 1, 2, . . .}.

Now, choose k ∈ TZ . Then

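\[
\begin{array}{c|l}
Z & f_Z(k) \\ \hline
0 & \dfrac{9}{16} \\
1 & 2 \cdot \dfrac{9}{16 \cdot 4} + \dfrac{9}{16 \cdot 4^{2}} \\
2 & 2 \cdot \dfrac{9}{16 \cdot 4^{2}} + 2 \cdot \dfrac{9}{16 \cdot 4^{3}} + \dfrac{9}{16 \cdot 4^{4}} \\
\vdots & \vdots \\
k & 2 \cdot \dfrac{9}{16 \cdot 4^{k}} \left(1 + \dfrac{1}{4} + \dfrac{1}{4^{2}} + \ldots + \dfrac{1}{4^{k-1}}\right) + \dfrac{9}{16 \cdot 4^{2k}}
\end{array}
\]

Now,
\[
\begin{aligned}
f_Z(k) &= 2 \cdot \frac{9}{16 \cdot 4^{k}} \left(1 + \frac{1}{4} + \frac{1}{4^{2}} + \ldots + \frac{1}{4^{k-1}}\right) + \frac{9}{16 \cdot 4^{2k}} \\
&= 2 \cdot \frac{9}{16 \cdot 4^{k}} \left(\frac{1 - \left(\frac{1}{4}\right)^{k}}{1 - \frac{1}{4}}\right) + \frac{9}{16 \cdot 4^{2k}} \\
&= \frac{9}{16 \cdot 4^{2k}} + \frac{3(4^{k} - 1)}{2 \cdot 4^{2k}}
\end{aligned}
\]
Therefore, $f_Z(k) = \dfrac{9}{16 \cdot 4^{2k}} + \dfrac{3(4^{k} - 1)}{2 \cdot 4^{2k}}$ for k = 0, 1, . . .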
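As a sanity check on the closed form, the pmf of max{X, Y} can also be obtained by summing the joint pmf directly over the cells where the maximum equals k. This is a minimal sketch; the function names are illustrative and not part of the assignment.

```python
from fractions import Fraction

def joint_pmf(x, y):
    """Joint pmf 9 / (16 * 4^(x+y)) on x, y in {0, 1, 2, ...}."""
    return Fraction(9, 16 * 4 ** (x + y))

def max_pmf_direct(k):
    """P(max{X, Y} = k): the diagonal cell plus the two arms below it."""
    diagonal = joint_pmf(k, k)
    arms = sum(joint_pmf(k, i) + joint_pmf(i, k) for i in range(k))
    return diagonal + arms

def max_pmf_closed_form(k):
    """Closed form derived in the solution."""
    return Fraction(9, 16 * 4 ** (2 * k)) + Fraction(3 * (4 ** k - 1), 2 * 4 ** (2 * k))

for k in range(6):
    print(k, max_pmf_direct(k), max_pmf_closed_form(k))
```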
10. Let the random variables X and Y , which represent the number of calls received by call
centers A and B, respectively, in a one-hour interval follow the Poisson distribution. The
average number of calls received in call centers A and B is 2 per hour and 3 per hour,
respectively. Assume that X and Y are independent. If Z denotes the total number of
calls received in call centers A and B, find the conditional probability fY |Z=5 (3). Enter
the answer correct to two decimal places.
Answer: [0.33, 0.36]

Solution:
Given, X and Y denote the number of calls received by call center A and B in a one-hour
interval.
Also, X ∼ Poisson(2) and Y ∼ Poisson(3) are independent.

Z = X + Y represents the total calls received in a one-hour interval.


 
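Now, $Y \mid Z = 5 \sim \text{Binomial}\left(5, \dfrac{3}{2+3}\right)$.
Therefore, $f_{Y \mid Z=5}(3) = {}^{5}C_{3} \left(\dfrac{3}{5}\right)^{3} \left(\dfrac{2}{5}\right)^{2} = 0.3456 \approx 0.35$.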
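The Binomial shortcut can be compared with the direct definition of the conditional probability. The sketch below assumes only the Poisson pmf formula; `poisson_pmf` is a helper defined here for illustration.

```python
from math import comb, exp, factorial

def poisson_pmf(k, lam):
    """P(N = k) for N ~ Poisson(lam)."""
    return exp(-lam) * lam ** k / factorial(k)

# Direct definition: P(Y = 3 | X + Y = 5) with X ~ Poisson(2), Y ~ Poisson(3).
numerator = poisson_pmf(2, 2) * poisson_pmf(3, 3)
denominator = sum(poisson_pmf(5 - y, 2) * poisson_pmf(y, 3) for y in range(6))
direct = numerator / denominator

# Binomial(5, 3/5) shortcut used in the solution.
shortcut = comb(5, 3) * (3 / 5) ** 3 * (2 / 5) ** 2

print(round(direct, 4), round(shortcut, 4))
```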

Statistics for Data Science - 2
Week 2 Practice Assignment
Multiple random variables

1. Consider an experiment of tossing a fair coin twice. Let X be the number of heads that
occurs in the two tosses and Y be the number of tails that occurs in the two tosses.
Based on the given information, choose the correct statements.

(a) X and Y are independent random variables.


(b) X and Y are dependent random variables.
(c) $f_{XY}(1, 1) = \dfrac{1}{2}$.
(d) $f_{Y|X=0}(1) = \dfrac{1}{4}$.
Solution:
X denotes the number of heads that occurs in the two tosses and Y denotes the number
of tails that occurs in the two tosses.
First we will make the table of the joint pmf of X and Y .

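\[
\begin{array}{c|ccc}
Y \backslash X & 0 & 1 & 2 \\ \hline
0 & 0 & 0 & \frac{1}{4} \\
1 & 0 & \frac{1}{2} & 0 \\
2 & \frac{1}{4} & 0 & 0
\end{array}
\]

Joint pmf of X and Y.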

From the table, we have


\[
f_X(0) = 0 + 0 + \frac{1}{4} = \frac{1}{4}, \qquad f_Y(0) = 0 + 0 + \frac{1}{4} = \frac{1}{4}
\]
and
\[
f_{XY}(0, 0) = 0.
\]

It is clear that
\[
f_{XY}(0, 0) \neq f_X(0) \cdot f_Y(0).
\]
It implies that X and Y are dependent random variables.
So, option (a) is incorrect and option (b) is correct.

Now, from table


\[
f_{XY}(1, 1) = \frac{1}{2}
\]
So, option (c) is correct.

\[
f_{Y|X=0}(1) = \frac{f_{XY}(0, 1)}{f_X(0)} = 0 \quad (\text{since } f_{XY}(0, 1) = 0)
\]
So, option (d) is incorrect.

2. Two fair dice are thrown simultaneously. Let X be the outcome on the first die and Y
be the sum of the outcomes on both the dice. Find the value of P (Y − X ≥ 6).
(a) $\dfrac{1}{6}$
(b) $\dfrac{1}{12}$
(c) $\dfrac{5}{12}$
(d) $\dfrac{1}{24}$
Solution:
X denotes the outcome on the first die and Y denotes the sum of the outcomes on both
the dice.
Notice that Y − X will denote the outcome on the second die.
Let Z = Y − X, then Z ∼ Uniform({1, 2, 3, 4, 5, 6})

\[
P(Y - X \geq 6) = P(Z \geq 6) = P(Z = 6) = \frac{1}{6}
\]

3. Let X and Y denote the number of cars and number of bikes reaching a street corner
during a certain 15-minute time period, respectively. Joint distribution of X and Y is
given as
\[
f_{XY}(x, y) = \frac{9}{16 \cdot 4^{x+y}}
\]
Choose the correct option(s).
(a) Marginal pmf of X is $f_X(x) = \dfrac{3}{4^{x+1}}$.
(b) Marginal pmf of X is $f_X(x) = \dfrac{3}{4^{x}}$.
(c) X and Y are independent random variables.
(d) X and Y are dependent random variables.

Solution:
X and Y denote the number of cars and number of bikes reaching a street corner during
a certain 15-minute time period, respectively.
The ranges of X and Y are $T_X = T_Y = \{0, 1, 2, \ldots\}$.

Joint distribution of X and Y is given as
\[
f_{XY}(x, y) = \frac{9}{16 \cdot 4^{x+y}}
\]

Now,

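\[
\begin{aligned}
f_X(x) &= \sum_{y=0}^{\infty} f_{XY}(x, y) \\
&= \sum_{y=0}^{\infty} \frac{9}{16 \cdot 4^{x+y}} \\
&= \frac{9}{16 \cdot 4^{x}} \sum_{y=0}^{\infty} \frac{1}{4^{y}} \\
&= \frac{9}{16 \cdot 4^{x}} \left(1 + \frac{1}{4} + \frac{1}{4^{2}} + \ldots\right) \\
&= \frac{9}{16 \cdot 4^{x}} \left(\frac{1}{1 - \frac{1}{4}}\right) \\
&= \frac{9}{4^{2} \cdot 4^{x}} \left(\frac{4}{3}\right) \\
&= \frac{3}{4^{x+1}}
\end{aligned}
\]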

Therefore, option (a) is correct and option (b) is incorrect.

Similarly, we can show that


\[
f_Y(y) = \frac{3}{4^{y+1}}
\]

Now, choose two arbitrary points x and y in the range of X and Y, respectively. Then
\[
f_X(x) \cdot f_Y(y) = \frac{3}{4^{x+1}} \cdot \frac{3}{4^{y+1}} = \frac{9}{16 \cdot 4^{x+y}} = f_{XY}(x, y)
\]

Hence, X and Y are independent random variables.


Therefore, option (c) is correct and option (d) is incorrect.

4. Let X1 , X2 , X3 and X4 be four independent and identically distributed Poisson random


variables with λi = 4 for all i. Find the probability that exactly one of the Xi equals 0
and exactly one of the Xi equals 1?

(a) $24e^{-8}(1 - 25e^{-8})$
(b) $24e^{-8}(1 - 5e^{-4})^{2}$
(c) $48e^{-8}(1 - e^{-8})$
(d) $48e^{-8}(1 - 5e^{-4})^{2}$

Solution:
First we will find the probability such that X1 = 0, X2 = 1 and other two random
variables do not take value 0 and 1.

Since all four random variable are independent, we have

\[
P(X_1 = 0, X_2 = 1, X_3 \notin \{0, 1\}, X_4 \notin \{0, 1\}) = P(X_1 = 0)\,P(X_2 = 1)\,P(X_3 \notin \{0, 1\})\,P(X_4 \notin \{0, 1\}) \quad \ldots(1)
\]

Now,
\[
P(X_1 = 0) = \frac{e^{-4} 4^{0}}{0!} = e^{-4}, \qquad P(X_2 = 1) = \frac{e^{-4} 4^{1}}{1!} = 4e^{-4}
\]
\[
P(X_3 \notin \{0, 1\}) = 1 - [P(X_3 = 0) + P(X_3 = 1)] = 1 - [e^{-4} + 4e^{-4}] = 1 - 5e^{-4}
\]
\[
P(X_4 \notin \{0, 1\}) = 1 - [P(X_4 = 0) + P(X_4 = 1)] = 1 - [e^{-4} + 4e^{-4}] = 1 - 5e^{-4}
\]

Putting all these values in equation (1), we get
\[
P(X_1 = 0, X_2 = 1, X_3 \notin \{0, 1\}, X_4 \notin \{0, 1\}) = e^{-4}(4e^{-4})(1 - 5e^{-4})^{2}
\]

The ordered pair of indices (i, j) with $X_i = 0$ and $X_j = 1$ can be chosen in ${}^{4}P_{2} = 12$ ways.
Therefore, the probability that exactly one of the $X_i$ equals 0 and exactly one of the $X_i$ equals 1 is given by
\[
{}^{4}P_{2}\; e^{-4}(4e^{-4})(1 - 5e^{-4})^{2} = 48e^{-8}(1 - 5e^{-4})^{2}
\]
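The counting argument can be verified by brute force: label each of the four variables as "zero", "one" or "other" and add up the probabilities of all labelings with exactly one "zero" and exactly one "one". This is a sketch for checking the result only; the category labels are made up for this example.

```python
from itertools import product
from math import exp

lam = 4
p0 = exp(-lam)          # P(X_i = 0)
p1 = lam * exp(-lam)    # P(X_i = 1)
p_other = 1 - p0 - p1   # P(X_i is neither 0 nor 1)
category_prob = {"zero": p0, "one": p1, "other": p_other}

# Sum over all labelings of the four variables with exactly one "zero" and one "one".
total = 0.0
for labels in product(category_prob, repeat=4):
    if labels.count("zero") == 1 and labels.count("one") == 1:
        prob = 1.0
        for label in labels:
            prob *= category_prob[label]
        total += prob

closed_form = 48 * exp(-8) * (1 - 5 * exp(-4)) ** 2
print(total, closed_form)
```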

5. A person tosses a fair coin until it shows a head and rolls a fair die until it shows the
number six. Assume that tossing the coin is independent of rolling the die. What is the
probability that number of tosses to get first head is equal to number of rolls to get the
first six? Write your answer correct to two decimal places.
Solution
Let X denote the number of tosses of the coin to get the first head and let Y denote the
number of rolls of the die to get the first six.
Clearly, $X \sim \text{Geometric}\left(\frac{1}{2}\right)$ and $Y \sim \text{Geometric}\left(\frac{1}{6}\right)$.
2 6

Number of tosses to get first head will be equal to number of rolls to get the first six if

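(X = 1, Y = 1) or (X = 2, Y = 2) or (X = 3, Y = 3) and so on. Therefore,

\[
\begin{aligned}
P(X = Y) &= P(X = 1, Y = 1) + P(X = 2, Y = 2) + P(X = 3, Y = 3) + \ldots \\
&= \sum_{i=1}^{\infty} P(X = i, Y = i) \\
&= \sum_{i=1}^{\infty} P(X = i)\,P(Y = i) \\
&= \sum_{i=1}^{\infty} \left(\frac{1}{2}\right)^{i} \left(\frac{5}{6}\right)^{i-1} \frac{1}{6} \\
&= \frac{1}{12} \sum_{i=1}^{\infty} \left(\frac{1}{2}\right)^{i-1} \left(\frac{5}{6}\right)^{i-1} \\
&= \frac{1}{12} \sum_{i=1}^{\infty} \left(\frac{5}{12}\right)^{i-1} \\
&= \frac{1}{12} \left[1 + \frac{5}{12} + \left(\frac{5}{12}\right)^{2} + \ldots\right] \\
&= \frac{1}{12} \left(\frac{1}{1 - 5/12}\right) \\
&= \frac{1}{7} \approx 0.14
\end{aligned}
\]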
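The geometric series can be checked with a partial sum. The sketch below uses exact fractions; the generic closed form pq / (1 − (1 − p)(1 − q)) is the same series written for arbitrary success probabilities p and q, and the helper name is illustrative.

```python
from fractions import Fraction

p_coin = Fraction(1, 2)  # probability of a head on one toss
p_die = Fraction(1, 6)   # probability of a six on one roll

def geometric_pmf(k, p):
    """P(X = k) for a Geometric(p) variable supported on {1, 2, ...}."""
    return (1 - p) ** (k - 1) * p

# Partial sum of P(X = i) P(Y = i); the terms shrink like (5/12)^i.
partial = sum(geometric_pmf(i, p_coin) * geometric_pmf(i, p_die) for i in range(1, 60))

# Closed form of the full series for independent geometric variables.
closed_form = p_coin * p_die / (1 - (1 - p_coin) * (1 - p_die))
print(float(partial), closed_form)
```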

6. Let X and Y be i.i.d. Geometric(p), where p ∈ [0, 1] is a constant. Find the value of
P (X = 6|X + Y = 10). Write your answer correct to two decimal places.
Solution:
Given that X and Y are i.i.d. Geometric(p).
To find: P (X = 6|X + Y = 10).

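\[
\begin{aligned}
P(X = 6 \mid X + Y = 10) &= \frac{P(X = 6,\, X + Y = 10)}{P(X + Y = 10)} \\
&= \frac{P(X = 6,\, Y = 4)}{P(X + Y = 10)} \\
&= \frac{P(X = 6)\,P(Y = 4)}{P(X + Y = 10)} \quad \ldots(1)
\end{aligned}
\]

Now,
\[
\begin{aligned}
P(X + Y = 10) &= \sum_{i=1}^{9} P(X = i,\, Y = 10 - i) \\
&= \sum_{i=1}^{9} P(X = i)\,P(Y = 10 - i) \\
&= \sum_{i=1}^{9} (1 - p)^{i-1} p \cdot (1 - p)^{9-i} p \\
&= \sum_{i=1}^{9} p^{2}(1 - p)^{8} \\
&= 9p^{2}(1 - p)^{8} \quad \ldots(2)
\end{aligned}
\]

From equations (1) and (2), we have
\[
P(X = 6 \mid X + Y = 10) = \frac{(1 - p)^{5} p \cdot (1 - p)^{3} p}{9p^{2}(1 - p)^{8}} = \frac{(1 - p)^{8} p^{2}}{9p^{2}(1 - p)^{8}} = \frac{1}{9} \approx 0.11
\]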

7. Two dice are rolled simultaneously. Let X denote the greatest outcome (if both the
outcomes are same, then X will be that outcome) and Y denote the lowest outcome (if
both the outcomes are same, then Y will be that outcome). Choose the correct options.
(a) X is uniformly distributed.
(b) Y is uniformly distributed.
(c) X and Y are independent random variables.
(d) X and Y are not independent.
(e) Joint distribution of X and Y is Uniform{1, 2, 3, 4, 5, 6}.
Solution:
Given that X denotes the greater outcome and Y denotes the lower outcome.
Clearly, TX = TY = {1, 2, . . . , 6}.

First, we will find the joint distribution of X and Y .


Note that if x < y, fXY (x, y) = 0.

If x > y, $f_{XY}(x, y) = P(\text{one of the two outcomes is } x \text{ and the other is } y) = 2 \times \dfrac{1}{36} = \dfrac{1}{18}$.
If x = y, $f_{XY}(x, y) = P(\text{both outcomes are equal to } x) = \dfrac{1}{36}$.

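\[
\begin{array}{c|cccccc}
Y \backslash X & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline
1 & \frac{1}{36} & \frac{1}{18} & \frac{1}{18} & \frac{1}{18} & \frac{1}{18} & \frac{1}{18} \\
2 & 0 & \frac{1}{36} & \frac{1}{18} & \frac{1}{18} & \frac{1}{18} & \frac{1}{18} \\
3 & 0 & 0 & \frac{1}{36} & \frac{1}{18} & \frac{1}{18} & \frac{1}{18} \\
4 & 0 & 0 & 0 & \frac{1}{36} & \frac{1}{18} & \frac{1}{18} \\
5 & 0 & 0 & 0 & 0 & \frac{1}{36} & \frac{1}{18} \\
6 & 0 & 0 & 0 & 0 & 0 & \frac{1}{36}
\end{array}
\]

Joint distribution of X and Y.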

Option (a):
From the table, we know that $f_X(1) = \dfrac{1}{36}$, $f_X(2) = \dfrac{3}{36}$ and so on.
It implies that X is not uniformly distributed.
Hence, option (a) is incorrect.

Option (b):
From the table, we know that $f_Y(1) = \dfrac{11}{36}$, $f_Y(2) = \dfrac{9}{36}$ and so on.
It implies that Y is not uniformly distributed.
Hence, option (b) is incorrect.

Option (c) and (d):


From the table, we know that $f_{XY}(1, 2) = 0$, while $f_X(1) = \dfrac{1}{36}$ and $f_Y(2) = \dfrac{9}{36}$.

Therefore $f_{XY}(1, 2) \neq f_X(1) \cdot f_Y(2)$.

It implies that X and Y are not independent. Hence, option (c) is incorrect and option (d) is correct.

Option(e):
It is clear from the table that the joint distribution of X and Y is not uniform. Hence, option (e) is incorrect.

8. Let X and Y be i.i.d. Uniform{−1, 0, 1}. Find the value of P (X = 0 | |X + Y | = 1).


Solution:
Given that X and Y are i.i.d. Uniform{−1, 0, 1}
To find: P (X = 0 | |X + Y | = 1).
Let Z = |X + Y |

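\[
\begin{aligned}
P(X = 0 \mid Z = 1) &= \frac{P(X = 0,\, |X + Y| = 1)}{P(|X + Y| = 1)} \\
&= \frac{P(X = 0,\, |Y| = 1)}{P(|X + Y| = 1)} \\
&= \frac{P(X = 0,\, Y = \pm 1)}{P(|X + Y| = 1)} \\
&= \frac{P(X = 0, Y = 1) + P(X = 0, Y = -1)}{P(X = 0, Y = 1) + P(X = 0, Y = -1) + P(X = 1, Y = 0) + P(X = -1, Y = 0)} \\
&= \frac{\frac{1}{9} + \frac{1}{9}}{\frac{1}{9} + \frac{1}{9} + \frac{1}{9} + \frac{1}{9}} \\
&= \frac{1}{2}
\end{aligned}
\]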

9. The joint distribution of X and Y is given by


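\[
f_{XY}(x, y) = \frac{1}{2^{x+y}},
\]
where $T_X$ and $T_Y$ are both {1, 2, 3, . . .}. Find the probability mass function of max{X, Y}.
(a) $f_{\max\{X,Y\}}(k) = \dfrac{1}{4^{k}} + \dfrac{2^{k} - 1}{2^{k-1}}$
(b) $f_{\max\{X,Y\}}(k) = \dfrac{1}{2^{k}} + \dfrac{2^{k} - 1}{2^{k-1}}$
(c) $f_{\max\{X,Y\}}(k) = \dfrac{1}{2^{2k}} + \dfrac{2^{k-1} - 1}{2^{k-2}}$
(d) $f_{\max\{X,Y\}}(k) = \dfrac{1}{2^{2k}} + \dfrac{2^{k-1} - 1}{2^{2k-2}}$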

Solution:
The joint distribution of X and Y is given by
\[
f_{XY}(x, y) = \frac{1}{2^{x+y}}
\]

Let Z = max{X, Y }.
Clearly, range of Z will be TZ = {1, 2, 3, . . .}.

Now, choose k ∈ TZ . Then

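\[
\begin{aligned}
f_Z(k) &= P(\max\{X, Y\} = k) \\
&= P(X = k, Y = k) + P(X = k, Y < k) + P(X < k, Y = k) \\
&= f_{XY}(k, k) + \sum_{i=1}^{k-1} f_{XY}(k, i) + \sum_{i=1}^{k-1} f_{XY}(i, k) \\
&= \frac{1}{2^{2k}} + \sum_{i=1}^{k-1} \frac{1}{2^{k+i}} + \sum_{i=1}^{k-1} \frac{1}{2^{k+i}} \\
&= \frac{1}{2^{2k}} + 2 \sum_{i=1}^{k-1} \frac{1}{2^{k+i}} \\
&= \frac{1}{2^{2k}} + \frac{2}{2^{k}} \sum_{i=1}^{k-1} \frac{1}{2^{i}} \\
&= \frac{1}{2^{2k}} + \frac{1}{2^{k-1}} \left(\frac{1}{2} + \frac{1}{2^{2}} + \ldots + \frac{1}{2^{k-1}}\right) \\
&= \frac{1}{2^{2k}} + \frac{1}{2^{k}} \left(1 + \frac{1}{2} + \ldots + \frac{1}{2^{k-2}}\right) \\
&= \frac{1}{2^{2k}} + \frac{1}{2^{k}} \left(\frac{1 - (1/2)^{k-1}}{1 - 1/2}\right) \\
&= \frac{1}{2^{2k}} + \frac{2}{2^{k}} \left(\frac{2^{k-1} - 1}{2^{k-1}}\right) \\
&= \frac{1}{2^{2k}} + \frac{2^{k-1} - 1}{2^{2k-2}}
\end{aligned}
\]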

10. Let X ∼ Binomial(5, 1/2) and Y ∼ Binomial(4, 1/4) be two independent random vari-
ables. Find the value of P (max{X, Y } = 1). Write your answer correct to two decimal
points.

Solution:
Given that X ∼ Binomial(5, 1/2) and Y ∼ Binomial(4, 1/4) are two independent ran-
dom variables.

To find: P (max{X, Y } = 1)

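\[
\begin{aligned}
P(\max\{X, Y\} = 1) &= P(X = 1, Y = 1) + P(X < 1, Y = 1) + P(X = 1, Y < 1) \\
&= P(X = 1)\,P(Y = 1) + P(X < 1)\,P(Y = 1) + P(X = 1)\,P(Y < 1) \\
&= P(X = 1)\,P(Y = 1) + P(X = 0)\,P(Y = 1) + P(X = 1)\,P(Y = 0) \\
&= \left({}^{5}C_{1} \tfrac{1}{2} \left(\tfrac{1}{2}\right)^{4}\right) \left({}^{4}C_{1} \tfrac{1}{4} \left(\tfrac{3}{4}\right)^{3}\right) + \left({}^{5}C_{0} \left(\tfrac{1}{2}\right)^{5}\right) \left({}^{4}C_{1} \tfrac{1}{4} \left(\tfrac{3}{4}\right)^{3}\right) \\
&\quad + \left({}^{5}C_{1} \tfrac{1}{2} \left(\tfrac{1}{2}\right)^{4}\right) \left({}^{4}C_{0} \left(\tfrac{3}{4}\right)^{4}\right) \\
&= \frac{20 \times 3^{3}}{2^{5} \times 4^{4}} + \frac{4 \times 3^{3}}{2^{5} \times 4^{4}} + \frac{5 \times 3^{4}}{2^{5} \times 4^{4}} \\
&= \frac{3^{3}}{2^{5} \times 4^{4}} [20 + 4 + 15] \\
&= \frac{27 \times 39}{32 \times 256} = 0.128 \approx 0.13
\end{aligned}
\]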
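An alternative route to the same number uses P(max ≤ 1) − P(max ≤ 0), where each term factorises because X and Y are independent. The sketch below is a cross-check with helper functions written just for this purpose.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(binomial_pmf(j, n, p) for j in range(k + 1))

# P(max <= m) = P(X <= m) P(Y <= m) by independence.
p_max_le_1 = binomial_cdf(1, 5, 0.5) * binomial_cdf(1, 4, 0.25)
p_max_le_0 = binomial_cdf(0, 5, 0.5) * binomial_cdf(0, 4, 0.25)
p_max_eq_1 = p_max_le_1 - p_max_le_0
print(round(p_max_eq_1, 4))
```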

Statistics for Data Science - 2
Week 3 Graded assignment solutions

1. Suppose 1 in 100 products that are coming out of a production line is defective. Sup-
pose we randomly pick and keep aside products from the production line till the first
defective item is obtained. Let the random variable X represent the number of prod-
ucts that are kept aside (Assume that the first defective item is also kept aside). Find
Var(X).
(a) $\dfrac{1}{100}$
(b) $\dfrac{99}{100}$
(c) 100
(d) 9900

Solution:
The random variable X represents the number of products that are kept aside (including the first defective item) until the first defective item is obtained.
It is given that 1 out of 100 products is defective.
Therefore, $X \sim \text{Geometric}\left(\dfrac{1}{100}\right)$.
Now,
\[
\text{Var}(X) = \frac{1 - p}{p^{2}} = \frac{1 - \frac{1}{100}}{\left(\frac{1}{100}\right)^{2}} = 9900
\]

Hence, the correct option is (d).

2. Two coins are tossed. The probabilities of occurrence of tail on the first and the second
coin are 0.6 and 0.4, respectively. If the random variable X represents the number of
heads obtained, find the expected value of X. (Enter the answer correct to 2 decimal
points).
Answer: 1
Solution:

Given,

P (tail occurs on the first coin) = 0.6. (1)


P (tail occurs on the second coin) = 0.4. (2)

Random variable X denote the number of heads obtained after the tossing of two coins.
Therefore, X will take the values in {0, 1, 2}.
Now,
\[
E(X) = \sum_{x \in \{0, 1, 2\}} x\,P(X = x) = 0 \cdot P(X = 0) + 1 \cdot P(X = 1) + 2 \cdot P(X = 2)
\]
\[
\begin{aligned}
P(X = 1) &= P(\text{H on first coin and T on second coin}) + P(\text{T on first coin and H on second coin}) \\
&= (0.4 \times 0.4) + (0.6 \times 0.6) \\
&= 0.52
\end{aligned}
\]
\[
P(X = 2) = P(\text{H on both the coins}) = 0.4 \times 0.6 = 0.24
\]
Therefore, E(X) = 0.52 + (2 × 0.24) = 1

3. Let the two random variables X and Y be independent with means equal to 10 and
20, and variances equal to 2 and 4, respectively. Find the value of Var(XY ).
Hint: If X and Y are independent, X 2 and Y 2 are also independent.
Answer: 1208
Solution:
Mean and variance of X is 10 and 2, respectively.
Mean and variance of Y is 20 and 4, respectively.

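\[
\begin{aligned}
\text{Var}(XY) &= E[(XY)^{2}] - (E[XY])^{2} \\
&= E[X^{2} Y^{2}] - (E[X]E[Y])^{2}, \quad \text{since } X \text{ and } Y \text{ are independent} \\
&= E[X^{2}]E[Y^{2}] - E[X]^{2} E[Y]^{2}, \quad \text{since } X^{2} \text{ and } Y^{2} \text{ are independent} \\
&= (\text{Var}(X) + E[X]^{2})(\text{Var}(Y) + E[Y]^{2}) - E[X]^{2} E[Y]^{2} \\
&= (2 + 10^{2})(4 + 20^{2}) - 10^{2} \cdot 20^{2} \\
&= (102 \times 404) - 40000 \\
&= 41208 - 40000 \\
&= 1208
\end{aligned}
\]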
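The last four lines of the derivation depend only on the two means and two variances, so they can be wrapped in a small function. This is a sketch; `variance_of_product` is a name chosen here for illustration, and the formula is valid only when X and Y are independent.

```python
def variance_of_product(mean_x, var_x, mean_y, var_y):
    """Var(XY) for independent X and Y, from their means and variances."""
    second_moment_x = var_x + mean_x ** 2   # E[X^2]
    second_moment_y = var_y + mean_y ** 2   # E[Y^2]
    return second_moment_x * second_moment_y - (mean_x * mean_y) ** 2

print(variance_of_product(10, 2, 20, 4))  # 1208
```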

2
4. Let X and Y be two independent discrete random variables. Define random variables
U and V as

\[
U = \frac{X - E(X)}{SD(X)}, \qquad V = \frac{Y - E(Y)}{SD(Y)}
\]
Find Cov(U, V ).
Answer: 0
Solution:
Cov(U, V) = E(UV) − E(U)E(V).
Since U and V are the standardized forms of the random variables X and Y, respectively,

E(U) = E(V) = 0 and Var(U) = Var(V) = 1.

Now,
\[
\begin{aligned}
\text{Cov}(U, V) &= E(UV) \\
&= E\left[\left(\frac{X - E(X)}{SD(X)}\right) \left(\frac{Y - E(Y)}{SD(Y)}\right)\right] \\
&= \frac{1}{SD(X)\,SD(Y)}\, E[(X - E(X))(Y - E(Y))] \\
&= \frac{1}{SD(X)\,SD(Y)}\, E[XY - X E(Y) - Y E(X) + E(X)E(Y)] \\
&= \frac{1}{SD(X)\,SD(Y)} \left(E[XY] - E[X]E(Y) - E[Y]E(X) + E(X)E(Y)\right) \\
&= \frac{E[XY] - E[X]E[Y]}{SD(X)\,SD(Y)}
\end{aligned}
\]

Since X and Y are independent, E(XY) = E(X)E(Y).

Therefore, $\text{Cov}(U, V) = \dfrac{E[X]E[Y] - E[X]E[Y]}{SD(X)\,SD(Y)} = 0$.
Use the following information to answer questions (5) and (6).
Number of people (X) who make a reservation in a restaurant a day is a random
variable with mean equal to 10 and variance equal to 2.

5. Using Markov’s inequality, find a bound on the probability that on a particular day,
the number of reservations will exceed 30.
(a) $P(X > 30) \leq \dfrac{1}{4}$
(b) $P(X > 30) \geq \dfrac{1}{3}$
(c) $P(X > 30) \leq \dfrac{10}{31}$
(d) $P(X > 30) > \dfrac{10}{31}$
Solution:

Random variable X represents the number of people who make a reservation in a restaurant. It is given that
\[
E(X) = 10 \quad (3)
\]
Since X is non-negative, Markov's inequality gives, for any c > 0,
\[
P(X \geq c) \leq \frac{\mu}{c}
\]
Therefore, $P(X > 30) = P(X \geq 31) \leq \dfrac{10}{31}$.
Therefore, the correct option is (c).
6. Find a bound on the probability that on a particular day, number of reservations made
will lie in between 6 and 14 using Chebyshev’s inequality.
(a) $P(6 < X < 14) \leq \dfrac{7}{8}$
(b) $P(6 < X < 14) \geq \dfrac{7}{8}$
(c) $P(6 < X < 14) > \dfrac{7}{8}$
(d) $P(6 < X < 14) \leq \dfrac{1}{8}$
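Solution:

Using Chebyshev's inequality, we know that
\[
P(|X - \mu| \geq k\sigma) \leq \frac{1}{k^{2}} \quad (4)
\]
\[
P(\mu - k\sigma < X < \mu + k\sigma) \geq 1 - \frac{1}{k^{2}} \quad (5)
\]
Given µ = 10 and σ² = 2.
Now, we can write P(6 < X < 14) as
\[
P(10 - k\sigma < X < 10 + k\sigma) \geq 1 - \frac{1}{k^{2}}, \quad \text{using (5)}.
\]
Now, let
\[
10 - k\sigma = 6 \quad (6)
\]
\[
10 + k\sigma = 14 \quad (7)
\]
Solving (6) and (7), we get kσ = 4
\[
\Rightarrow k = \frac{4}{\sigma} \Rightarrow k^{2} = \frac{16}{2} = 8
\]
Therefore, $P(6 < X < 14) \geq 1 - \dfrac{1}{8} = \dfrac{7}{8}$.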
Hence, the correct option is (b).
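The two bounds used in questions (5) and (6) can be computed mechanically. The following is a minimal sketch with made-up function names; it assumes a non-negative random variable for the Markov bound and a symmetric interval around the mean for the Chebyshev bound.

```python
from fractions import Fraction

def markov_upper_bound(mean, c):
    """Upper bound on P(X >= c) for a non-negative X with the given mean."""
    return Fraction(mean, c)

def chebyshev_lower_bound(variance, half_width):
    """Lower bound on P(|X - mu| < half_width): 1 - variance / half_width^2."""
    return 1 - Fraction(variance, half_width ** 2)

print(markov_upper_bound(10, 31))    # bound on P(X >= 31)
print(chebyshev_lower_bound(2, 4))   # bound on P(6 < X < 14)
```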

7. The joint probability mass function of three discrete random variables X, Y and Z is given as
\[
p(0, 1, 2) = p(0, 2, 3) = p(1, 0, -2) = \frac{1}{3}
\]
Calculate Var(XY + 2Z).
(a) $\dfrac{52}{9}$
(b) $\dfrac{32}{9}$
(c) $\dfrac{80}{3}$
(d) $\dfrac{56}{3}$
Solution:
t1 t2 t3 t1 t2 + 2t3 fXY Z (t1 , t2 , t3 )
0 1 2 4 1/3
0 2 3 6 1/3
1 0 −2 -4 1/3

Joint PMF of X, Y and Z.

XY + 2Z will take the values in {−4, 6, 4} with probability $\frac{1}{3}$ each.
\[
E(XY + 2Z) = \frac{1}{3}[-4 + 6 + 4] = \frac{6}{3} = 2
\]
\[
E[(XY + 2Z)^{2}] = \frac{1}{3}[(-4)^{2} + 6^{2} + 4^{2}] = \frac{1}{3}[16 + 36 + 16] = \frac{68}{3}
\]
Now,
\[
\text{Var}(XY + 2Z) = E[(XY + 2Z)^{2}] - [E(XY + 2Z)]^{2} = \frac{68}{3} - 2^{2} = \frac{56}{3}
\]
Hence, the correct option is (d).

8. An urn contains 5 white balls and 5 red balls. 2 balls are selected at random. Let
X denote the number of red balls drawn and let Y denote the number of white balls
drawn. Find the correlation coefficient between X and Y .
(a) ρ(X, Y ) = 1
(b) ρ(X, Y ) = −1
(c) ρ(X, Y ) = 0
(d) ρ(X, Y ) = −0.5
Solution:
Two balls are selected at random from the urn containing 5 white and 5 red balls.
Random variable X represent the number of red balls drawn.
Therefore, X will take values in {0, 1, 2}.
Random variable Y represent the number of white balls drawn.
Therefore, Y will take values in {0, 1, 2}.
Joint probability distribution of X and Y is given by

\[
\begin{array}{c|ccc}
Y \backslash X & 0 & 1 & 2 \\ \hline
0 & 0 & 0 & \frac{10}{45} \\
1 & 0 & \frac{25}{45} & 0 \\
2 & \frac{10}{45} & 0 & 0
\end{array}
\]

Joint distribution of X and Y.

Now,
     
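\[
E(X) = \left(0 \times \frac{10}{45}\right) + \left(1 \times \frac{25}{45}\right) + \left(2 \times \frac{10}{45}\right) = 1
\]

Similarly, E(Y) = 1.

\[
E(X^{2}) = \left(0^{2} \times \frac{10}{45}\right) + \left(1^{2} \times \frac{25}{45}\right) + \left(2^{2} \times \frac{10}{45}\right) = \frac{65}{45}
\]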

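Similarly, $E(Y^{2}) = \dfrac{65}{45}$.
Now, $\text{Var}(X) = \text{Var}(Y) = \dfrac{65}{45} - (1)^{2} = \dfrac{20}{45}$.
Summing xy over the three possible outcomes (0, 2), (1, 1) and (2, 0),
\[
E(XY) = \left(0 \times 2 \times \frac{10}{45}\right) + \left(1 \times 1 \times \frac{25}{45}\right) + \left(2 \times 0 \times \frac{10}{45}\right) = \frac{25}{45}
\]

Correlation coefficient between X and Y is given by
\[
\begin{aligned}
\rho(X, Y) &= \frac{\text{Cov}(X, Y)}{SD(X)\,SD(Y)} \\
&= \frac{E(XY) - E(X)E(Y)}{\sqrt{\text{Var}(X)\,\text{Var}(Y)}} \\
&= \frac{\frac{25}{45} - 1}{\sqrt{\frac{20}{45} \times \frac{20}{45}}} \\
&= -1
\end{aligned}
\]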

Therefore, the correct option is (b).
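The whole computation can be reproduced from the joint pmf. The sketch below builds the pmf from the counts (5 red, 5 white, 2 drawn) and uses exact fractions; the helper `expectation` is an illustrative name, not part of the assignment.

```python
from fractions import Fraction
from math import comb

# Joint pmf of (X, Y) = (red, white) when 2 balls are drawn from 5 red and 5 white.
total_ways = comb(10, 2)
joint = {(r, 2 - r): Fraction(comb(5, r) * comb(5, 2 - r), total_ways) for r in range(3)}

def expectation(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(g(x, y) * prob for (x, y), prob in joint.items())

mean_x = expectation(lambda x, y: x)
mean_y = expectation(lambda x, y: y)
var_x = expectation(lambda x, y: x * x) - mean_x ** 2
var_y = expectation(lambda x, y: y * y) - mean_y ** 2
cov_xy = expectation(lambda x, y: x * y) - mean_x * mean_y

# Var(X) equals Var(Y) here, so sqrt(Var(X) Var(Y)) is simply Var(X).
correlation = cov_xy / var_x
print(mean_x, var_x, cov_xy, correlation)
```

The result ρ = −1 is expected, because Y = 2 − X is an exact decreasing linear function of X.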



9. Five students each from class 8, 9 and 10 have been nominated for the formation of
the school committee. The number of boys and girls who are selected from each of the
classes is given in Table 4.1.A.

Class 8 Class 9 Class 10


Girls 2 2 3
Boys 3 3 2

Table 4.1.A: Total number of boys and girls selected.

If the committee comprises of two students from each class, find the expected number
of girls in the committee. (Enter the answer correct to 1 decimal point)
Answer: 2.8
Solution:
Let X1 represent the number of girls from class eight in the school committee.
Let X2 represent the number of girls from class nine in the school committee.
Let X3 represent the number of girls from class ten in the school committee.
We need to find E(X1 + X2 + X3 ).

7
We know that E(X1 + X2 + X3 ) = E(X1 ) + E(X2 ) + E(X3 ).

Since total number of girls selected from class eight is 2, therefore, the committee can
comprise of either 0 girl or 1 girl or 2 girls from class eight.
i.e. X1 will take values in {0, 1, 2}.
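Now
\[
P(X_1 = 0) = \frac{{}^{3}C_{2}}{{}^{5}C_{2}} = \frac{3}{10}, \qquad
P(X_1 = 1) = \frac{{}^{3}C_{1} \times {}^{2}C_{1}}{{}^{5}C_{2}} = \frac{6}{10}, \qquad
P(X_1 = 2) = \frac{{}^{2}C_{2}}{{}^{5}C_{2}} = \frac{1}{10}
\]

Therefore, $E(X_1) = \left(0 \times \dfrac{3}{10}\right) + \left(1 \times \dfrac{6}{10}\right) + \left(2 \times \dfrac{1}{10}\right) = \dfrac{8}{10}$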

Similarly, the total number of girls selected from class nine is 2, so the committee can comprise either 0, 1 or 2 girls from class nine,
i.e. $X_2$ will take values in {0, 1, 2} with the same distribution as $X_1$. Hence $E(X_2) = \dfrac{8}{10}$.

Total number of girls selected from class ten is 3 and we have to select 2 students from
each class, therefore, the committee can comprise of either 0 girl or 1 girl or 2 girls
from class ten.
i.e. X3 will take values in {0, 1, 2}.
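Now
\[
P(X_3 = 0) = \frac{{}^{2}C_{2}}{{}^{5}C_{2}} = \frac{1}{10}, \qquad
P(X_3 = 1) = \frac{{}^{3}C_{1} \times {}^{2}C_{1}}{{}^{5}C_{2}} = \frac{6}{10}, \qquad
P(X_3 = 2) = \frac{{}^{3}C_{2}}{{}^{5}C_{2}} = \frac{3}{10}
\]

Therefore, $E(X_3) = \left(0 \times \dfrac{1}{10}\right) + \left(1 \times \dfrac{6}{10}\right) + \left(2 \times \dfrac{3}{10}\right) = \dfrac{12}{10}$

Now
\[
E(X_1 + X_2 + X_3) = E(X_1) + E(X_2) + E(X_3) = \frac{8}{10} + \frac{8}{10} + \frac{12}{10} = 2.8
\]

Hence, the expected number of girls in the school committee is 2.8.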
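The three expectations follow the same pattern (choose 2 students out of 5, count the girls), so they can be computed with one helper. This is a sketch; `expected_girls` is a hypothetical name introduced for this check.

```python
from fractions import Fraction
from math import comb

def expected_girls(girls, boys, picks=2):
    """Expected number of girls when `picks` students are chosen at random from one class."""
    total = comb(girls + boys, picks)
    return sum(
        Fraction(k * comb(girls, k) * comb(boys, picks - k), total)
        for k in range(picks + 1)
    )

# Classes 8, 9 and 10 have (girls, boys) = (2, 3), (2, 3) and (3, 2).
per_class = [expected_girls(2, 3), expected_girls(2, 3), expected_girls(3, 2)]
print(per_class, float(sum(per_class)))
```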

10. A share of a company costs ₹1000 today. Suppose today's share price increases by
50% with probability 0.6 and decreases by 50% with probability 0.4. Independent of
today, suppose that tomorrow's share price increases by 20% with probability 0.2, and
decreases by 30% with probability 0.8. If you decide to buy 3 shares today, find the
expected profit (in ₹) at the end of 2 days.

(a) -120
(b) 360
(c) 120
(d) -360

Solution:
The cost price of a share of the company is ₹1000.
Let the random variable X represent the price of the share at the end of 2 days.

On the first day, the price can either go up by 50% with probability 0.6 or go down by 50% with probability 0.4.
Independent of the first day, on the second day the price can either go up by 20% with probability 0.2 or go down by 30% with probability 0.8.

If the share price increases by 50% on the first day, the price of the share becomes ₹1500.
If it then increases by 20%, the price at the end of two days is ₹(1500 + 1500 × 20/100) = ₹1800, with probability (0.6 × 0.2) = 0.12.
If it then decreases by 30%, the price at the end of two days is ₹(1500 − 1500 × 30/100) = ₹1050, with probability (0.6 × 0.8) = 0.48.

Again, if the share price decreases by 50% on the first day, the price of the share becomes ₹500.
If it then increases by 20%, the price at the end of two days is ₹(500 + 500 × 20/100) = ₹600, with probability (0.4 × 0.2) = 0.08.
If it then decreases by 30%, the price at the end of two days is ₹(500 − 500 × 30/100) = ₹350, with probability (0.4 × 0.8) = 0.32.

Therefore, X will take values in {1800, 1050, 600, 350}, where

P (X = 1800) = 0.12
P (X = 1050) = 0.48
P (X = 600) = 0.08
P (X = 350) = 0.32
Now,

E(X) =(1800 × 0.12) + (1050 × 0.48) + (600 × 0.08) + (350 × 0.32)


=880

The expected gain at the end of two days if you buy one share is ₹(880 − 1000) = −₹120.
Therefore, if you buy 3 shares of the company, the expected gain will be −₹360.

Hence, the correct option is (d).
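The four-outcome table can be generated from the two daily moves. The sketch below encodes each move as a (price multiplier, probability) pair, which is a representation chosen here for illustration.

```python
from itertools import product

initial_price = 1000
day_one = [(1.5, 0.6), (0.5, 0.4)]   # (price multiplier, probability)
day_two = [(1.2, 0.2), (0.7, 0.8)]

# E(X): sum of final price times probability over the four paths.
expected_price = sum(
    initial_price * m1 * m2 * p1 * p2
    for (m1, p1), (m2, p2) in product(day_one, day_two)
)
expected_profit = 3 * (expected_price - initial_price)
print(round(expected_price, 2), round(expected_profit, 2))
```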

11. A lottery has 500 tickets, of which only 2 carry prizes, worth ₹500 and
₹1,000; the rest are worth ₹0. If one has bought 2 tickets, what will be his/her expected
gain (in ₹)?
Answer: 6

Solution:
In the lottery, only two tickets out of 500 contain prizes, worth ₹500 and ₹1,000.
If one has bought two tickets, one can get prizes worth ₹0, ₹500, ₹1,000 or ₹1,500.
Let the random variable X represent the worth of the prizes of two tickets.
Therefore, X will take values in {0, 500, 1000, 1500}.

\[
P(X = 0) = P(\text{both the tickets are worth ₹0}) = \frac{{}^{498}C_{2}}{{}^{500}C_{2}}
\]
\[
P(X = 500) = P(\text{one ticket is worth ₹0 and the other is worth ₹500}) = \frac{{}^{498}C_{1} \cdot {}^{1}C_{1}}{{}^{500}C_{2}}
\]
\[
P(X = 1000) = P(\text{one ticket is worth ₹0 and the other is worth ₹1000}) = \frac{{}^{498}C_{1} \cdot {}^{1}C_{1}}{{}^{500}C_{2}}
\]
\[
P(X = 1500) = P(\text{one ticket is worth ₹500 and the other is worth ₹1000}) = \frac{{}^{2}C_{2}}{{}^{500}C_{2}}
\]

\[
\begin{aligned}
E(X) &= \left(0 \times \frac{{}^{498}C_{2}}{{}^{500}C_{2}}\right) + \left(500 \times \frac{{}^{498}C_{1} \cdot {}^{1}C_{1}}{{}^{500}C_{2}}\right) + \left(1000 \times \frac{{}^{498}C_{1} \cdot {}^{1}C_{1}}{{}^{500}C_{2}}\right) + \left(1500 \times \frac{{}^{2}C_{2}}{{}^{500}C_{2}}\right) \\
&= \frac{1}{{}^{500}C_{2}} \left[500 \times {}^{498}C_{1} + 1000 \times {}^{498}C_{1} + 1 \times 1500\right] \\
&= \frac{1}{{}^{500}C_{2}} \left[249000 + 498000 + 1500\right] \\
&= \frac{748500}{124750} = 6
\end{aligned}
\]
Therefore, the expected gain is ₹6.

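Since there are only 124750 ways to pick two tickets, the expectation can also be checked by listing every pair. This is a brute-force sketch; the list `prizes` is just one way to represent the 500 tickets.

```python
from fractions import Fraction
from itertools import combinations

prizes = [500, 1000] + [0] * 498   # one entry per ticket
pairs = list(combinations(prizes, 2))

# Average total prize over all equally likely pairs of tickets.
expected_gain = Fraction(sum(a + b for a, b in pairs), len(pairs))
print(len(pairs), expected_gain)
```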
Statistics for Data Science - 2
Week 3 Practice Assignment
Expectation and variance

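1. If the expected value and variance of the Binomial random variable X are $\frac{5}{2}$ and $\frac{15}{8}$,
respectively, then find the value of P(X = 10).

(a) $\left(\dfrac{3}{4}\right)^{10}$
(b) $10 \left(\dfrac{3}{4}\right)^{10}$
(c) $\left(\dfrac{1}{4}\right)^{10}$
(d) $10 \left(\dfrac{1}{4}\right)^{10}$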
Solution: If X ∼ Binomial(n, p), then expected value and variance of X is given by np
and np(1 − p), respectively.

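Given that
\[
E[X] = np = \frac{5}{2} \quad \ldots(1)
\]
and
\[
\text{Var}(X) = np(1 - p) = \frac{15}{8} \quad \ldots(2)
\]
Putting the value of np from equation (1) in equation (2), we get
\[
(1 - p) = \frac{3}{4} \Rightarrow p = \frac{1}{4}.
\]
Putting the value of p in equation (1), we get n = 10.
It implies that $X \sim \text{Binomial}\left(10, \frac{1}{4}\right)$.

Therefore,
\[
P(X = 10) = {}^{10}C_{10} \left(\frac{1}{4}\right)^{10} \left(\frac{3}{4}\right)^{0} = \left(\frac{1}{4}\right)^{10}
\]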

2. X and Y are two independent geometric random variables with parameters $\frac{1}{2}$ and $\frac{1}{4}$,
respectively. Find the value of Var(X + 2Y).
Solution:
We know that if X ∼ Geometric(p), then $\text{Var}(X) = \dfrac{1 - p}{p^{2}}$.
\[
\text{Therefore, } \text{Var}(X) = \frac{1 - \frac{1}{2}}{\frac{1}{4}} = 2 \quad \ldots(1)
\]
\[
\text{Var}(Y) = \frac{1 - \frac{1}{4}}{\frac{1}{16}} = 12 \quad \ldots(2)
\]

Now, since X and Y are independent, we have
\[
\text{Var}(X + 2Y) = \text{Var}(X) + 2^{2}\,\text{Var}(Y) = 2 + 48 = 50
\]

3. The number of spam messages (X) sent to a server in a day has a Poisson distribution
with parameter λ = 21. Each spam message independently has a probability of p = 1/3
of not being detected by the spam filter. Let Y denote the number of spam messages
detected by the filter in a day. Calculate the expected value of X + Y.

solution:
X denotes the number of spam messages sent to the server in a day and

X ∼ Poisson(21)

Y denotes the number of spam messages detected by the filter in a day.


It is given that each spam message independently has a probability of $\frac{1}{3}$ of not being
detected. It implies that
\[
Y \mid X \sim \text{Binomial}\left(X, \frac{2}{3}\right)
\]
Recall that if N ∼ Poisson(λ) and Z|N ∼ Binomial(N, p), then Z ∼ Poisson(λp).

Therefore, Y ∼ Poisson(14)

E[X] = 21 and E[Y ] = 14


⇒ E[X + Y ] = E[X] + E[Y ] = 35

4. Two random variables X and Y are jointly distributed with the joint pmf
\[
f_{XY}(x, y) = \frac{1}{9}(x + y),
\]
where x and y are integers in 0 ≤ x ≤ 2 and 0 ≤ y ≤ 1. Let Z = XY + Y². Find the
expected value of Z.
(a) $\dfrac{1}{3}$
(b) $\dfrac{4}{3}$
(c) $\dfrac{2}{3}$
(d) $\dfrac{14}{9}$
Solution:

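\[
\begin{aligned}
E[Z] &= E[XY + Y^{2}] \\
&= \sum_{0 \leq x \leq 2;\, 0 \leq y \leq 1} (xy + y^{2})\, f_{XY}(x, y) \\
&= \frac{1}{9} \sum_{0 \leq x \leq 2;\, 0 \leq y \leq 1} (xy + y^{2})(x + y) \\
&= \frac{1}{9}(1 + 4 + 9) \\
&= \frac{14}{9}
\end{aligned}
\]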
5. The distribution of a certain company's employees' monthly salary has mean ₹60000 and
standard deviation ₹20000. The probability that a randomly selected employee from that
company has a salary either greater than or equal to ₹100000 or less than or equal to
₹20000 is:
(a) at least $\dfrac{1}{4}$
(b) at most $\dfrac{1}{4}$
(c) at least $\dfrac{1}{2}$
(d) at most $\dfrac{1}{2}$
Solution:
Let X denote the employees’ monthly salary.
Given that E[X] = µ = 60000 and SD= σ = 20000.

\[
\begin{aligned}
P(X \geq 100000 \text{ or } X \leq 20000) &= P(X - 60000 \geq 40000 \text{ or } X - 60000 \leq -40000) \\
&= P(|X - 60000| \geq 40000) \\
&= P(|X - \mu| \geq 2\sigma) \\
&\leq \frac{1}{2^{2}} = \frac{1}{4} \quad \text{(by Chebyshev's inequality)}
\end{aligned}
\]

Hence, the probability that a randomly selected employee from that company has a salary
either greater than or equal to ₹100000 or less than or equal to ₹20000 is at most $\frac{1}{4}$.

6. Two random variables X and Y are jointly distributed with the joint pmf
\[
f_{XY}(x, y) = \frac{1}{27}(xy + x + y + 1),
\]
where x and y are integers in 0 ≤ x ≤ 1 and 1 ≤ y ≤ 3. Find the correlation coefficient
of X and Y .
Solution:

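\[
\begin{aligned}
E[X] &= \sum_{x \in T_X,\, y \in T_Y} x\, f_{XY}(x, y) \\
&= \frac{1}{27} \sum_{x \in T_X,\, y \in T_Y} x(xy + x + y + 1) \\
&= \frac{1}{27}(4 + 6 + 8) \\
&= \frac{18}{27} = \frac{2}{3}
\end{aligned}
\]

\[
\begin{aligned}
E[Y] &= \sum_{x \in T_X,\, y \in T_Y} y\, f_{XY}(x, y) \\
&= \frac{1}{27} \sum_{x \in T_X,\, y \in T_Y} y(xy + x + y + 1) \\
&= \frac{1}{27}(2 + 6 + 12 + 4 + 12 + 24) \\
&= \frac{60}{27} = \frac{20}{9}
\end{aligned}
\]

\[
\begin{aligned}
E[XY] &= \sum_{x \in T_X,\, y \in T_Y} xy\, f_{XY}(x, y) \\
&= \frac{1}{27} \sum_{x \in T_X,\, y \in T_Y} xy(xy + x + y + 1) \\
&= \frac{1}{27}(4 + 12 + 24) \\
&= \frac{40}{27}
\end{aligned}
\]

\[
\text{Cov}(X, Y) = E[XY] - E[X]E[Y] = \frac{40}{27} - \frac{2}{3} \cdot \frac{20}{9} = 0
\]

We know that
\[
\text{Correlation coefficient} = \frac{\text{Cov}(X, Y)}{\sqrt{\text{Var}(X)\,\text{Var}(Y)}} = 0
\]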

7. Let X and Y be two independent random variables such that X ∼ Binomial(4, 1/2) and
Y ∼ Uniform({1, 2, 3}). Find the value of Cov(2X + Y, X + Y²X).

(a) 16.67
(b) 6.67
(c) 13.37
(d) 0

Solution:

\[
\begin{aligned}
\text{Cov}(2X + Y, X + Y^{2}X) &= \text{Cov}(2X, X + Y^{2}X) + \text{Cov}(Y, X + Y^{2}X) \\
&= \text{Cov}(2X, X) + \text{Cov}(2X, Y^{2}X) + \text{Cov}(Y, X) + \text{Cov}(Y, Y^{2}X) \\
&= 2\text{Cov}(X, X) + 2\text{Cov}(X, Y^{2}X) + \text{Cov}(Y, X) + \text{Cov}(Y, Y^{2}X) \\
&= 2\text{Var}(X) + 2(E[X^{2}Y^{2}] - E[X]E[Y^{2}X]) + (E[XY] - E[X]E[Y]) \\
&\quad + (E[XY^{3}] - E[Y]E[Y^{2}X])
\end{aligned}
\]

Since X and Y are independent random variables, the pairs (X², Y²), (X, Y²), (X, Y³) are also
independent. It implies that
\[
E[X^{2}Y^{2}] = E[X^{2}]E[Y^{2}], \qquad E[Y^{2}X] = E[Y^{2}]E[X], \qquad E[XY^{3}] = E[X]E[Y^{3}]
\]

Therefore, using E[XY] − E[X]E[Y] = 0,
\[
\begin{aligned}
\text{Cov}(2X + Y, X + Y^{2}X) &= 2\text{Var}(X) + 2(E[X^{2}]E[Y^{2}] - E[X]^{2}E[Y^{2}]) + (E[XY] - E[X]E[Y]) \\
&\quad + (E[X]E[Y^{3}] - E[Y]E[Y^{2}]E[X]) \\
&= 2\text{Var}(X) + 2(E[X^{2}]E[Y^{2}] - E[X]^{2}E[Y^{2}]) + E[X]E[Y^{3}] - E[Y]E[Y^{2}]E[X]
\end{aligned}
\]

Now, X ∼ Binomial(4, 1/2).
Therefore, E[X] = np = 2
Var(X) = np(1 − p) = 1
E[X²] = Var(X) + (E[X])² = np(1 − p) + (np)² = 1 + 4 = 5

And Y ∼ Uniform({1, 2, 3})

\[
E[Y] = \tfrac{1}{3}(1 + 2 + 3) = 2, \qquad E[Y^{2}] = \tfrac{1}{3}(1 + 4 + 9) = \tfrac{14}{3}, \qquad E[Y^{3}] = \tfrac{1}{3}(1 + 8 + 27) = 12
\]

Therefore,
\[
\begin{aligned}
\text{Cov}(2X + Y, X + Y^{2}X) &= 2(1) + 2\left(\frac{70}{3} - \frac{56}{3}\right) + 24 - \frac{56}{3} \\
&= 26 - \frac{28}{3} \\
&= 16.67
\end{aligned}
\]

8. The joint distribution of two random variables X and Y is given as:

X
0 1 2
Y
2 5
-1 0
17 17
1 2
0 0
17 17
3 4
1 0
17 17

Table 3.1.P: Joint distribution of X and Y .

Find the standard deviation of the product of the two random variables. (Write your
answer correct up to two decimal points.)
Solution:
To find: SD(XY )

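\[
\begin{aligned}
E[XY] &= \sum_{x \in T_X,\, y \in T_Y} xy\, f_{XY}(x, y) \\
&= -1\left(\frac{2}{17}\right) - 2\left(\frac{5}{17}\right) + 2\left(\frac{4}{17}\right) \\
&= \frac{-4}{17}
\end{aligned}
\]

\[
\begin{aligned}
E[(XY)^{2}] &= \sum_{x \in T_X,\, y \in T_Y} x^{2}y^{2}\, f_{XY}(x, y) \\
&= 1\left(\frac{2}{17}\right) + 4\left(\frac{5}{17}\right) + 4\left(\frac{4}{17}\right) \\
&= \frac{38}{17}
\end{aligned}
\]

\[
\text{Var}(XY) = E[(XY)^{2}] - [E[XY]]^{2} = \frac{38}{17} - \frac{16}{289} = \frac{630}{289}
\]
Therefore,
\[
SD(XY) = \sqrt{\text{Var}(XY)} = \sqrt{\frac{630}{289}} = 1.47
\]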

9. An ice-cream seller sells ice creams at three prices: ₹30, ₹40, and ₹50. A random customer
will buy an ice cream of ₹30, ₹40 and ₹50 with probabilities of 0.5, 0.3, and
0.2, respectively. If the number of customers in a day follows Poisson distribution with
λ = 60, what is the expected sales (in ₹) of the seller in a day?
Solution:
Let X denote the number of customers coming to the ice-cream seller in a day, then

X ∼ Poisson(60)

Let Y denote the price at which the customer buys the ice-cream, then
E[Y ] = 30(0.5) + 40(0.3) + 50(0.2) = 37

If X = x customers come to the shop, then the expected sale will be xE[Y].

Since X ∼ Poisson(60), E[X] = 60, i.e. on average 60 customers come to the ice-cream seller in
a day. Averaging xE[Y] over X, the expected sale of the day will be

E[X]E[Y] = 60(37) = 2220

10. An urn contains 10 balls numbered from 1 to 10. We remove six balls randomly and add
up their numbers. Let X denote the sum of the numbers of the removed balls. Find the
expected value of X.
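(Hint: Suppose $X_i$ denotes the number on the i-th removed ball; then $X = \sum_{i=1}^{6} X_i$.)

Solution:
Let $X_i$, i = 1, 2, ..., 6, denote the number on the i-th removed ball. Then
\[
X = \sum_{i=1}^{6} X_i
\;\Rightarrow\; E[X] = E\left[\sum_{i=1}^{6} X_i\right]
\;\Rightarrow\; E[X] = \sum_{i=1}^{6} E(X_i)
\;\Rightarrow\; E[X] = 6E(X_i) \quad \ldots(1)
\]
The last step uses the fact that every $X_i$ has the same marginal distribution; linearity of expectation holds even though the $X_i$ are not independent.
Now, $E[X_i] = \dfrac{1}{10}[1 + 2 + 3 + \ldots + 10] = \dfrac{11}{2}$
Putting the value in equation (1), we get
\[
E[X] = 6 \times \frac{11}{2} = 33
\]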

Page 8
Statistics for Data Science - 2

Week 4 Graded Assignment solution


Continuous random variable

1. The CDF of a random variable X is


\[
F_X(x) = \begin{cases} 1 - e^{-3x} & x \geq 0 \\ 0 & \text{otherwise} \end{cases}
\]

i) Find P (X > 4).


a) $e^{-3} - e^{-4}$
b) $e^{-12}$
c) $e^{-7}$
d) $e^{-3} e^{-4}$
Solution:

\[
P(X > 4) = 1 - P(X \leq 4) = 1 - F_X(4) = 1 - (1 - e^{-3 \times 4}) = e^{-12}
\]

ii) Find the value of P (−5 < X ≤ 6).

a) $1 - e^{-18}$
b) $e^{-5} - e^{-18}$
c) $e^{-18}$
d) $e^{-9}$
Solution:

\[
P(-5 < X \leq 6) = F_X(6) - F_X(-5) = (1 - e^{-3 \times 6}) - 0 = 1 - e^{-18}
\]

2. Let X be a continuous random variable with the following PDF:


\[
f_X(x) = \begin{cases} k e^{-x} & x \geq 0 \\ 0 & \text{otherwise} \end{cases}
\]

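i) Find the value of k.
Solution:
We know that for the PDF of a random variable
\[
\int_{-\infty}^{\infty} f_X(x)\,dx = 1
\;\Rightarrow\; \int_{0}^{\infty} k e^{-x}\,dx = 1
\;\Rightarrow\; k \left[\frac{e^{-x}}{-1}\right]_{0}^{\infty} = 1
\]
\[
\Rightarrow k(0 + 1) = 1 \quad (\text{as } x \to \infty,\; e^{-x} \to 0)
\]
\[
\Rightarrow k = 1
\]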
ii) Find P (3 < X < 4).

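a) $e^{-1}$
b) $e^{-3} e^{-4}$
c) $e^{-3} - e^{-4}$
d) $e^{-4} - e^{-3}$
Hint: Use $\int_{a}^{b} e^{-x}\,dx = e^{-a} - e^{-b}$
Solution:

\[
\begin{aligned}
P(3 < X < 4) &= \int_{3}^{4} k e^{-x}\,dx \\
&= 1 \times \left[\frac{e^{-x}}{-1}\right]_{3}^{4} \\
&= \frac{e^{-4}}{-1} - \frac{e^{-3}}{-1} \\
&= e^{-3} - e^{-4}
\end{aligned}
\]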

3. Let X be a continuous random variable with PDF


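\[
f_X(x) = \begin{cases} 5x^{4} & 0 < x \leq 1 \\ 0 & \text{otherwise} \end{cases}
\]
Find $P\left(X \leq \frac{3}{4} \mid X > \frac{1}{4}\right)$.
a) $\dfrac{3}{16}$
b) $\dfrac{17}{86}$
c) $\dfrac{22}{93}$
d) $\dfrac{9}{22}$
Hint: Use $\int_{a}^{b} 5x^{4}\,dx = b^{5} - a^{5}$
Solution:

\[
\begin{aligned}
P\left(X \leq \tfrac{3}{4} \mid X > \tfrac{1}{4}\right) &= \frac{P\left(X \leq \frac{3}{4} \text{ and } X > \frac{1}{4}\right)}{P\left(X > \frac{1}{4}\right)} \\
&= \frac{\int_{1/4}^{3/4} 5x^{4}\,dx}{\int_{1/4}^{1} 5x^{4}\,dx} \\
&= \frac{\left[\frac{5x^{5}}{5}\right]_{1/4}^{3/4}}{\left[\frac{5x^{5}}{5}\right]_{1/4}^{1}} \\
&= \frac{\left[x^{5}\right]_{1/4}^{3/4}}{\left[x^{5}\right]_{1/4}^{1}} \\
&= \frac{\left(\frac{3}{4}\right)^{5} - \left(\frac{1}{4}\right)^{5}}{1 - \left(\frac{1}{4}\right)^{5}} \\
&= \frac{22}{93}
\end{aligned}
\]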
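The final simplification is easy to get wrong by hand, so here is a small check using the CDF F(x) = x⁵ on (0, 1], which follows from the hint. The function `cdf` is written for this example only.

```python
from fractions import Fraction

def cdf(x):
    """CDF of the density 5x^4 on (0, 1]: F(x) = x^5 on that interval."""
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    return Fraction(x) ** 5

lower, upper = Fraction(1, 4), Fraction(3, 4)

# P(X <= 3/4 | X > 1/4) = (F(3/4) - F(1/4)) / (1 - F(1/4))
conditional = (cdf(upper) - cdf(lower)) / (1 - cdf(lower))
print(conditional)
```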

4. The lifespan (in hours) of an electronic component used in an electric car has the density
function
\[
f_X(x) = \begin{cases} \dfrac{1}{500} e^{-x/500} & x \geq 0 \\ 0 & \text{otherwise} \end{cases}
\]
Determine the probability that the component lasts more than 200 hours before it needs
to be replaced.

a) $e^{-0.4}$
b) $e^{200}$
c) 0.5
d) $e^{-2.5}$
Solution:
Let X denote the lifespan (in hours) of the electronic component. We have to find the
probability that the component lasts more than 200 hours before it needs to be replaced
i.e.
\[
P(X > 200) = 1 - P(X \leq 200)
\]
Also, we can relate the given density with the exponential distribution with $\lambda = \frac{1}{500}$.

\[
\begin{aligned}
\Rightarrow P(X > 200) &= 1 - P(X \leq 200) \\
&= 1 - F_X(200) \\
&= 1 - (1 - e^{-200/500}) \\
&= e^{-0.4}
\end{aligned}
\]

5. A firm produces machines with a lifespan, whose distribution has a mean of 200 months
and standard deviation of 50 months. The firm wishes to introduce a warranty scheme
in which it would like to replace all the dysfunctional machines with new ones within
warranty period. But they do not wish to do so for more than 11.9% of the machines
they produce. If the lifespan of the machine is assumed to follow a normal distribution,
how long a guarantee period should be offered? (Answer is expected in months)
Hint: Use P (Z < −1.18) = 0.119, where Z represents the standard normal distribution.
Solution:
Let X denote the lifespan of the machines in months. Given that µ = 200 and σ = 50.
The firm did not wish to replace more than 11.9% of the machines they produce.
If m be the guarantee period (in months), then
$$P(X \le m) = 0.119 \;\Rightarrow\; P\left(\frac{X - 200}{50} \le \frac{m - 200}{50}\right) = 0.119$$
Comparing this equation with the given value of the standard normal distribution, we get

$$\frac{m - 200}{50} = -1.18 \;\Rightarrow\; m = 141$$
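
As a hedged cross-check, the same warranty period can be obtained numerically with `statistics.NormalDist` from the Python standard library (Python 3.8+). The variable names are illustrative only.

```python
from statistics import NormalDist

lifespan = NormalDist(mu=200, sigma=50)  # lifespan in months
# Warranty period m such that P(X <= m) = 0.119
m = lifespan.inv_cdf(0.119)
print(round(m, 1))  # close to 141 months
```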

6. Let X be a continuous random variable with the following PDF:


(
3(1 − x)2 0 < x < 1
fX (x) =
0 otherwise

Define Y = (1 − X)3 . Find the PDF of the random variable Y .

a) (
1 0<y<1
fY (y) =
0 otherwise
b) (
(1 − y)3 0<y<1
fY (y) =
0 otherwise
c) (
y3 0<y<1
fY (y) =
0 otherwise
d) (
3y 2/3 0<y<1
fY (y) =
0 otherwise
Hint:
Apply the monotonic, differentiable function theorem and $\frac{d}{dx}(1 - x)^3 = -3(1 - x)^2$
Solution:
We know that in the range (0, 1), $(1 - x)^3$ is monotonic (a decreasing function).
Therefore, we can use the formula $f_Y(y) = \frac{1}{|g'(g^{-1}(y))|} f_X(g^{-1}(y))$
Given $Y = (1 - X)^3 = g(X)$ (let)
$\Rightarrow y^{1/3} = 1 - x \Rightarrow x = 1 - y^{1/3} = g^{-1}(y)$
Therefore $g^{-1}(y) = 1 - y^{1/3}$
$g(x) = (1 - x)^3 \Rightarrow g'(x) = -3(1 - x)^2$
And
$g'(g^{-1}(y)) = g'(1 - y^{1/3}) = -3(1 - (1 - y^{1/3}))^2 = -3y^{2/3}$
$|g'(g^{-1}(y))| = 3y^{2/3}$, since $y^{2/3}$ is positive in the range (0, 1).
$f_X(g^{-1}(y)) = f_X(1 - y^{1/3}) = 3(1 - (1 - y^{1/3}))^2 = 3y^{2/3}$
Therefore, $f_Y(y) = \frac{3y^{2/3}}{3y^{2/3}} = 1$
Therefore

$$f_Y(y) = \begin{cases} 1 & 0 < y < 1 \\ 0 & \text{otherwise} \end{cases}$$

7. Let X be a continuous random variable with the following PDF:


(
x2 /81 −6 < x < 3
fX (x) =
0 otherwise

Define $Y = \frac{1}{3}(12 - X)$. Find the PDF of the random variable Y.

a) (
(12 − 3y)2 /27 −6 < y < 3
fY (y) =
0 otherwise

b) (
(12 − 3y)2 /27 3 < y < 6
fY (y) =
0 otherwise

c) (
(12 − 3y)/27 −6 < y < 3
fY (y) =
0 otherwise

d) (
(12 − 3y)/27 3 < y < 6
fY (y) =
0 otherwise

Solution:
We know that in the range (−6, 3), $\frac{1}{3}(12 - x)$ is monotonic (a decreasing function).
Therefore, we can use the formula $f_Y(y) = \frac{1}{|g'(g^{-1}(y))|} f_X(g^{-1}(y))$
Given $Y = \frac{1}{3}(12 - X) = g(X)$ (let)
$\Rightarrow 3y = 12 - x \Rightarrow x = 12 - 3y = g^{-1}(y)$
Therefore $g^{-1}(y) = 12 - 3y$
$g(x) = \frac{1}{3}(12 - x) \Rightarrow g'(x) = -\frac{1}{3}$
And
$g'(g^{-1}(y)) = g'(12 - 3y) = -\frac{1}{3}$
$|g'(g^{-1}(y))| = \frac{1}{3}$
$f_X(g^{-1}(y)) = f_X(12 - 3y) = \frac{(12 - 3y)^2}{81}$
Therefore, $f_Y(y) = \frac{(12 - 3y)^2 / 81}{1/3} = \frac{(12 - 3y)^2}{27}$
When x = −6, y = 6 and when x = 3, y = 3.
Therefore

$$f_Y(y) = \begin{cases} \frac{(12 - 3y)^2}{27} & 3 < y < 6 \\ 0 & \text{otherwise} \end{cases}$$

8. Let X be a continuous random variable with the following PDF:
(
x3 (6x2 + 5x − 4) 0 < x ≤ 1
fX (x) =
0 otherwise

Find the value of E[X].


523
a)
210
23
b)
210
173
c)
210
187
d)
210
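Hint: Use $\int_a^b x^n\,dx = \frac{1}{n+1}(b^{n+1} - a^{n+1})$
Solution:

$$\begin{aligned}
E[X] &= \int_{-\infty}^{\infty} x f_X(x)\,dx \\
&= \int_0^1 x \times x^3(6x^2 + 5x - 4)\,dx \\
&= \int_0^1 (6x^6 + 5x^5 - 4x^4)\,dx \\
&= \left[\frac{6x^7}{7}\right]_0^1 + \left[\frac{5x^6}{6}\right]_0^1 - \left[\frac{4x^5}{5}\right]_0^1 \\
&= \frac{6}{7} + \frac{5}{6} - \frac{4}{5} \\
&= \frac{187}{210}
\end{aligned}$$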

9. Let X be a continuous random variable with the following PDF:



x
 0≤x≤1
fX (x) = 2 − x 1 < x ≤ 2

0 otherwise

Define Y = 6X + 5. Find the variance of Y.


6

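Hint: Use $\int_a^b x^n\,dx = \frac{1}{n+1}(b^{n+1} - a^{n+1})$
Also, $\int_a^b x^n\,dx = \int_a^c x^n\,dx + \int_c^b x^n\,dx$ where $a < c < b$.
Solution:
Var(Y) = Var(6X + 5) = 36 Var(X)
And Var(X) = $E[X^2] - (E[X])^2$

$$\begin{aligned}
E[X] &= \int_{-\infty}^{\infty} x f_X(x)\,dx = \int_0^2 x f_X(x)\,dx \\
&= \int_0^1 x \cdot x\,dx + \int_1^2 x(2 - x)\,dx \\
&= \left[\frac{x^3}{3}\right]_0^1 + \left[\frac{2x^2}{2}\right]_1^2 - \left[\frac{x^3}{3}\right]_1^2 \\
&= \frac{1}{3} + (2^2 - 1^2) - \frac{2^3 - 1^3}{3} \\
&= \frac{1}{3} + 3 - \frac{7}{3} \\
&= 1
\end{aligned}$$

$$\begin{aligned}
E[X^2] &= \int_{-\infty}^{\infty} x^2 f_X(x)\,dx = \int_0^2 x^2 f_X(x)\,dx \\
&= \int_0^1 x^2 \cdot x\,dx + \int_1^2 x^2(2 - x)\,dx \\
&= \left[\frac{x^4}{4}\right]_0^1 + \left[\frac{2x^3}{3}\right]_1^2 - \left[\frac{x^4}{4}\right]_1^2 \\
&= \frac{1}{4} + \frac{2}{3}(2^3 - 1^3) - \frac{1}{4}(2^4 - 1^4) \\
&= \frac{1}{4} + \frac{14}{3} - \frac{15}{4} \\
&= \frac{7}{6}
\end{aligned}$$

Therefore,
$\mathrm{Var}(X) = \frac{7}{6} - 1 = \frac{1}{6}$
$\Rightarrow \mathrm{Var}(Y) = 36 \times \frac{1}{6} = 6$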
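
The two moments above can be recomputed exactly with a short Python sketch. The function `moment` is a hypothetical helper written for this check; it hard-codes the two polynomial pieces of the triangular PDF.

```python
from fractions import Fraction

def moment(k):
    """E[X^k] for the PDF equal to x on [0, 1] and 2 - x on (1, 2]."""
    # Integral of x^(k+1) over [0, 1]
    left = Fraction(1, k + 2)
    # Integral of 2x^k - x^(k+1) over [1, 2]
    right = (Fraction(2 * (2 ** (k + 1) - 1), k + 1)
             - Fraction(2 ** (k + 2) - 1, k + 2))
    return left + right

var_x = moment(2) - moment(1) ** 2
var_y = 36 * var_x  # Var(6X + 5) = 6^2 * Var(X)
print(var_x, var_y)  # 1/6 6
```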

Statistics for Data Science - 2

Week 4 Practice Assignment Solution

1. The probability density function of a continuous random variable X is shown in Figure 4.1.P.

Figure 4.1.P: Probability Density Function graph of X

The PDF is defined as follows:


$$f_X(x) = \begin{cases} e^{-x} & x \ge 0 \\ 0 & x < 0 \end{cases}$$
Find $P(-\epsilon < X < 0)$, where $\epsilon$ is a very small positive number.

(a) e
(b) 0
(c) e−
(d) e−2

Answer: b
Solution:
We know that $P(-\epsilon < X < 0) = \int_{-\epsilon}^{0} f_X(x)\,dx$
But the value of $f_X(x)$ is zero in the range $-\epsilon$ to zero.
Therefore, $P(-\epsilon < X < 0) = 0$.
Therefore, option b is the correct option.

2. Which of the following statements is/are true for a continuous random variable with
PDF fX (x)?

(a) If $f_X(2) = 2f_X(1)$, then $P(2 - \epsilon < X < 2 + \epsilon) = 2P(1 - \epsilon < X < 1 + \epsilon)$ for a small $\epsilon$.
(b) If $f_X(2) = 2f_X(1)$, then $P(2 - \epsilon < X < 2 + \epsilon) \approx 2P(1 - \epsilon < X < 1 + \epsilon)$ for a small $\epsilon$.
(c) P (X = x0 ) = 0 for any value of x0 .
(d) CDF FX (x) is continuous in the domain [−∞, ∞].

Answer: b, c, and d

Solution:
Option a: We know that for small $\epsilon$, $P(x - \epsilon < X < x + \epsilon) \propto f_X(x)$.
Therefore, $P(1 - \epsilon < X < 1 + \epsilon) \propto f_X(1)$ and $P(2 - \epsilon < X < 2 + \epsilon) \propto f_X(2)$.
But $P(x - \epsilon < X < x + \epsilon)$ is not an exact linear function of $f_X(x)$.
Therefore, when $f_X(2) = 2f_X(1)$, the equality $P(2 - \epsilon < X < 2 + \epsilon) = 2P(1 - \epsilon < X < 1 + \epsilon)$ need not hold exactly, but $P(2 - \epsilon < X < 2 + \epsilon) \approx 2P(1 - \epsilon < X < 1 + \epsilon)$.
Hence option a is wrong but option b is correct.
Option c: The probability at a single point, $P(X = x)$, is zero for a continuous random variable, as there is no jump in the CDF at any value of x. Hence option c is correct.
Option d: For a continuous random variable, the CDF is always continuous.

3. If
$$f_X(x) = \begin{cases} \frac{1}{18}(x^2 - 8x + 16) & 1 \le x \le 7 \\ 0 & \text{otherwise} \end{cases}$$
what is the value of $P(X \le 4)$? Enter the answer correct to one decimal accuracy.
($\int x^a\,dx = \frac{x^{a+1}}{a+1}$)
Answer: 0.5

Solution:

$$\begin{aligned}
P(X \le 4) &= \int_{-\infty}^{4} f_X(x)\,dx \\
&= \int_1^4 f_X(x)\,dx, \quad \text{since } f_X(x) = 0 \text{ for } x < 1 \\
&= \int_1^4 \frac{1}{18}(x^2 - 8x + 16)\,dx \\
&= \frac{1}{18}\left[\frac{x^3}{3} - \frac{8x^2}{2} + 16x\right]_1^4 \\
&= \frac{1}{18}\left(\frac{4^3}{3} - 4 \cdot 4^2 + 16 \cdot 4\right) - \frac{1}{18}\left(\frac{1^3}{3} - 4 \cdot 1^2 + 16 \cdot 1\right) \\
&= 0.5
\end{aligned}$$

4. If X ∼ Normal(10, 25), what is the value of E[2X 2 ]?
Answer: 250

Solutions:
Given E[X]=10, Var(X)=25
We know that Var(X)= E[X 2 ] − E[X]2
⇒ E[X 2 ] = Var(X) + E[X]2
⇒ E[X 2 ] = 25 + 102 = 125
We know that E[cX] = cE[X], where c is a constant.
⇒ E[2X 2 ] = 2E[X 2 ]
⇒ E[2X 2 ] = 2 × 125 = 250

5. If X ∼ Normal(10, 4), then what is the value of P (X ≥ 8|X ≤ 9)? Use the standard
normal distribution tables if necessary. Enter the answer up to two decimals accuracy.
Use the following CDF values of standard normal distribution.
FZ (−2) = 0.02275, FZ (−1.5) = 0.06681, FZ (−1) = 0.15866, FZ (−0.5) = 0.30854, FZ (0) =
0.5, FZ (0.5) = 0.69146, and FZ (1) = 0.84134
Answer: 0.485 accepted range 0.48 to 0.49

Solution:
Given µ = 10, σ 2 = 4 ⇒ σ = 2
We need to find P (X ≥ 8|X ≤ 9).
$$P(X \ge 8 \mid X \le 9) = \frac{P(X \ge 8 \cap X \le 9)}{P(X \le 9)} = \frac{F_X(9) - F_X(8)}{F_X(9)}$$
Converting the given normal distribution to the standard normal distribution to get the values of $F_X(x)$:
For $x = 8$, $z = \frac{x - \mu}{\sigma} = \frac{8 - 10}{2} = -1 \Rightarrow F_X(8) = F_Z(-1)$
For $x = 9$, $z = \frac{x - \mu}{\sigma} = \frac{9 - 10}{2} = -0.5 \Rightarrow F_X(9) = F_Z(-0.5)$
$$P(X \ge 8 \mid X \le 9) = \frac{F_X(9) - F_X(8)}{F_X(9)} = \frac{0.30854 - 0.15866}{0.30854} \approx 0.485$$
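
The table lookup can be cross-checked with `statistics.NormalDist` (Python 3.8+). This is a sketch for verification only, not part of the expected answer.

```python
from statistics import NormalDist

x = NormalDist(mu=10, sigma=2)  # variance 4, so sigma = 2
# P(X >= 8 | X <= 9) = (F(9) - F(8)) / F(9)
prob = (x.cdf(9) - x.cdf(8)) / x.cdf(9)
print(round(prob, 3))  # falls inside the accepted range 0.48 to 0.49
```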

6. A random variable X has the following PDF


(
2x 0 ≤ x ≤ 1
fX (x) =
0 otherwise

Define Y = eX . What is the PDF fY (y) of Y ?


 2 log(y) 1≤y≤e
(a) fY (y) = y
0 otherwise


 log(y)
1≤y≤e
(b) fY (y) = 2ey
0 otherwise

 log(y) 1≤y≤e
(c) fY (y) = y
0 otherwise


 log(y)
1≤y≤e
(d) fY (y) = ey
0 otherwise

 log(y) 1≤y≤e
(e) fY (y) = 2y
0 otherwise

Answer: a

Solution:
Given $Y = g(X) = e^X$
$\Rightarrow \log y = x = g^{-1}(y)$
Therefore $g^{-1}(y) = \log(y)$
$g(x) = e^x \Rightarrow g'(x) = e^x$, since $\frac{d(e^x)}{dx} = e^x$
We know that in the range 0 to 1, $e^x$ is monotonic (an increasing function).
Therefore, we can use the formula $f_Y(y) = \frac{1}{|g'(g^{-1}(y))|} f_X(g^{-1}(y))$
$g'(g^{-1}(y)) = g'(\log y) = e^{\log y} = y$
$|g'(g^{-1}(y))| = y$, since y is positive in the range [1, e]
$f_X(g^{-1}(y)) = f_X(\log y) = 2\log y$
Therefore, $f_Y(y) = \frac{1}{y} \times 2\log y = \frac{2\log y}{y}$ for $1 \le y \le e$.
Hence option a is correct.

Use the following information to answer the questions 7 and 8.

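The CDF of random variable X is given below:

$$F_X(x) = \begin{cases} 0 & x \le 0 \\ 2x^2 & 0 \le x \le \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \le x \le 1 \\ \frac{x}{2} & 1 \le x \le 2 \\ 1 & x \ge 2 \end{cases}$$

Use the following derivative formula:

$$\frac{d(x^a)}{dx} = a x^{a-1}$$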
7. Which of the following statements is/are correct?

(a) X is a continuous random variable.


(b) X is a discrete random variable.
(c) The PDF of X is not defined as X is discrete random variable.


 0 x≤0
1

4x 0 ≤ x ≤ 2



1
(d) The PDF of random variable X is fX (x) = 0 2
≤x≤1
 x

 1≤x≤2
2



0 x>2


 0 x<0

2 1
2x 0 ≤ x ≤ 2



1
(e) The PDF of random variable X is fX (x) = 0 2
<x<1
 x
1≤x≤2




 4
0 x>2

Answer: a, d
Solution:
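We know that $f_X(x) = \frac{d(F_X(x))}{dx}$
Given

$$F_X(x) = \begin{cases} 0 & x \le 0 \\ 2x^2 & 0 \le x \le \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \le x \le 1 \\ \frac{x}{2} & 1 \le x \le 2 \\ 1 & x \ge 2 \end{cases}$$

$$\Rightarrow f_X(x) = \begin{cases} \frac{d(0)}{dx} = 0 & x \le 0 \\ \frac{d(2x^2)}{dx} = 4x & 0 \le x \le \frac{1}{2} \\ \frac{d(1/2)}{dx} = 0 & \frac{1}{2} < x \le 1 \\ \frac{d(x/2)}{dx} = \frac{1}{2} & 1 < x \le 2 \\ \frac{d(1)}{dx} = 0 & x > 2 \end{cases}$$

Therefore,

$$f_X(x) = \begin{cases} 0 & x \le 0 \\ 4x & 0 \le x \le \frac{1}{2} \\ 0 & \frac{1}{2} < x \le 1 \\ \frac{1}{2} & 1 < x \le 2 \\ 0 & x > 2 \end{cases}$$

Since $F_X(x)$ is continuous in the given domain, X is a continuous random variable.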
8. What is the value of P (X ≥ 1|X ≤ 1.5)? Enter the answer correct to two decimals
accuracy.
Answer: 0.33, accepted range 0.31 to 0.35
Solution:
$$P(X \ge 1 \mid X \le 1.5) = \frac{F_X(1.5) - F_X(1)}{F_X(1.5)} = \frac{1.5/2 - 1/2}{1.5/2} = \frac{1}{3}$$
9. The time taken by Rohith to complete a race follows the exponential distribution with an expected completion time of 10 minutes. What is the probability that Rohith takes less than 20 minutes but more than 10 minutes to complete the race? Enter the answer correct to 2 decimals accuracy. ($\int e^{-ax}\,dx = \frac{e^{-ax}}{-a}$)
Answer: 0.2325, accepted range: 0.23 to 0.235
Solution:
Given E[X] = 10 minutes.
We know that for an exponential distribution $E[X] = \frac{1}{\lambda}$
$\Rightarrow \frac{1}{\lambda} = 10$, $\lambda = 0.1$
For the exponential distribution, $F_X(x) = 1 - e^{-\lambda x}$
The probability that Rohith takes at most 10 minutes is
$F_X(10) = 1 - e^{-0.1 \times 10} = 1 - e^{-1}$
The probability that Rohith takes at most 20 minutes is
$F_X(20) = 1 - e^{-0.1 \times 20} = 1 - e^{-2}$
The probability that Rohith takes more than 10 minutes but less than 20 minutes to complete the race is $F_X(20) - F_X(10) = e^{-1} - e^{-2} \approx 0.2325$.
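
A minimal numerical check of this difference of CDF values, using only the `math` module (the names `lam` and `cdf` are illustrative):

```python
import math

lam = 1 / 10  # rate, since E[X] = 1/lambda = 10 minutes

def cdf(x):
    """Exponential CDF F(x) = 1 - exp(-lambda * x) for x >= 0."""
    return 1 - math.exp(-lam * x)

prob = cdf(20) - cdf(10)
print(round(prob, 4))  # 0.2325
```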

10. The PDFs of random variables X1, X2, X3, X4, and X5 are shown in Figure 4.2.P.
Based on the information, choose the correct option(s) from below.

Figure 4.2.P: PDF of Normal Distributions for different variables.

(a) E(X1) ≈ E(X5) < E(X2) < E(X4) < E(X3)


(b) E(X1) < E(X5) < E(X2) < E(X4) < E(X3)
(c) E(X1) < E(X5) = E(X2) < E(X4) < E(X3)
(d) Var(X1) < Var(X3) < Var(X4) < Var(X5)
(e) Var(X1) ≈ Var(X2) < Var(X3) < Var(X4) < Var(X5)

Answer: a, d, and e
Solution:
We know that in the PDF of a normal distribution, the peak occurs at the mean, E[X] = µ.
Also, the value of the PDF at the mean is inversely proportional to the standard deviation, since $f_X(\mu) = \frac{1}{\sqrt{2\pi}\,\sigma}$.
The peaks of the PDFs of X1, X2, X3, X4, and X5 occur approximately at −10, 0, 20, 10, and −10 respectively, and these locations are the means.
Therefore, E(X1) ≈ E(X5) < E(X2) < E(X4) < E(X3)
The peak values $f_X(\mu)$ for the variables X1, X2, X3, X4, and X5 are such that $f_{X1}(\mu) \approx f_{X2}(\mu) > f_{X3}(\mu) > f_{X4}(\mu) > f_{X5}(\mu)$.
Therefore, Var(X1) ≈ Var(X2) < Var(X3) < Var(X4) < Var(X5)
Hence, options a, d, and e are correct.

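11. The PDF of a continuous random variable is given as
$$f_X(x) = \begin{cases} 4x^3 & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$
What is the value of Var(X)? ($\int x^a\,dx = \frac{x^{a+1}}{a+1}$)
(a) $\frac{1}{75}$
(b) $\frac{2}{75}$
(c) $\frac{3}{75}$
(d) $\frac{4}{75}$
Answer: b
Solution:
We know that Var(X) = $E[X^2] - E[X]^2$

$$E[X] = \int x f_X(x)\,dx = \int_0^1 x \cdot 4x^3\,dx = \int_0^1 4x^4\,dx = \left[\frac{4x^5}{5}\right]_0^1 = \frac{4}{5} - 0 = \frac{4}{5}$$

$$E[X^2] = \int x^2 f_X(x)\,dx = \int_0^1 x^2 \cdot 4x^3\,dx = \int_0^1 4x^5\,dx = \left[\frac{4x^6}{6}\right]_0^1 = \frac{4}{6} - 0 = \frac{2}{3}$$

Therefore,

$$\mathrm{Var}(X) = E[X^2] - E[X]^2 = \frac{2}{3} - \left(\frac{4}{5}\right)^2 = \frac{2}{3} - \frac{16}{25} = \frac{2}{75}$$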

12. Let X ∼ Uniform(a1, b1) and Y ∼ Uniform(a2, b2). Based on this information, choose the correct option(s) from below.

(a) If b2 − a2 = b1 − a1 , then Var(X) = Var(Y ).
(b) If b2 + a2 = b1 + a1 , then Var(X) = Var(Y ).
(c) If b2 − a2 = b1 − a1 , then E(X) = E(Y ).
(d) If b2 − b1 = a1 − a2 , then E(X) = E(Y ).
Answer: a and d
Solution:
We know that the mean E(X) and variance Var(X) of a uniform random variable X ∼ Uniform(a, b) are $\frac{a + b}{2}$ and $\frac{(b - a)^2}{12}$ respectively.
Given X ∼ Uniform(a1, b1) and Y ∼ Uniform(a2, b2),
$E(X) = \frac{a_1 + b_1}{2}$, $E(Y) = \frac{a_2 + b_2}{2}$. So, for E(X) to be equal to E(Y), $a_1 + b_1 = a_2 + b_2$, or $b_2 - b_1 = a_1 - a_2$. Hence option d is correct and option c is incorrect.
Similarly, for Var(X) to be equal to Var(Y), $\frac{(b_1 - a_1)^2}{12} = \frac{(b_2 - a_2)^2}{12}$, or $b_1 - a_1 = b_2 - a_2$ (both widths are positive); hence option a is correct and option b is incorrect.
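13. The CDF of a random variable X is given as:

$$F_X(x) = \begin{cases} 0 & x < 0 \\ \frac{x}{\ln 4} & 0 \le x \le \ln 2 \\ 1 - e^{-x} & \ln 2 \le x < \infty \end{cases}$$

Derivative formulas required to solve the problem:

$$\frac{d(ax)}{dx} = a, \qquad \frac{d(e^{-ax})}{dx} = -ae^{-ax}$$

The PDF of the random variable X is:

(a) $f_X(x) = \begin{cases} 0 & x < 0 \\ \frac{1}{\ln 4} & 0 \le x < \ln 2 \\ e^{-x} & \ln 2 \le x < \infty \end{cases}$

(b) $f_X(x) = \begin{cases} 0 & x < 0 \\ 1 & 0 \le x < \ln 2 \\ e^{-x} & \ln 2 \le x < \infty \end{cases}$

(c) $f_X(x) = \begin{cases} 0 & x < 0 \\ \frac{1}{\ln 2} & 0 \le x \le \ln 2 \\ e^{-x} & \ln 2 < x < \infty \end{cases}$

(d) $f_X(x) = \begin{cases} 0 & x < 0 \\ \frac{1}{\ln 2} & 0 \le x < \ln 2 \\ e^{x} & \ln 2 \le x < \infty \end{cases}$

Answer: a
Solution:
We know that $f_X(x) = \frac{d(F_X(x))}{dx}$
Given,

$$F_X(x) = \begin{cases} 0 & x < 0 \\ \frac{x}{\ln 4} & 0 \le x \le \ln 2 \\ 1 - e^{-x} & \ln 2 \le x < \infty \end{cases}$$

Therefore,

$$f_X(x) = \begin{cases} \frac{d(0)}{dx} = 0 & x < 0 \\ \frac{d(x / \ln 4)}{dx} = \frac{1}{\ln 4} & 0 \le x \le \ln 2 \\ \frac{d(1 - e^{-x})}{dx} = e^{-x} & \ln 2 \le x < \infty \end{cases}$$

Hence option a is correct.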

Statistics for Data Science - 2
Week 5 graded Assignment
Solution

1. A person randomly chooses a battery from a store which has 40 batteries of type A and 60 batteries of type B. The battery lives of type A and type B batteries are exponentially distributed with average lives of 4 years and 6 years, respectively. If the chosen battery lasts for 5 years, what is the probability that the battery is of type A?
(a) $\frac{1}{1 + e^{5/12}}$
(b) $\frac{1}{1 + e^{-5/12}}$
(c) $\frac{e^{-4/5}}{1 + e^{-6/5}}$
(d) $\frac{e^{-6/5}}{1 + e^{-4/5}}$

Solution:
Define a random variable X as follows:
$$X = \begin{cases} 1 & \text{if the chosen battery is of type A} \\ 0 & \text{if the chosen battery is of type B} \end{cases}$$

Let Y denote the battery life of the chosen battery.

By the given information, we have
$Y \mid X = 1 \sim \mathrm{Exp}(\frac{1}{4})$ and $Y \mid X = 0 \sim \mathrm{Exp}(\frac{1}{6})$

It implies that

$f_{Y|X=1}(y) = \frac{1}{4} e^{-y/4};\ y > 0$ and $f_{Y|X=0}(y) = \frac{1}{6} e^{-y/6};\ y > 0$

Also given that

$P(X = 1) = \frac{40}{100} = \frac{2}{5}$ and $P(X = 0) = \frac{60}{100} = \frac{3}{5}$

To find: $f_{X|Y=5}(1)$. Now,

$$\begin{aligned}
f_{X|Y=5}(1) &= \frac{f_{Y|X=1}(5)\,P(X = 1)}{f_Y(5)} \\
&= \frac{f_{Y|X=1}(5)\,P(X = 1)}{f_{Y|X=1}(5)\,P(X = 1) + f_{Y|X=0}(5)\,P(X = 0)} \\
&= \frac{\frac{1}{4} e^{-5/4} \cdot \frac{2}{5}}{\frac{1}{4} e^{-5/4} \cdot \frac{2}{5} + \frac{1}{6} e^{-5/6} \cdot \frac{3}{5}} \\
&= \frac{\frac{1}{10} e^{-5/4}}{\frac{1}{10} e^{-5/4} + \frac{1}{10} e^{-5/6}} \\
&= \frac{e^{-5/4}}{e^{-5/4} + e^{-5/6}} \\
&= \frac{1}{1 + e^{5/12}}
\end{aligned}$$
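
The Bayes computation above can be sketched numerically as follows. The dictionary layout and the helper `exp_pdf` are assumptions made for this illustration; only the priors, mean lifetimes, and observed value come from the problem.

```python
import math

def exp_pdf(y, mean):
    """Density of an exponential distribution with the given mean, at y > 0."""
    return math.exp(-y / mean) / mean

prior = {"A": 40 / 100, "B": 60 / 100}
mean_life = {"A": 4, "B": 6}
y = 5  # observed lifetime in years

# Prior times likelihood for each battery type, then normalise
joint = {t: prior[t] * exp_pdf(y, mean_life[t]) for t in prior}
posterior_a = joint["A"] / sum(joint.values())
closed_form = 1 / (1 + math.exp(5 / 12))
print(round(posterior_a, 4), round(closed_form, 4))
```

Both printed values agree, which supports option (a).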

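2. Let Y = XZ + X, where X ∼ Uniform{1, 2, 3} and Z ∼ Normal(1, 4) are independent. Find the value of $f_{X|Y=2}(2)$.

(a) $\frac{3\exp(\frac{1}{8})}{3\exp(\frac{1}{8}) + 6 + 2\exp(\frac{2}{9})}$
(b) $\frac{3\exp(-\frac{1}{8})}{3\exp(-\frac{1}{8}) + 6 + 2\exp(-\frac{2}{9})}$
(c) $\frac{2\exp(-\frac{2}{9})}{3\exp(-\frac{1}{8}) + 6 + 2\exp(-\frac{2}{9})}$
(d) $\frac{6}{3\exp(-\frac{1}{32}) + 6 + 2\exp(-\frac{1}{18})}$

Solution:
Given that X ∼ Uniform{1, 2, 3} and Z ∼ Normal(1, 4) are independent.
Y = XZ + X
It implies that
$Y \mid X = 1 = Z + 1 \sim \mathrm{Normal}(2, 4)$
$Y \mid X = 2 = 2Z + 2 \sim \mathrm{Normal}(4, 16)$
$Y \mid X = 3 = 3Z + 3 \sim \mathrm{Normal}(6, 36)$

Therefore,
$f_{Y|X=1}(y) = \frac{1}{2\sqrt{2\pi}} \exp\left(\frac{-(y - 2)^2}{8}\right)$
$f_{Y|X=2}(y) = \frac{1}{4\sqrt{2\pi}} \exp\left(\frac{-(y - 4)^2}{32}\right)$
$f_{Y|X=3}(y) = \frac{1}{6\sqrt{2\pi}} \exp\left(\frac{-(y - 6)^2}{72}\right)$

To find: $f_{X|Y=2}(2)$.

$$\begin{aligned}
f_{X|Y=2}(2) &= \frac{f_{Y|X=2}(2)\,f_X(2)}{f_{Y|X=2}(2)\,f_X(2) + f_{Y|X=1}(2)\,f_X(1) + f_{Y|X=3}(2)\,f_X(3)} \\
&= \frac{\frac{1}{4\sqrt{2\pi}} \exp\left(\frac{-(2 - 4)^2}{32}\right) \cdot \frac{1}{3}}{\frac{1}{4\sqrt{2\pi}} \exp\left(\frac{-(2 - 4)^2}{32}\right) \cdot \frac{1}{3} + \frac{1}{2\sqrt{2\pi}} \exp\left(\frac{-(2 - 2)^2}{8}\right) \cdot \frac{1}{3} + \frac{1}{6\sqrt{2\pi}} \exp\left(\frac{-(2 - 6)^2}{72}\right) \cdot \frac{1}{3}} \\
&= \frac{\frac{1}{4}\exp\left(-\frac{1}{8}\right)}{\frac{1}{4}\exp\left(-\frac{1}{8}\right) + \frac{1}{2}\exp(0) + \frac{1}{6}\exp\left(-\frac{2}{9}\right)} \\
&= \frac{3\exp(-\frac{1}{8})}{3\exp(-\frac{1}{8}) + 6 + 2\exp(-\frac{2}{9})}
\end{aligned}$$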

3. The joint pdf of two continuous random variables X and Y is given by

$$f_{XY}(x, y) = \begin{cases} 4xy & 0 \le x \le 1,\ 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases}$$

Are X and Y independent?

1. Yes
2. No

Solution:
First we will calculate the marginal densities of X and Y.

For $0 \le x \le 1$
$$f_X(x) = \int_0^1 f_{XY}(x, y)\,dy = \int_0^1 4xy\,dy = \left[2xy^2\right]_0^1 = 2x$$

For $0 \le y \le 1$
$$f_Y(y) = \int_0^1 f_{XY}(x, y)\,dx = \int_0^1 4xy\,dx = \left[2x^2 y\right]_0^1 = 2y$$

Therefore,
$f_X(x) \cdot f_Y(y) = 4xy = f_{XY}(x, y)$
It implies that X and Y are independent random variables.

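4. Let (X, Y) ∼ Uniform(D), where $D = \{(x, y) : (x - k)^2 + (y - k)^2 \le r\}$. Calculate $P(X \ge Y)$.
Solution:

[Figure: the disc D together with the line y = x]

The line y = x passes through the centre (k, k) of the disc, so the region X ≥ Y is the lower half of the disc.

Therefore, for the disc of radius 1 drawn in the figure,
$$P(X \ge Y) = \frac{\text{Area of lower half circle}}{\text{Area of the circle}} = \frac{\pi(1)^2 / 2}{\pi(1)^2} = \frac{1}{2}$$
The ratio does not depend on the radius, so the answer is $\frac{1}{2}$ in general.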

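5. Let (X, Y) ∼ Uniform(D), where $D = \{(x, y) : y \le 2x,\ 0 < x < 1,\ 0 < y < 2\} \cup [1, 2] \times [0, 2]$. Find the marginal density of X.

(a) $f_X(x) = \begin{cases} \frac{2x}{3} + \frac{2}{3} & 0 \le x \le 2 \\ 0 & \text{otherwise} \end{cases}$

(b) $f_X(x) = \begin{cases} \frac{2x}{3} + \frac{1}{3} & 0 \le x \le 2 \\ 0 & \text{otherwise} \end{cases}$

(c) $f_X(x) = \begin{cases} \frac{2x}{3} & 0 \le x \le 1 \\ \frac{2}{3} & 1 \le x \le 2 \\ 0 & \text{otherwise} \end{cases}$

(d) $f_X(x) = \begin{cases} \frac{2x}{3} & 0 \le x \le 1 \\ \frac{1}{3} & 1 \le x \le 2 \\ 0 & \text{otherwise} \end{cases}$

Solution:

[Figure: the region D, bounded above by the line y = 2x for 0 < x < 1 and by y = 2 for 1 ≤ x ≤ 2]

D is the support of (X, Y).

Area of D = $\frac{1}{2} \times 1 \times 2 + 1 \times 2 = 3$
Since (X, Y) ∼ Uniform(D), it implies that
$$f_{XY}(x, y) = \frac{1}{3}, \quad (x, y) \in D$$
We know that $f_X(x) = \int f_{XY}(x, y)\,dy$

For $0 < x < 1$
$$f_X(x) = \int_0^{2x} \frac{1}{3}\,dy = \left[\frac{1}{3} y\right]_0^{2x} = \frac{2x}{3}$$
For $1 < x < 2$
$$f_X(x) = \int_0^2 \frac{1}{3}\,dy = \left[\frac{1}{3} y\right]_0^2 = \frac{2}{3}$$
Therefore, the marginal density of X is given by

$$f_X(x) = \begin{cases} \frac{2x}{3} & 0 \le x \le 1 \\ \frac{2}{3} & 1 \le x \le 2 \\ 0 & \text{otherwise} \end{cases}$$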
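6. The joint pdf of two random variables X and Y is given by
$$f_{XY}(x, y) = \begin{cases} 24xy & 0 \le x \le 1,\ 0 \le y \le 1,\ x + y \le 1 \\ 0 & \text{otherwise} \end{cases}$$

Choose the correct option(s).

(a) $P(X + Y \le \frac{1}{4}) = \frac{1}{2}$
(b) $P(X + Y \le \frac{1}{2}) = \frac{1}{16}$
(c) X and Y are independent random variables.
(d) X and Y are dependent random variables.

Solution:
Option (a)

[Figure: the triangle x + y ≤ 1 with the region x + y ≤ 1/4 shaded in orange]

The orange region denotes $X + Y \le \frac{1}{4}$.

Now,
$$\begin{aligned}
P(X + Y \le \tfrac{1}{4}) &= \int_{y=0}^{1/4} \int_{x=0}^{1/4 - y} f_{XY}(x, y)\,dx\,dy \\
&= \int_{y=0}^{1/4} \int_{x=0}^{1/4 - y} 24xy\,dx\,dy \\
&= \int_{y=0}^{1/4} \left[12x^2 y\right]_{x=0}^{1/4 - y} dy \\
&= \int_{y=0}^{1/4} 12y\left(\frac{1}{4} - y\right)^2 dy \\
&= \int_{y=0}^{1/4} \frac{12}{16}\, y(1 - 4y)^2\,dy \\
&= \frac{3}{4} \int_{y=0}^{1/4} y(1 + 16y^2 - 8y)\,dy \\
&= \frac{3}{4} \left[\frac{y^2}{2} + 4y^4 - \frac{8y^3}{3}\right]_{y=0}^{1/4} \\
&= \frac{3}{4} \left(\frac{1}{32} + \frac{1}{64} - \frac{1}{24}\right) \\
&= \frac{3}{4} \cdot \frac{1}{192} = \frac{1}{256}
\end{aligned}$$

Hence, option (a) is wrong.

Option (b)

[Figure: the triangle x + y ≤ 1 with the region x + y ≤ 0.5 shaded in orange]

The orange region denotes $X + Y \le \frac{1}{2}$. Now,
$$\begin{aligned}
P(X + Y \le \tfrac{1}{2}) &= \int_{y=0}^{1/2} \int_{x=0}^{1/2 - y} f_{XY}(x, y)\,dx\,dy \\
&= \int_{y=0}^{1/2} \int_{x=0}^{1/2 - y} 24xy\,dx\,dy \\
&= \int_{y=0}^{1/2} \left[12x^2 y\right]_{x=0}^{1/2 - y} dy \\
&= \int_{y=0}^{1/2} 12y\left(\frac{1}{2} - y\right)^2 dy \\
&= \int_{y=0}^{1/2} \frac{12}{4}\, y(1 - 2y)^2\,dy \\
&= 3 \int_{y=0}^{1/2} y(1 + 4y^2 - 4y)\,dy \\
&= 3 \left[\frac{y^2}{2} + y^4 - \frac{4y^3}{3}\right]_{y=0}^{1/2} \\
&= 3 \left(\frac{1}{8} + \frac{1}{16} - \frac{1}{6}\right) \\
&= 3 \times \frac{2}{96} = \frac{1}{16}
\end{aligned}$$

Hence, option (b) is correct.

Option (c) and (d)

[Figure: the support of (X, Y), the triangle below the line x + y = 1]

For $0 < x < 1$
$$f_X(x) = \int_{y=0}^{1 - x} f_{XY}(x, y)\,dy = \int_{y=0}^{1 - x} 24xy\,dy = \left[12xy^2\right]_{y=0}^{1 - x} = 12x(1 - x)^2$$

For $0 < y < 1$
$$f_Y(y) = \int_{x=0}^{1 - y} f_{XY}(x, y)\,dx = \int_{x=0}^{1 - y} 24xy\,dx = \left[12x^2 y\right]_{x=0}^{1 - y} = 12y(1 - y)^2$$

Therefore, $f_X(x) \cdot f_Y(y) = 144xy(1 - x)^2(1 - y)^2 \ne f_{XY}(x, y)$

Hence, X and Y are not independent.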
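
The probabilities in options (a) and (b) can be checked numerically. The sketch below assumes the inner integral has already been done by hand (it equals 12y(c − y)², as in the solution) and applies a midpoint rule to the remaining one-dimensional integral; `prob_sum_at_most` is a hypothetical helper name and is meant for 0 < c ≤ 1.

```python
def prob_sum_at_most(c, steps=100_000):
    """Midpoint-rule estimate of P(X + Y <= c) for f(x, y) = 24xy on x + y <= 1.

    The inner integral over x from 0 to c - y equals 12 * y * (c - y)**2.
    """
    h = c / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h
        total += 12 * y * (c - y) ** 2
    return total * h

p_quarter = prob_sum_at_most(0.25)
p_half = prob_sum_at_most(0.5)
print(p_quarter, 1 / 256)
print(p_half, 1 / 16)
```

The estimates agree with 1/256 and 1/16, consistent with option (a) being wrong and option (b) being correct.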

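7. The joint pdf of two random variables X and Y is given by
$$f_{XY}(x, y) = \begin{cases} 3xy(1 - x) & 0 \le x \le 1,\ 0 \le y \le 2 \\ 0 & \text{otherwise} \end{cases}$$

Calculate $P(X > \frac{1}{2} \mid Y = 1)$.

Solution:
We know that
$$P(a < X < b \mid Y = y) = \frac{\int_a^b f_{XY}(x, y)\,dx}{f_Y(y)}$$
Now,
$$f_Y(y) = \int_0^1 3xy(1 - x)\,dx = \int_0^1 (3xy - 3x^2 y)\,dx = \left[\frac{3x^2 y}{2} - x^3 y\right]_0^1 = \frac{3y}{2} - y = \frac{y}{2}$$
Therefore, $f_Y(1) = \frac{1}{2}$
Now,
$$\begin{aligned}
P(X > \tfrac{1}{2} \mid Y = 1) &= \frac{\int_{1/2}^{1} f_{XY}(x, 1)\,dx}{f_Y(1)} \\
&= 2 \int_{1/2}^{1} 3x(1 - x)\,dx \\
&= 6 \int_{1/2}^{1} (x - x^2)\,dx \\
&= 6 \left[\frac{x^2}{2} - \frac{x^3}{3}\right]_{1/2}^{1} \\
&= 6\left(\frac{1}{2} - \frac{1}{3}\right) - 6\left(\frac{1}{8} - \frac{1}{24}\right) = 1 - \frac{1}{2} = \frac{1}{2}
\end{aligned}$$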

8. The amount of milk (in litres) in a shop at the beginning of any day is a random amount X from which a random amount Y (in litres) is sold during that day. Assume that the joint density function of X and Y is given by
$$f_{XY}(x, y) = \begin{cases} \frac{1}{50} & 0 \le x \le 10,\ 0 \le y \le x \\ 0 & \text{otherwise} \end{cases}$$
Find the probability that the amount of milk left at the end of the day is less than 5 litres. Write your answer correct to two decimal points.
Solution:

[Figure: the triangular support 0 ≤ y ≤ x ≤ 10 with the lines y = x and x − y = 5; the region x − y < 5 is brown and the rest of the support is blue]

X denotes the amount of milk at the beginning of any day and Y denotes the amount of milk which is sold during that day.
Therefore, the amount of milk left at the end of the day will be denoted by X − Y.

To find: $P(X - Y < 5)$

In the diagram above, the brown region denotes X − Y < 5 and the brown + blue region denotes the support of X and Y.

Area of the support(X, Y) = $\frac{1}{2} \times 10 \times 10 = 50$.

Area of brown region = Area of support(X, Y) − area of blue region

⇒ area of brown region = $50 - \frac{1}{2} \times 5 \times 5 = \frac{75}{2}$

Therefore,
$$P(X - Y < 5) = \frac{\text{area of brown region}}{\text{area of support}} = \frac{75/2}{50} = \frac{75}{100} = 0.75$$
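9. The joint pdf of two continuous random variables X and Y is given by
$$f_{XY}(x, y) = \begin{cases} ke^{-(x + y)} & x \ge 0,\ y \ge 0 \\ 0 & \text{otherwise} \end{cases}$$

Find the value of $P(X \ge 5, Y \le 5)$.

(a) $e^{-10}$
(b) $(e^{-5} - 1)e^{-5}$
(c) $(1 - e^{-5})e^{-5}$
(d) $(e^{-5} + 1)e^{-5}$

Solution:
We know that
$$\iint_{\mathrm{Supp}(X, Y)} f_{XY}\,dx\,dy = 1$$

Therefore,
$$\begin{aligned}
& \int_{y=0}^{\infty} \int_{x=0}^{\infty} ke^{-(x + y)}\,dx\,dy = 1 \\
\Rightarrow\ & k \int_{y=0}^{\infty} e^{-y} \int_{x=0}^{\infty} e^{-x}\,dx\,dy = 1 \\
\Rightarrow\ & k \int_{y=0}^{\infty} e^{-y} \left[-e^{-x}\right]_0^{\infty} dy = 1 \\
\Rightarrow\ & k \int_{y=0}^{\infty} e^{-y}(0 + 1)\,dy = 1 \\
\Rightarrow\ & k \left[-e^{-y}\right]_0^{\infty} = 1 \\
\Rightarrow\ & k(0 + 1) = 1 \\
\Rightarrow\ & k = 1
\end{aligned}$$

To find: $P(X \ge 5, Y \le 5)$

Now,
$$\begin{aligned}
P(X \ge 5, Y \le 5) &= \int_{y=0}^{5} \int_{x=5}^{\infty} e^{-(x + y)}\,dx\,dy \\
&= \int_{y=0}^{5} e^{-y} \int_{x=5}^{\infty} e^{-x}\,dx\,dy \\
&= \int_{y=0}^{5} e^{-y} \left[-e^{-x}\right]_5^{\infty} dy \\
&= \int_{y=0}^{5} e^{-y}(0 + e^{-5})\,dy \\
&= e^{-5} \int_{y=0}^{5} e^{-y}\,dy \\
&= e^{-5} \left[-e^{-y}\right]_0^5 \\
&= e^{-5}(-e^{-5} + 1) \\
&= e^{-5}(1 - e^{-5})
\end{aligned}$$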

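10. The joint pdf of two random variables X and Y is given by

$$f_{XY}(x, y) = \begin{cases} \frac{1}{8}(x + y) & 0 \le x \le 2,\ 0 \le y \le 2 \\ 0 & \text{otherwise} \end{cases}$$

Find the value of $P(\frac{1}{2} \le Y \le 1 \mid X = \frac{1}{2})$. Write your answer correct to two decimal points.
Solution:
We know that
$$P(a < Y < b \mid X = x) = \frac{\int_a^b f_{XY}(x, y)\,dy}{f_X(x)}$$
Now,
$$f_X(x) = \int_0^2 \frac{1}{8}(x + y)\,dy = \frac{1}{8}\left[xy + \frac{y^2}{2}\right]_0^2 = \frac{2x + 2}{8} = \frac{x + 1}{4}$$

Therefore, $f_X(\frac{1}{2}) = \frac{3}{8}$

Now,
$$\begin{aligned}
P(\tfrac{1}{2} \le Y \le 1 \mid X = \tfrac{1}{2}) &= \frac{\int_{1/2}^{1} f_{XY}(\tfrac{1}{2}, y)\,dy}{f_X(\tfrac{1}{2})} \\
&= \int_{1/2}^{1} \frac{8}{3} \cdot \frac{1}{8}\left(\frac{1}{2} + y\right) dy \\
&= \int_{1/2}^{1} \frac{1}{3}\left(\frac{1}{2} + y\right) dy \\
&= \left[\frac{y}{6} + \frac{y^2}{6}\right]_{1/2}^{1} \\
&= \left(\frac{1}{6} + \frac{1}{6}\right) - \left(\frac{1}{12} + \frac{1}{24}\right) \\
&= \frac{1}{3} - \frac{1}{8} = \frac{5}{24} \approx 0.21
\end{aligned}$$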
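
An exact check of this conditional probability with rational arithmetic is sketched below. The helper `antiderivative` is hypothetical and simply encodes the y-antiderivative of (x + y)/8 at a fixed x.

```python
from fractions import Fraction

def antiderivative(y, x):
    """Antiderivative in y of f_XY(x, y) = (x + y) / 8 at fixed x."""
    return (x * y + y ** 2 / 2) / 8

x0 = Fraction(1, 2)
f_x = (x0 + 1) / 4  # marginal density f_X(1/2) = 3/8
numerator = antiderivative(Fraction(1), x0) - antiderivative(Fraction(1, 2), x0)
prob = numerator / f_x
print(prob, round(float(prob), 4))  # 5/24 0.2083
```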

Statistics for Data Science - 2

Week 5 Practice Assignment Solution

1. Let X ∼ Bernoulli(0.6). Let (Y | X = 0) ∼ Exp(1) and (Y | X = 1) ∼ Exp(3). Find the marginal of Y.

a) $0.6e^{-y} + 0.4e^{-3y}$
b) $0.4e^{-y} + 0.6e^{-3y}$
c) $0.6e^{-y} + 1.2e^{-3y}$
d) $0.4e^{-y} + 1.8e^{-3y}$

Solution:
Given that X ∼ Bernoulli(0.6), therefore $p_X(1) = 0.6$ and $p_X(0) = 0.4$.
The marginal density of Y is given by
$$\begin{aligned}
f_Y(y) &= \sum_{x \in T_X} p_X(x)\,f_{Y|X=x}(y) \\
&= p_X(1)\,f_{Y|X=1}(y) + p_X(0)\,f_{Y|X=0}(y) \\
&= 0.6 \times 3e^{-3y} + 0.4e^{-y} \\
&= 1.8e^{-3y} + 0.4e^{-y}
\end{aligned}$$

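2. Let X ∼ Uniform{1, 2, 3}. Let (Y | X = 1) ∼ Exp(1), (Y | X = 2) ∼ Exp(2) and (Y | X = 3) ∼ Normal(0, 4). What is the marginal of Y?
a) $e^{-y} + 2e^{-2y} + \frac{1}{2\sqrt{2\pi}} e^{-y^2/8}$
b) $\frac{1}{3}\left[e^{-y} + 2e^{-2y} + \frac{1}{2\sqrt{2\pi}} e^{-y^2/8}\right]$
c) $\frac{1}{3}\left[e^{-y} + e^{-2y} + \frac{1}{\sqrt{2\pi}} e^{-y^2/4}\right]$
d) $e^{-y} + e^{-2y} + \frac{1}{2\sqrt{2\pi}} e^{-y^2/4}$
Solution:
Given that X ∼ Uniform{1, 2, 3}, therefore $p_X(1) = p_X(2) = p_X(3) = \frac{1}{3}$.

The marginal density of Y is given by
$$\begin{aligned}
f_Y(y) &= \sum_{x \in T_X} p_X(x)\,f_{Y|X=x}(y) \\
&= p_X(1)\,f_{Y|X=1}(y) + p_X(2)\,f_{Y|X=2}(y) + p_X(3)\,f_{Y|X=3}(y) \\
&= \frac{1}{3} \times e^{-y} + \frac{1}{3} \times 2e^{-2y} + \frac{1}{3} \times \frac{e^{-y^2/8}}{2\sqrt{2\pi}} \\
&= \frac{1}{3}\left[e^{-y} + 2e^{-2y} + \frac{1}{2\sqrt{2\pi}} e^{-y^2/8}\right]
\end{aligned}$$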

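3. Let X ∼ Uniform{1, 2}. Let (Y | X = 1) ∼ Exp(2) and (Y | X = 2) ∼ Exp(4). Find the value of $f_{X|Y=3}(2)$.

a) $\frac{2e^{-12}}{e^{-6} + 2e^{-12}}$
b) $\frac{e^{-6}}{e^{-6} + 2e^{-12}}$
c) $\frac{e^{-12}}{e^{-6} + e^{-12}}$
d) $\frac{e^{-6}}{e^{-6} + e^{-12}}$
Solution:
Given that X ∼ Uniform{1, 2}, therefore $p_X(1) = p_X(2) = \frac{1}{2}$.
The marginal density of Y is given by
$$\begin{aligned}
f_Y(y) &= \sum_{x \in T_X} p_X(x)\,f_{Y|X=x}(y) \\
&= p_X(1)\,f_{Y|X=1}(y) + p_X(2)\,f_{Y|X=2}(y) \\
&= \frac{1}{2} \times 2e^{-2y} + \frac{1}{2} \times 4e^{-4y} \\
&= e^{-2y} + 2e^{-4y}
\end{aligned}$$

And
$$f_{X|Y=3}(2) = \frac{p_X(2)\,f_{Y|X=2}(3)}{f_Y(3)} = \frac{\frac{1}{2} \times 4e^{-4 \times 3}}{e^{-2 \times 3} + 2e^{-4 \times 3}} = \frac{2e^{-12}}{e^{-6} + 2e^{-12}}$$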

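4. The joint density function of two continuous random variables X and Y is given as
$$f_{XY}(x, y) = \begin{cases} kxy & 0 < x < 4,\ 0 < y < 1 \\ 0 & \text{otherwise} \end{cases}$$

Find the value of k. Enter your answer correct to two decimals accuracy.
Solution:
We know that for a joint PDF, $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{XY}(x, y)\,dx\,dy = 1$
Since $f_{XY}(x, y)$ is nonzero only in the region $0 < x < 4$, $0 < y < 1$,
$$\begin{aligned}
& \int_0^1 \int_0^4 f_{XY}(x, y)\,dx\,dy = 1 \\
\Rightarrow\ & \int_0^1 \int_0^4 kxy\,dx\,dy = 1 \\
\Rightarrow\ & \int_0^1 ky \left[\frac{x^2}{2}\right]_0^4 dy = 1 \\
\Rightarrow\ & \int_0^1 8ky\,dy = 1 \\
\Rightarrow\ & 8k \left[\frac{y^2}{2}\right]_0^1 = 1 \\
\Rightarrow\ & k = \frac{1}{4} = 0.25
\end{aligned}$$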

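5. Let (X, Y) ∼ Uniform(D), where $D = \{(x, y) : x + y < 4,\ x > 0,\ y > 0\}$. Find the value of $P(2X + Y > 2)$.
a) $\frac{1}{8}$
b) $\frac{7}{8}$
c) $\frac{3}{4}$
d) $\frac{1}{4}$

Solution:

D is a triangle of area $\frac{1}{2} \times 4 \times 4 = 8$ and (X, Y) ∼ Uniform(D), therefore

$$f_{XY}(x, y) = \begin{cases} \frac{1}{8} & (x, y) \in D \\ 0 & \text{otherwise} \end{cases}$$

The region $2x + y \le 2$ inside D is the lower triangle A with vertices (0, 0), (1, 0) and (0, 2).
Area of this lower region (A) will be $\frac{1}{2} \times 1 \times 2 = 1$

$$P(2X + Y > 2) = 1 - P(2X + Y \le 2) = 1 - \frac{|A|}{|D|} = 1 - \frac{1}{8} = \frac{7}{8}$$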

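6. The joint density function of the random variables X and Y is given by

$$f_{XY}(x, y) = \begin{cases} x + y & 0 < x < 1,\ 0 < y < 1 \\ 0 & \text{otherwise} \end{cases}$$

Find the value of $P(X + Y < 1)$.

a) $\frac{1}{3}$
b) $\frac{2}{3}$
c) $\frac{1}{6}$
d) $\frac{3}{4}$

Solution:

$$\begin{aligned}
P(X + Y < 1) &= \int_0^1 \int_0^{1 - y} (x + y)\,dx\,dy \\
&= \int_0^1 \left[\frac{x^2}{2} + xy\right]_0^{1 - y} dy \\
&= \int_0^1 \left(\frac{(1 - y)^2}{2} + (1 - y)y\right) dy \\
&= \left[-\frac{(1 - y)^3}{6} + \frac{y^2}{2} - \frac{y^3}{3}\right]_0^1 \\
&= \left(\frac{1}{2} - \frac{1}{3}\right) - \left(-\frac{1}{6}\right) \\
&= \frac{1}{3}
\end{aligned}$$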
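7. The joint PDF of two continuous random variables X and Y is given by
$$f_{XY}(x, y) = \begin{cases} \frac{2}{7}(5x + 2y) & 0 \le x \le 1,\ 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases}$$
Find the marginal PDF of X.
a) $f_X(x) = \begin{cases} 2x & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$
b) $f_X(x) = \begin{cases} \frac{2}{7}(5x + 1) & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$
c) $f_X(x) = \begin{cases} \frac{2}{7}(3x + 2) & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$
d) $f_X(x) = \begin{cases} \frac{2}{7}(5y + 1) & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$

Solution:
For $0 \le x \le 1$
$$f_X(x) = \int_0^1 \frac{2}{7}(5x + 2y)\,dy = \frac{2}{7}\left[5xy + \frac{2y^2}{2}\right]_0^1 = \frac{2}{7}(5x + 1)$$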

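8. Let X and Y be jointly continuous random variables with joint PDF

$$f_{XY}(x, y) = \begin{cases} k(2 - y) & 0 < x < 4,\ 0 < y < 1 \\ 0 & \text{otherwise} \end{cases}$$

Find the marginal PDF of Y.

a) $f_Y(y) = \begin{cases} \frac{3}{2}\, y(2 - y) & 0 < y < 1 \\ 0 & \text{otherwise} \end{cases}$
b) $f_Y(y) = \begin{cases} 2y & 0 < y < 1 \\ 0 & \text{otherwise} \end{cases}$
c) $f_Y(y) = \begin{cases} \frac{3}{2}(1 - y^2) & 0 < y < 1 \\ 0 & \text{otherwise} \end{cases}$
d) $f_Y(y) = \begin{cases} \frac{2}{3}(2 - y) & 0 < y < 1 \\ 0 & \text{otherwise} \end{cases}$

Solution:
We know that for a joint PDF, $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{XY}(x, y)\,dx\,dy = 1$
Since $f_{XY}(x, y)$ is nonzero only in the region $0 < x < 4$, $0 < y < 1$,
$$\begin{aligned}
& \int_0^1 \int_0^4 k(2 - y)\,dx\,dy = 1 \\
\Rightarrow\ & \int_0^1 \left[k(2 - y)x\right]_0^4 dy = 1 \\
\Rightarrow\ & \int_0^1 4k(2 - y)\,dy = 1 \\
\Rightarrow\ & 4k\left[2y - \frac{y^2}{2}\right]_0^1 = 1 \\
\Rightarrow\ & 4k \times \frac{3}{2} = 1 \\
\Rightarrow\ & k = \frac{1}{6}
\end{aligned}$$
For $0 < y < 1$
$$f_Y(y) = \int_0^4 \frac{1}{6}(2 - y)\,dx = \left[\frac{1}{6}(2 - y)x\right]_0^4 = \frac{2}{3}(2 - y)$$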

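9. Let X and Y be two independent continuous random variables with PDFs $f_X(x)$ and $f_Y(y)$ given as
$$f_X(x) = \begin{cases} 1 & 0 \le x < 1 \\ 0 & \text{otherwise} \end{cases} \qquad f_Y(y) = \begin{cases} y/2 & 0 \le y < 2 \\ 0 & \text{otherwise} \end{cases}$$
Find the value of $P(2X + Y > 1)$.
a) $\frac{1}{24}$
b) $\frac{11}{12}$
c) $\frac{1}{12}$
d) $\frac{23}{24}$

Solution:
Given that X and Y are two independent continuous random variables, $f_{XY}(x, y) = f_X(x)\,f_Y(y)$.
$$f_{XY}(x, y) = \begin{cases} y/2 & 0 \le x < 1,\ 0 \le y < 2 \\ 0 & \text{otherwise} \end{cases}$$
We have to find the value of $P(2X + Y > 1)$.
And
$P(2X + Y > 1) = 1 - P(2X + Y \le 1)$
$$\begin{aligned}
P(2X + Y \le 1) &= \int_0^1 \int_0^{(1 - y)/2} \frac{y}{2}\,dx\,dy \\
&= \int_0^1 \frac{y}{2} \left[x\right]_0^{(1 - y)/2} dy \\
&= \int_0^1 \frac{1}{4}\, y(1 - y)\,dy \\
&= \frac{1}{4}\left[\frac{y^2}{2} - \frac{y^3}{3}\right]_0^1 \\
&= \frac{1}{24}
\end{aligned}$$
$\Rightarrow P(2X + Y > 1) = 1 - \frac{1}{24} = \frac{23}{24}$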

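10. The joint density function of two random variables X and Y is given by
$$f_{XY}(x, y) = \begin{cases} 8xy & 0 \le x \le 1,\ 0 \le y \le x \\ 0 & \text{otherwise} \end{cases}$$

Are X and Y independent?

a) Yes
b) No

Solution:

For $0 \le x \le 1$
$$f_X(x) = \int_0^x 8xy\,dy = 8x\left[\frac{y^2}{2}\right]_0^x = 4x^3$$

For $0 \le y \le 1$, the density is nonzero only when $y \le x \le 1$, so
$$f_Y(y) = \int_y^1 8xy\,dx = 8y\left[\frac{x^2}{2}\right]_y^1 = 4y(1 - y^2)$$

$f_X(x)\,f_Y(y) = 4x^3 \times 4y(1 - y^2) = 16x^3 y(1 - y^2) \ne f_{XY}(x, y)$.

Hence X and Y are not independent.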

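11. Let (X, Y) ∼ Uniform(D), where D = [3, 5] × [2, 4]. Are X and Y independent?

a) Yes
b) No

Solution:
(X, Y) ∼ Uniform(D), therefore
$$f_{XY}(x, y) = \begin{cases} \frac{1}{4} & 3 \le x \le 5,\ 2 \le y \le 4 \\ 0 & \text{otherwise} \end{cases}$$

$$f_X(x) = \int_2^4 \frac{1}{4}\,dy = \left[\frac{1}{4} y\right]_2^4 = \frac{1}{2}$$

$$f_Y(y) = \int_3^5 \frac{1}{4}\,dx = \left[\frac{1}{4} x\right]_3^5 = \frac{1}{2}$$
$f_X(x)\,f_Y(y) = \frac{1}{2} \times \frac{1}{2} = \frac{1}{4} = f_{XY}(x, y)$.
Hence X and Y are independent.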

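12. The joint PDF of two random variables X and Y is given by

$$f_{XY}(x, y) = \begin{cases} 4xy & 0 < x < 1,\ 0 < y < 1 \\ 0 & \text{otherwise} \end{cases}$$

Find the distribution of X | Y = 0.5, that is, $f_{X|Y=0.5}(x)$.

a) $f_{X|Y=0.5}(x) = \begin{cases} 2x & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$
b) $f_{X|Y=0.5}(x) = \begin{cases} 3x^2 & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$
c) $f_{X|Y=0.5}(x) = \begin{cases} 4x^3 & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$
d) $f_{X|Y=0.5}(x) = \begin{cases} 1 & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$
Solution:
For $0 < y < 1$
$$f_Y(y) = \int_0^1 4xy\,dx = 4y\left[\frac{x^2}{2}\right]_0^1 = 2y$$
The distribution of X | Y = 0.5, for $0 < x < 1$, is given by

$$f_{X|Y=0.5}(x) = \frac{f_{XY}(x, 0.5)}{f_Y(0.5)} = \frac{4x \times 0.5}{2 \times 0.5} = 2x$$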

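13. The joint PDF of two random variables X and Y is given by

$$f_{XY}(x, y) = \begin{cases} x^2 + \frac{xy}{3} & 0 \le x \le 1,\ 0 \le y \le 2 \\ 0 & \text{otherwise} \end{cases}$$

Find the value of $P(\frac{1}{4} < X < \frac{1}{2} \mid Y = 1)$.
a) $\frac{83}{96}$
b) $\frac{13}{96}$
c) $\frac{13}{48}$
d) $\frac{35}{48}$

Solution:
For $0 \le y \le 2$
$$f_Y(y) = \int_0^1 \left(x^2 + \frac{xy}{3}\right) dx = \left[\frac{x^3}{3} + \frac{x^2 y}{6}\right]_0^1 = \frac{1}{3} + \frac{1}{6} y$$

$$f_{X|Y=1}(x) = \frac{f_{XY}(x, 1)}{f_Y(1)} = \frac{x^2 + \frac{x \times 1}{3}}{\frac{1}{3} + \frac{1}{6} \times 1} = 2\left(x^2 + \frac{x}{3}\right)$$

$$\begin{aligned}
P\left(\tfrac{1}{4} < X < \tfrac{1}{2} \mid Y = 1\right) &= \int_{1/4}^{1/2} 2\left(x^2 + \frac{x}{3}\right) dx \\
&= 2\left[\frac{x^3}{3} + \frac{x^2}{6}\right]_{1/4}^{1/2} \\
&= 2\left[\left(\frac{1}{24} + \frac{1}{24}\right) - \left(\frac{1}{192} + \frac{1}{96}\right)\right] \\
&= 2\left(\frac{1}{12} - \frac{1}{64}\right) \\
&= \frac{13}{96}
\end{aligned}$$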

Statistics for Data Science - 2
Week 7 Graded assignment

Use the following values of standard normal distribution if needed.

$F_Z(0.71) = 0.76115$, $F_Z(1.26) = 0.89617$, $F_Z(1.58) = 0.94295$, $F_Z(2.58) = 0.99506$, $F_Z(1.96) = 0.975$

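1. Let $X_1, X_2, X_3$ be three independent and identically distributed random variables with mean µ and variance $\sigma^2$. Given below are 3 different formulations of the sample mean. (Observe that E[A] = E[B] = E[C].)

$$A = \frac{X_1 + X_2 + X_3}{3}, \qquad B = 0.1X_1 + 0.3X_2 + 0.6X_3, \qquad C = 0.2X_1 + 0.3X_2 + 0.5X_3$$

Choose the correct option from the following:

(a) Var(A) = Var(B) = Var(C)
(b) Var(A) ≥ Var(B) ≥ Var(C)
(c) Var(A) ≤ Var(B) ≤ Var(C)
(d) Var(A) ≤ Var(C) ≤ Var(B)

Solution:
Let $X_1, X_2, X_3 \sim$ i.i.d. X, where E[X] = µ, Var(X) = $\sigma^2$. Since the $X_i$ are independent, $\mathrm{Var}(\sum_i w_i X_i) = \sum_i w_i^2\,\mathrm{Var}(X_i)$.

$$\begin{aligned}
\mathrm{Var}(A) &= \mathrm{Var}\left(\frac{X_1 + X_2 + X_3}{3}\right) \\
&= \frac{1}{9}\left(\mathrm{Var}[X_1] + \mathrm{Var}[X_2] + \mathrm{Var}[X_3]\right) \\
&= \frac{1}{9}(3\sigma^2) = \frac{\sigma^2}{3}
\end{aligned}$$

$$\begin{aligned}
\mathrm{Var}(B) &= \mathrm{Var}(0.1X_1 + 0.3X_2 + 0.6X_3) \\
&= 0.01\,\mathrm{Var}[X_1] + 0.09\,\mathrm{Var}[X_2] + 0.36\,\mathrm{Var}[X_3] \\
&= 0.46\sigma^2
\end{aligned}$$

$$\begin{aligned}
\mathrm{Var}(C) &= \mathrm{Var}(0.2X_1 + 0.3X_2 + 0.5X_3) \\
&= 0.04\,\mathrm{Var}[X_1] + 0.09\,\mathrm{Var}[X_2] + 0.25\,\mathrm{Var}[X_3] \\
&= 0.38\sigma^2
\end{aligned}$$

Therefore, Var(B) ≥ Var(C) ≥ Var(A), which is option (d).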
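
For independent, identically distributed samples, the variance of a weighted sum is σ² times the sum of the squared weights. The sketch below computes that factor exactly for the three estimators; `variance_factor` is a hypothetical helper name.

```python
from fractions import Fraction

def variance_factor(weights):
    """Var(sum of w_i * X_i) / sigma^2 for i.i.d. X_i: the sum of squared weights."""
    return sum(Fraction(w) ** 2 for w in weights)

factor_a = variance_factor(["1/3", "1/3", "1/3"])
factor_b = variance_factor(["0.1", "0.3", "0.6"])
factor_c = variance_factor(["0.2", "0.3", "0.5"])
print(factor_a, factor_b, factor_c)  # 1/3 23/50 19/50
```

The equal-weight average has the smallest factor, and the ordering Var(A) ≤ Var(C) ≤ Var(B) follows.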

2. A random sample of size 25 is collected from a normal population with mean of 50 and
standard deviation of 5. Find the variance of the sample mean.
Solution:
We know that the variance of the sample mean $\bar{X}$ is given by

$$\mathrm{Var}[\bar{X}] = \frac{\sigma^2}{n} = \frac{5^2}{25} = 1$$

3. Let $X_1, X_2, \ldots, X_{50} \sim$ i.i.d. Poisson(0.04) and let $Y = \sum_{i=1}^{50} X_i$. Use the Central Limit theorem to find $P(Y > 3)$. Enter the answer correct to 2 decimal places.
Solution:
Let X ∼ Poisson(0.04).
Consider the samples $X_1, X_2, \ldots, X_{50}$ from X.
E[X] = Var[X] = 0.04
$E[Y] = E\left[\sum_{i=1}^{50} X_i\right] = 50 \times 0.04 = 2$, $\mathrm{Var}[Y] = \mathrm{Var}\left[\sum_{i=1}^{50} X_i\right] = 50 \times 0.04 = 2$

To find: $P(Y > 3)$.

By CLT, we know that approximately
$$\frac{Y - n\mu}{\sigma\sqrt{n}} \sim \mathrm{Normal}(0, 1) \;\Rightarrow\; \frac{Y - 2}{\sqrt{2}} \sim \mathrm{Normal}(0, 1)$$
Now,

$$\begin{aligned}
P(Y > 3) &= P(Y - 2 > 1) \\
&= P\left(\frac{Y - 2}{\sqrt{2}} > \frac{3 - 2}{\sqrt{2}}\right) \\
&= P(Z > 0.707) \\
&= 1 - F_Z(0.707) = 1 - 0.76 = 0.24
\end{aligned}$$
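
The CLT estimate can be reproduced with `statistics.NormalDist`. For comparison only, the sketch also evaluates the exact tail, using the standard fact that a sum of independent Poisson variables is Poisson with the summed rate (so Y ∼ Poisson(2)); this comparison is an addition for illustration, and the question itself asks for the CLT value.

```python
import math
from statistics import NormalDist

mean = var = 50 * 0.04  # E[Y] = Var[Y] = 2
z = (3 - mean) / math.sqrt(var)
clt_estimate = 1 - NormalDist().cdf(z)

# Exact tail of Poisson(2): 1 - P(Y <= 3)
exact = 1 - sum(math.exp(-mean) * mean ** k / math.factorial(k) for k in range(4))
print(round(clt_estimate, 2), round(exact, 2))
```

With only 50 small-rate summands the normal approximation is fairly rough, which is why the two numbers differ noticeably.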

4. Let the moment generating function of a random variable X be given by
$$M_X(\lambda) = \frac{1}{4} e^{-2\lambda} + \frac{1}{40} + \frac{3}{10} e^{-\lambda} + \frac{3}{40} e^{2\lambda} + \frac{7}{20} e^{\lambda}$$
Find the distribution of X.

(a)
X           −2     −1     0      1      2
P(X = x)    1/4    3/40   3/10   1/40   7/20

(b)
X           −2     −1     0      1      2
P(X = x)    1/4    1/40   3/10   3/40   7/20

(c)
X           −2     −1     0      1      2
P(X = x)    1/4    3/10   1/40   7/20   3/40

(d)
X           −2     −1     0      1      2
P(X = x)    1/4    3/40   1/40   7/20   3/10

Solution:
The MGF of a discrete random variable X with the PMF $f_X(x) = P(X = x)$, $x \in T_X$, is given by
$$M_X(\lambda) = E[e^{\lambda X}] = \sum_{x \in T_X} P(X = x)\,e^{\lambda x}$$

Now, the MGF of the random variable X is given as

$$M_X(\lambda) = \frac{1}{4} e^{-2\lambda} + \frac{1}{40} + \frac{3}{10} e^{-\lambda} + \frac{3}{40} e^{2\lambda} + \frac{7}{20} e^{\lambda}$$

Matching the coefficient of $e^{\lambda x}$ with $P(X = x)$, the distribution of X is given by

X           −2     −1     0      1      2
P(X = x)    1/4    3/10   1/40   7/20   3/40

5. A fair coin is tossed 1000 times. Use CLT to compute the probability that head appears at most 520 times. Enter the answer correct to 3 decimal places.
Solution:
Define a random variable X such that
$$X = \begin{cases} 1 & \text{if head appears on tossing a fair coin} \\ 0 & \text{otherwise} \end{cases}$$

Therefore, $E[X] = \mu = \frac{1}{2}$ and $\mathrm{Var}(X) = \sigma^2 = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}$

Let $X_1, X_2, \ldots, X_{1000}$ be the outcomes on tossing the fair coin 1000 times.

Notice that $X_1 + X_2 + \ldots + X_{1000}$ will denote the number of times head appears in 1000 tosses.
Let $S = X_1 + X_2 + \ldots + X_{1000}$
To find: $P(S \le 520)$
By CLT, we know that approximately
$$\frac{S - 1000\mu}{\sigma\sqrt{n}} \sim \mathrm{Normal}(0, 1) \;\Rightarrow\; \frac{S - 500}{5\sqrt{10}} \sim \mathrm{Normal}(0, 1)$$
Now,

$$\begin{aligned}
P(S \le 520) &= P(S - 500 \le 20) \\
&= P\left(\frac{S - 500}{5\sqrt{10}} \le \frac{20}{5\sqrt{10}}\right) \\
&= P(Z \le 1.26) \\
&= 0.896
\end{aligned}$$

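6. A fair die is rolled 100 times. Let X denote the number of times six is obtained. Find a bound for the probability that $\frac{X}{100}$ differs from $\frac{1}{6}$ by less than 0.1 using the weak law of large numbers.
(a) at least $\frac{5}{36}$
(b) at least $\frac{31}{36}$
(c) at most $\frac{5}{36}$
(d) at most $\frac{31}{36}$
Solution:
X denotes the number of times six is obtained on rolling a fair die 100 times.
Let $X_1, X_2, \ldots, X_{100}$ be 100 i.i.d. samples such that
$$X_i = \begin{cases} 1 & \text{if six appears on rolling a fair die} \\ 0 & \text{otherwise} \end{cases}$$
$E[X_i] = \mu = \frac{1}{6}$ and $\mathrm{Var}(X_i) = \sigma^2 = \frac{5}{36}$
Notice that $X = X_1 + X_2 + X_3 + \ldots + X_{100}$
To find: Bound on $P\left(\left|\frac{X}{100} - \frac{1}{6}\right| < 0.1\right)$.

By the weak law of large numbers, we have

$$\begin{aligned}
& P(|\bar{X} - \mu| < \delta) \ge 1 - \frac{\sigma^2}{n\delta^2} \\
\Rightarrow\ & P\left(\left|\frac{X}{100} - \frac{1}{6}\right| < 0.1\right) \ge 1 - \frac{5}{36 \times 100 \times 0.01} \\
\Rightarrow\ & P\left(\left|\frac{X}{100} - \frac{1}{6}\right| < 0.1\right) \ge 1 - \frac{5}{36} = \frac{31}{36}
\end{aligned}$$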
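
The bound can be evaluated exactly with rational arithmetic. This sketch only restates the inequality used above; the variable names are illustrative.

```python
from fractions import Fraction

n = 100
p = Fraction(1, 6)
variance = p * (1 - p)  # Var(X_i) = 5/36 for an indicator of rolling a six
delta = Fraction(1, 10)
# P(|X/n - p| < delta) >= 1 - variance / (n * delta^2)
lower_bound = 1 - variance / (n * delta ** 2)
print(lower_bound)  # 31/36
```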

7. Let $X_1, X_2, \ldots, X_{500} \sim$ i.i.d. Normal(0, 1). Evaluate $P(X_1^2 + X_2^2 + \ldots + X_{500}^2 > 550)$ using the Central Limit theorem. Enter the answer correct to 2 decimal places.
Hint: $(X_1^2 + X_2^2 + \ldots + X_{500}^2) \sim \mathrm{Gamma}(250, 0.5)$.
Solution:
Given $X_1, \ldots, X_{500} \sim$ i.i.d. Normal(0, 1).
We know that if $X \sim \mathrm{Normal}(0, 1)$, then $X^2 \sim \mathrm{Gamma}\left(\frac{1}{2}, \frac{1}{2}\right)$.
Also, the sum of n independent Gamma(α, β) random variables is Gamma(nα, β).
Therefore, $X_i^2 \sim \mathrm{Gamma}\left(\frac{1}{2}, \frac{1}{2}\right)$ for all i,
and $(X_1^2 + X_2^2 + \ldots + X_{500}^2) \sim \mathrm{Gamma}(250, 0.5)$

Let $Y = Y_1 + Y_2 + \ldots + Y_{500}$, where $Y_i = X_i^2$ for all i : 1 → 500

$E[Y_i] = \frac{0.5}{0.5} = 1$ and $\mathrm{Var}[Y_i] = \frac{0.5}{0.25} = 2$, for i : 1 → 500
$E[Y] = \frac{250}{0.5} = 500$ and $\mathrm{Var}[Y] = \frac{250}{0.5^2} = 1000$

To find: $P(Y > 550)$

By CLT, we know that approximately
$$\frac{Y - 500\mu}{\sigma\sqrt{n}} \sim \mathrm{Normal}(0, 1) \;\Rightarrow\; \frac{Y - 500}{10\sqrt{10}} \sim \mathrm{Normal}(0, 1)$$
Now,

$$\begin{aligned}
P(Y > 550) &= P(Y - 500 > 50) \\
&= P\left(\frac{Y - 500}{10\sqrt{10}} > \frac{5}{\sqrt{10}}\right) \\
&= P(Z > 1.58) \\
&= 1 - F_Z(1.58) = 1 - 0.94 = 0.06
\end{aligned}$$

Use the below information to answer questions 8 and 9.

Let X be a random variable having the gamma distribution with the parameters α = 2n
and β = 1.
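Hint:
• If $X \sim \text{Gamma}(\alpha, \beta)$, then $E[X] = \frac{\alpha}{\beta}$ and $\mathrm{Var}[X] = \frac{\alpha}{\beta^2}$.
• The sum of n independent Gamma(α, β) random variables is Gamma(nα, β).

8. Use the Weak Law of Large Numbers to find the value of n such that
$$P\left(\left|\frac{X}{2n} - 1\right| > 0.01\right) < 0.01$$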

(a) 505000
(b) 470000
(c) 498000
(d) 482000

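Solution:
Given $X \sim \text{Gamma}(2n, 1)$.
Let $X = X_1 + X_2 + X_3 + \ldots + X_{2n}$, where $X_i \sim$ i.i.d. Gamma(1, 1).

$E[X_i] = \mu = 1$ and $\mathrm{Var}(X_i) = \sigma^2 = 1$

Let $\bar{X} = \frac{X}{2n}$ be the sample mean of the 2n terms. Then $E[\bar{X}] = 1$ and $\mathrm{Var}[\bar{X}] = \frac{1}{2n}$.

To find: The value of n such that $P\left(\left|\frac{X}{2n} - 1\right| > 0.01\right) < 0.01$.

By the weak law of large numbers applied to the 2n samples, we have
$$P(|\bar{X} - \mu| > \delta) \le \frac{\sigma^2}{2n\,\delta^2}$$
$$\Rightarrow P\left(\left|\frac{X}{2n} - 1\right| > 0.01\right) \le \frac{1}{2n \times 0.01^2}$$

Therefore, it is enough that $\frac{1}{2n \times 0.01^2} < 0.01 \Rightarrow 2n > \frac{1}{0.01^3} \Rightarrow n > 500000$.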

9. Use CLT to find the value of n such that
$$P\left(\left|\frac{X}{2n} - 1\right| > 0.01\right) < 0.01$$

Hint: Use FZ (2.58) = 0.995, FZ (1.96) = 0.975 if needed.

(a) 34570
(b) 33500
(c) 32500
(d) 30000

Solution:
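$E[X_1 + \ldots + X_{2n}] = 2n$ and $\mathrm{Var}[X_1 + \ldots + X_{2n}] = 2n$

To find: The value of n such that $P\left(\left|\frac{X}{2n} - 1\right| > 0.01\right) < 0.01$.

By CLT applied to the 2n terms (each with µ = 1 and σ = 1), we know that
$$\frac{X - 2n\mu}{\sigma\sqrt{2n}} \sim \text{Normal}(0, 1) \Rightarrow \frac{X - 2n}{\sqrt{2n}} \sim \text{Normal}(0, 1)$$
Now,
$$\begin{aligned}
& P\left(\left|\frac{X}{2n} - 1\right| > 0.01\right) < 0.01 \\
\Rightarrow\ & P\left(\left|\frac{X_1 + \ldots + X_{2n}}{2n} - 1\right| > 0.01\right) < 0.01 \\
\Rightarrow\ & P\left(\left|\frac{X_1 + \ldots + X_{2n} - 2n}{\sqrt{2n}}\right| > 0.01\sqrt{2n}\right) < 0.01 \\
\Rightarrow\ & P(|Z| > 0.01\sqrt{2n}) < 0.01 \\
\Rightarrow\ & 2P(Z > 0.01\sqrt{2n}) < 0.01 \\
\Rightarrow\ & 1 - F_Z(0.01\sqrt{2n}) < \frac{0.01}{2} \\
\Rightarrow\ & F_Z(0.01\sqrt{2n}) > 0.995 \\
\Rightarrow\ & F_Z(0.01\sqrt{2n}) > F_Z(2.58) \\
\Rightarrow\ & 0.01\sqrt{2n} > 2.58 \\
\Rightarrow\ & n > 33282
\end{aligned}$$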
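Questions 8 and 9 ask for the same guarantee under two different tools, so it is instructive to compute both thresholds side by side. The sketch below is an illustration only; the variable names are assumptions, and the CLT quantile comes from `statistics.NormalDist.inv_cdf` rather than the rounded hint value 2.58.

```python
from statistics import NormalDist

delta = 0.01   # allowed deviation of X / (2n) from 1
eps = 0.01     # allowed probability of exceeding that deviation

# Weak law (Chebyshev): need 1 / (2n * delta^2) < eps.
two_n_wlln = 1 / (eps * delta ** 2)

# CLT: need delta * sqrt(2n) > z, where F_Z(z) = 1 - eps / 2.
z = NormalDist().inv_cdf(1 - eps / 2)
two_n_clt = (z / delta) ** 2

n_wlln = two_n_wlln / 2
n_clt = two_n_clt / 2
print(n_wlln, n_clt)
```

The CLT threshold printed here is a little below 33282 because the exact quantile is slightly smaller than 2.58; either way it is far smaller than the weak-law requirement, which holds for every distribution with the given variance and is therefore much more conservative.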

10. Let the time taken (in hours) for failure of an electric bulb follow the exponential distri-
bution with the parameter 0.05. Suppose that 100 such light bulbs say L1 , L2 , . . . , L100
are used in the following manner: For every i, as soon as the light Li fails, Li+1 be-
comes operative, where i : 1 → 99 (i.e. If L1 fails, L2 becomes operative, if L2 fails, L3
becomes operative, and so on). Let the total time of operation of 100 bulbs be denoted
by T. Using CLT, compute the probability that T exceeds 2500 hours.
(a) FZ (1.5)
(b) 1 − FZ (1.5)
(c) FZ (2.5)
(d) 1 − FZ (2.5)
Solution:
Given that the time to failure (in hours) of an electric bulb has the exponential distribution with the parameter λ = 0.05.
The bulbs are used one after another: as soon as $L_1$ fails, $L_2$ becomes operative; when $L_2$ fails, $L_3$ becomes operative; and so on. Hence the total time of operation is the sum of the 100 individual lifetimes.
We know that if $X \sim \text{Gamma}(\alpha, \beta)$ with parameter α = 1, then $X \sim \text{Exp}(\beta)$.
Also, the sum of n i.i.d. Exp(λ) random variables is Gamma(n, λ).

Since each of the $L_i$'s is exponentially distributed with parameter 0.05,

$L_1 + \ldots + L_{100} \sim \text{Gamma}(n\alpha, \beta) = \text{Gamma}(100, 0.05)$

Let $T = L_1 + \ldots + L_{100}$

$E[L_i] = \mu = \frac{1}{0.05} = 20$ and $\mathrm{SD}[L_i] = \sigma = \frac{1}{0.05} = 20$

To find: $P(T \ge 2500)$
By CLT, with n = 100, we know that
$$\frac{T - 100\mu}{\sigma\sqrt{n}} \sim \text{Normal}(0, 1) \Rightarrow \frac{T - 2000}{20\sqrt{100}} \sim \text{Normal}(0, 1)$$
Now,
$$\begin{aligned}
P(T \ge 2500) &= P(T - 2000 \ge 500) \\
&= P\left(\frac{T - 2000}{200} \ge \frac{500}{200}\right) \\
&= P(Z \ge 2.5) \\
&= 1 - F_Z(2.5)
\end{aligned}$$

11. Suppose speeds of vehicles on a particular road are normally distributed with mean 36 mph and standard deviation 2 mph. Find the probability that the mean speed $\bar{X}$ of 20 randomly selected vehicles is between 35 and 38 mph.
(a) $F_Z(\sqrt{5}) - F_Z(-\sqrt{5})$
(b) $F_Z(\sqrt{20}) - F_Z(-\sqrt{20})$
(c) $F_Z(\sqrt{38}) - F_Z(-\sqrt{35})$
(d) $F_Z(\sqrt{20}) - F_Z(-\sqrt{5})$

Solution:
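Let X denote the speed of a vehicle on a particular road.
Given that $X \sim \text{Normal}(36, 2^2)$.
Therefore, µ = 36 and σ = 2.
Select samples $X_1, X_2, \ldots, X_{20}$ such that $X_1, X_2, \ldots, X_{20} \sim$ i.i.d. X.

Let $\bar{X} = \frac{X_1 + X_2 + \ldots + X_{20}}{20}$ and $S = X_1 + X_2 + \ldots + X_{20}$.

To find: $P(35 < \bar{X} < 38)$
From CLT, we know that
$$\frac{X_1 + X_2 + \ldots + X_n - nE[X]}{\sigma\sqrt{n}} \sim \text{Normal}(0, 1)$$
$$\Rightarrow \frac{S - n\mu}{\sigma\sqrt{n}} \sim \text{Normal}(0, 1)$$
$$\Rightarrow \frac{S - 36(20)}{2\sqrt{20}} \sim \text{Normal}(0, 1)$$

Now,
$$\begin{aligned}
P(35 < \bar{X} < 38) &= P\left(35 < \frac{S}{20} < 38\right) \\
&= P\left(-1 < \frac{S}{20} - 36 < 2\right) \\
&= P\left(-1 < \frac{S - 36(20)}{20} < 2\right) \\
&= P\left(\frac{-\sqrt{20}}{2} < \frac{S - 36(20)}{2\sqrt{20}} < \sqrt{20}\right) \\
&= P\left(-\sqrt{5} < \frac{S - 36(20)}{2\sqrt{20}} < \sqrt{20}\right) \\
&= F_Z(\sqrt{20}) - F_Z(-\sqrt{5})
\end{aligned}$$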
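The standardisation above can be verified numerically by comparing the probability computed directly from the distribution of the sample mean with the standardised expression. A short sketch using the standard-library `statistics.NormalDist` (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 36, 2, 20

# Sample mean of n i.i.d. Normal(mu, sigma^2) variables is Normal(mu, sigma^2 / n).
xbar = NormalDist(mu, sigma / sqrt(n))
direct = xbar.cdf(38) - xbar.cdf(35)

# Same probability written with the standard normal CDF F_Z.
Z = NormalDist()
via_z = Z.cdf(sqrt(20)) - Z.cdf(-sqrt(5))

print(direct, via_z)
```

Both expressions should agree up to floating-point error, which confirms option (d).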

Statistics for Data Science - 2
Week 7 practice Assignment
Statistics from samples and Limit theorems

1. If X, Y ∼ i.i.d. Normal(0, 4), what will be the variance of $\frac{X}{Y}$?
(a) 4
(b) 2
(c) 1
(d) Undefined

Solution:
We know that if X, Y ∼ i.i.d. Normal(0, σ²), then $\frac{X}{Y} \sim \text{Cauchy}(0, 1)$, and the variance of the Cauchy distribution is undefined.
Therefore, option(d) is correct.

2. A population has mean 60 and standard deviation 6. Random samples of size 100 from
this population are collected independently. Find the expected value of the sample mean.
Solution:
We know that expected value of the sample mean X is given by

E[X] = µ
= 60

3. Let $X_1, X_2, X_3, X_4$ and $X_5 \sim$ i.i.d. Normal(2, 25). Calculate $P(2X_1 + X_2 + 3X_3 + X_4 + X_5 \ge 10)$.

1. FZ (0.3)
2. 1 − FZ (0.3)
3. FZ (−0.3)
4. 1 − FZ (−0.3)

Solution:

We know that a linear combination of independent Normal random variables is again Normally distributed.
Hence, 2X1 + X2 + 3X3 + X4 + X5 will follow a Normal distribution.
Let Y = 2X1 + X2 + 3X3 + X4 + X5
E[Y ] = E[2X1 + X2 + 3X3 + X4 + X5 ] = (2 + 1 + 3 + 1 + 1)E[X] = 16
Var(Y ) = Var(2X1 + X2 + 3X3 + X4 + X5 ) = (4 + 1 + 9 + 1 + 1)Var(X) = 400

It implies that Y ∼ Normal(16, 202 ).


To find: P (Y ≥ 10)

Now,

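$$\begin{aligned}
P(Y \ge 10) &= P(Y - 16 \ge -6) \\
&= P\left(\frac{Y - 16}{20} \ge \frac{-6}{20}\right) \\
&= P\left(\frac{Y - 16}{20} \ge -0.3\right) \\
&= P(Z \ge -0.3) \\
&= 1 - P(Z < -0.3) \\
&= 1 - F_Z(-0.3)
\end{aligned}$$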

4. Random samples of size 100 are collected from a population of unknown parameters. If
the variance of the sample mean is 36, what will be the standard deviation of the actual
population?
Solution:
We know that the variance of the sample mean is given by $\frac{\sigma^2}{n}$, where σ is the standard deviation of the actual population and n is the sample size.

By the given information, we have
$$\frac{\sigma^2}{n} = 36 \Rightarrow \frac{\sigma^2}{100} = 36 \Rightarrow \sigma^2 = 3600 \Rightarrow \sigma = 60$$

Therefore, standard deviation of the actual population is 60.

5. A random sample of size 50 is collected from a population with a standard deviation of 5. Find the upper bound on the probability that the sample mean will be at least 10 away from the actual mean using the weak law of large numbers. Write your answer correct to three decimal places.

Solution:
Given: standard deviation of the population, σ = 5
Sample size, n = 50

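To find: upper bound on $P(|\bar{X} - \mu| \ge 10)$, where $\bar{X}$ and µ are the sample mean and population mean, respectively.

Now, by the weak law of large numbers, we have
$$P(|\bar{X} - \mu| \ge \delta) \le \frac{\sigma^2}{n\delta^2}$$
$$\Rightarrow P(|\bar{X} - \mu| \ge 10) \le \frac{25}{100 \times 50}$$
$$\Rightarrow P(|\bar{X} - \mu| \ge 10) \le 0.005$$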

6. A study shows that the average daily sleeping hours of teenagers is ten hours with a
standard deviation of two hours. If a sample of 100 teenagers is collected, what will be
the probability that the mean of the sleeping hours of these 100 teenagers is at least 0.4
hours away from the population mean? Assume that each observation in the sample is
independent. Assume that FZ denotes the CDF of standard normal distribution.

(a) 1 + FZ (−2) − FZ (2)


(b) 1 − FZ (−2) + FZ (2)
(c) FZ (2) − FZ (−2)
(d) FZ (2)

Solution:
let X denote the average daily sleeping hours of teenagers.
Given: standard deviation of X, σ = 2
Sample size, n = 100

To find: P (|X − µ| ≥ 0.4) where X and µ are sample mean and population mean,
respectively.

Let S = X1 + X2 + . . . X100 where Xi denotes the ith sample.


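By CLT, we know that $\frac{S - n\mu}{\sigma\sqrt{n}} \sim \text{Normal}(0, 1) \Rightarrow \frac{S - 100\mu}{20} \sim Z$ (Standard Normal).

Now,
$$\begin{aligned}
P(|\bar{X} - \mu| \ge 0.4) &= P\left(\left|\frac{S}{n} - \mu\right| \ge 0.4\right) \\
&= P\left(\left|\frac{S - n\mu}{n}\right| \ge 0.4\right) \\
&= P\left(\left|\frac{S - n\mu}{\sigma\sqrt{n}}\right| \ge \frac{0.4\sqrt{n}}{\sigma}\right) \\
&= P(|Z| \ge 2) \\
&= P(Z \ge 2) + P(Z \le -2) \\
&= 1 - P(Z \le 2) + P(Z \le -2) \\
&= 1 - F_Z(2) + F_Z(-2)
\end{aligned}$$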

7. What is the fourth moment of the Normal(0, 4) distribution?


Solution:
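$$\begin{aligned}
M_X(\lambda) = E[e^{\lambda X}] &= E\left[1 + \lambda X + \frac{\lambda^2 X^2}{2!} + \frac{\lambda^3 X^3}{3!} + \ldots\right] \\
&= 1 + \lambda E[X] + \frac{\lambda^2 E[X^2]}{2!} + \frac{\lambda^3 E[X^3]}{3!} + \ldots
\end{aligned}$$
In the moment generating function, the coefficient of λ gives the first moment ($E[X]$), the coefficient of $\frac{\lambda^2}{2!}$ gives the second moment ($E[X^2]$) and similarly, the coefficient of $\frac{\lambda^k}{k!}$ gives the kth moment ($E[X^k]$).

The moment generating function of Normal(0, σ²) is given by $e^{\lambda^2\sigma^2/2}$.
Let $N \sim \text{Normal}(0, 2^2)$.
$$\begin{aligned}
M_N(\lambda) &= e^{\lambda^2 2^2/2} \\
&= 1 + \frac{\lambda^2 2^2}{2} + \frac{\lambda^4 2^4}{2!\,(4)} + \ldots \\
&= 1 + \frac{\lambda^2 2^2}{2} + 48\,\frac{\lambda^4}{4!} + \ldots
\end{aligned}$$

Therefore, the 4th moment of Normal(0, 2²) = coefficient of $\frac{\lambda^4}{4!}$ = 48.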
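The coefficient-reading argument above can be mechanised: the series of $e^{\sigma^2\lambda^2/2}$ contains only even powers of λ, and the term in $\lambda^{2j}$ has coefficient $(\sigma^2/2)^j / j!$. The sketch below, with the hypothetical helper name `normal_moment`, multiplies that coefficient by k! to recover the kth moment exactly.

```python
from fractions import Fraction
from math import factorial


def normal_moment(k, variance):
    """E[X^k] for X ~ Normal(0, variance), read off the series of exp(variance * lam^2 / 2)."""
    if k % 2 == 1:
        return Fraction(0)                    # odd powers of lam never appear
    j = k // 2                                # lam^k comes from the j-th term of the exponential series
    coeff = Fraction(variance, 2) ** j / factorial(j)
    return coeff * factorial(k)               # moment = k! * (coefficient of lam^k)


print(normal_moment(4, 4))
```

For k = 2 the same function returns the variance itself, which is a quick sanity check on the indexing.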

8. Let $X \sim \text{Gamma}(2, \frac{1}{2})$ and $Y \sim \text{Gamma}(5, \frac{1}{2})$ be two independent random variables. What will be the expected value of $\frac{X}{X + Y}$? Write your answer correct to two decimal places.
Solution:
We know that if $X \sim \text{Gamma}(\alpha, k)$ and $Y \sim \text{Gamma}(\beta, k)$ are two independent random variables, then $\frac{X}{X + Y} \sim \text{Beta}(\alpha, \beta)$.

Given that $X \sim \text{Gamma}(2, \frac{1}{2})$ and $Y \sim \text{Gamma}(5, \frac{1}{2})$ are two independent random variables, it follows that
$$\frac{X}{X + Y} \sim \text{Beta}(2, 5)$$
Therefore, $E\left[\frac{X}{X + Y}\right] = \frac{2}{2 + 5} = \frac{2}{7} \approx 0.29$

9. A study says that the delivery time of pizzas has a standard deviation of 10 minutes. A pizza shop collected the data of some deliveries and their delivery time. The probability that the mean delivery time of this sample is at least $\sqrt{5}$ minutes away from the actual mean delivery time is at most $\frac{1}{5}$ as per the weak law of large numbers. What is the size of the sample?
Solution:
Let X denote the delivery time of pizzas.
Given that σ = 10.
To find: size n of the sample such that $P(|\bar{X} - \mu| \ge \sqrt{5}) \le \frac{1}{5}$ ...(1)
By the weak law of large numbers, we have
$$P(|\bar{X} - \mu| \ge \delta) \le \frac{\sigma^2}{n\delta^2}$$
$$\Rightarrow P(|\bar{X} - \mu| \ge \sqrt{5}) \le \frac{100}{n \times 5} \quad ...(2)$$

By equations (1) and (2), we have
$$\frac{1}{5} = \frac{100}{5n} \Rightarrow n = 100$$

10. A company sells eggs whose weights are normally distributed with a mean of 70g and a
standard deviation of 2g. Suppose that these eggs are sold in packages that each contain
four eggs. Assume that the weight of each egg is independent. What is the probability
that the mean weight of the four eggs in a package is greater than 68.5g? Write your
answer correct to two decimal places.
(Hint: Use the fact that linear combination of normal distributions is again a normal
distribution. FZ (−1.5) = 0.066)

Solution:
Let X denote the weight of an egg.
Given that E[X] = µ = 70
SD(X) = σ = 2
$X \sim \text{Normal}(70, 2^2)$
Let $X_1, X_2, X_3$ and $X_4$ denote the weights of the four eggs in a package.

Suppose that
$$\bar{X} = \frac{X_1 + X_2 + X_3 + X_4}{4}$$

To find: $P(\bar{X} > 68.5)$

We know that a linear combination of independent Normal random variables is again Normally distributed.
It implies that $\bar{X}$ is Normally distributed.

$E[\bar{X}] = \mu = 70$ and $\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n} = \frac{4}{4} = 1$

It implies that $\bar{X} \sim \text{Normal}(70, 1) \Rightarrow \bar{X} - 70 \sim \text{Normal}(0, 1)$

Now,
$$\begin{aligned}
P(\bar{X} > 68.5) &= P(\bar{X} - 70 > -1.5) \\
&= P(Z > -1.5) \\
&= 1 - F_Z(-1.5) \\
&= 1 - 0.066 = 0.93
\end{aligned}$$

11. Let X1 , X2 , X3 , . . . Xn be i.i.d. Poisson(4). What should be the value of n such that
P (3.8 ≤ X ≤ 4.2) ≥ 0.95? [2 marks]
(Hint: Use FZ (1.96) = 0.975)

1. at least 200
2. at least 385
3. at least 450
4. at least 585

Solution:
Given that X1 , X2 , X3 , . . . Xn ∼ i.i.d. Poisson(4)

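Mean of the distribution = µ = 4
Variance of the distribution = σ² = 4
Let $S = X_1 + X_2 + \ldots + X_n$ and
$$\bar{X} = \frac{X_1 + X_2 + \ldots + X_n}{n}$$

To find: value of n such that $P(3.8 \le \bar{X} \le 4.2) \ge 0.95$

By CLT, we know that
$$\frac{S - n\mu}{\sigma\sqrt{n}} \sim \text{Normal}(0, 1) \Rightarrow \frac{S - 4n}{2\sqrt{n}} \sim \text{Normal}(0, 1) \quad ...(1)$$

$$\begin{aligned}
& P(3.8 \le \bar{X} \le 4.2) \ge 0.95 \\
\Rightarrow\ & P\left(3.8 \le \frac{S}{n} \le 4.2\right) \ge 0.95 \\
\Rightarrow\ & P\left(-0.2 \le \frac{S}{n} - 4 \le 0.2\right) \ge 0.95 \\
\Rightarrow\ & P\left(-0.2 \le \frac{S - 4n}{n} \le 0.2\right) \ge 0.95 \\
\Rightarrow\ & P\left(-0.1 \le \frac{S - 4n}{2n} \le 0.1\right) \ge 0.95 \\
\Rightarrow\ & P\left(-0.1\sqrt{n} \le \frac{S - 4n}{2\sqrt{n}} \le 0.1\sqrt{n}\right) \ge 0.95 \\
\Rightarrow\ & F_Z(0.1\sqrt{n}) - F_Z(-0.1\sqrt{n}) \ge 0.95 \\
\Rightarrow\ & F_Z(0.1\sqrt{n}) - (1 - F_Z(0.1\sqrt{n})) \ge 0.95 \\
\Rightarrow\ & 2F_Z(0.1\sqrt{n}) - 1 \ge 0.95 \\
\Rightarrow\ & F_Z(0.1\sqrt{n}) \ge 0.975 \\
\Rightarrow\ & 0.1\sqrt{n} \ge 1.96 \\
\Rightarrow\ & n \ge 384.16
\end{aligned}$$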
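The chain of inequalities above reduces to $n \ge (z\,\sigma/\delta)^2$, where z is the standard normal quantile for the required confidence and δ is the allowed deviation of the sample mean. A minimal sketch, assuming the hypothetical helper name `min_n_for_mean` and using the exact quantile from `statistics.NormalDist`:

```python
from math import ceil
from statistics import NormalDist


def min_n_for_mean(sigma, half_width, confidence):
    """Smallest n with P(|sample mean - mu| <= half_width) >= confidence (normal approximation)."""
    z = NormalDist().inv_cdf((1 + confidence) / 2)   # two-sided quantile, about 1.96 for 0.95
    return ceil((z * sigma / half_width) ** 2)


# Poisson(4): sigma = 2, sample mean within 0.2 of the true mean with probability 0.95.
n_poisson = min_n_for_mean(2, 0.2, 0.95)
print(n_poisson)
```

Rounding up to the next integer is what turns n ≥ 384.16 into "at least 385". With a rounded table value such as 1.64 for 90% confidence, hand calculations can differ by a few units from what this function returns.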

12. Let the moment generating function of a random variable X be given by
$$M_X(\lambda) = \frac{1}{8} e^{-4\lambda} + \frac{1}{6} e^{-2\lambda} + \frac{1}{6} e^{2\lambda} + \frac{1}{8} e^{4\lambda} + \frac{5}{12}$$
Find the distribution of X. [1 mark]

1.
X          -4     -2     0      2      4
P(X = x)   1/8    1/6    1/6    1/8    5/12

2.
X          -4     -2     0      2      4
P(X = x)   5/12   1/8    1/6    1/6    1/8

3.
X          -4     -2     0      2      4
P(X = x)   1/8    1/6    5/12   1/6    1/8

4.
X          -4     -2     0      2      4
P(X = x)   1/8    1/6    5/12   1/8    1/6

Solution:
The MGF of a discrete random variable X with the PMF $f_X(x) = P(X = x)$, $x \in T_X$, is given by
$$M_X(\lambda) = E[e^{\lambda X}] = \sum_{x \in T_X} P(X = x)\, e^{\lambda x}$$

Now, the MGF of the random variable X is given by
$$M_X(\lambda) = \frac{1}{8} e^{-4\lambda} + \frac{1}{6} e^{-2\lambda} + \frac{1}{6} e^{2\lambda} + \frac{1}{8} e^{4\lambda} + \frac{5}{12}$$

Therefore, the distribution of X is given by

X          -4     -2     0      2      4
P(X = x)   1/8    1/6    5/12   1/6    1/8

13. A fair die is rolled 3600 times. Use CLT to compute the probability that six appears at
most 630 times. Enter the answer correct to two decimal places.
(Hint: Use FZ (1.341) = 0.91)
Solution:
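Define a random variable X such that
$$X = \begin{cases} 1 & \text{if six appears on rolling a fair die} \\ 0 & \text{otherwise} \end{cases}$$

Therefore, $E[X] = \mu = \frac{1}{6}$ and $\mathrm{Var}(X) = \sigma^2 = \frac{1}{6} \cdot \frac{5}{6} = \frac{5}{36}$.

Let $X_1, X_2, \ldots, X_{3600}$ be the outcomes of rolling the fair die 3600 times.

Notice that $X_1 + X_2 + \ldots + X_{3600}$ denotes the number of times six appears in 3600 rolls.

Let $S = X_1 + X_2 + \ldots + X_{3600}$

To find: $P(S \le 630)$

By CLT, with n = 3600, we know that
$$\frac{S - 3600\mu}{\sigma\sqrt{n}} \sim \text{Normal}(0, 1) \Rightarrow \frac{S - 600}{10\sqrt{5}} \sim \text{Normal}(0, 1)$$
Now,
$$\begin{aligned}
P(S \le 630) &= P(S - 600 \le 30) \\
&= P\left(\frac{S - 600}{10\sqrt{5}} \le \frac{30}{10\sqrt{5}}\right) \\
&= P(Z \le 1.34) \\
&= 0.91
\end{aligned}$$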

14. A fair die is rolled 1000 times. Let X denote the number of times six is obtained. Find a bound for the probability that $\frac{X}{1000}$ differs from $\frac{1}{6}$ by more than 0.2 using the weak law of large numbers.
1. at least $\frac{5}{1440}$
2. at least $\frac{1436}{1440}$
3. at most $\frac{5}{1440}$
4. at most $\frac{1436}{1440}$
Solution:
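X denotes the number of times six is obtained on rolling the die 1000 times.
Let $X_1, X_2, \ldots, X_{1000}$ be 1000 i.i.d. samples such that
$$X_i = \begin{cases} 1 & \text{if six appears on rolling a fair die} \\ 0 & \text{otherwise} \end{cases}$$
$E[X_i] = \mu = \frac{1}{6}$ and $\mathrm{Var}(X_i) = \sigma^2 = \frac{5}{36}$.
Notice that $X = X_1 + X_2 + X_3 + \ldots + X_{1000}$, so $\frac{X}{1000}$ is the sample mean $\bar{X}$.
To find: Bound on $P\left(\left|\frac{X}{1000} - \frac{1}{6}\right| > 0.2\right)$.

By the weak law of large numbers, we have
$$P(|\bar{X} - \mu| > \delta) \le \frac{\sigma^2}{n\delta^2}$$
$$\Rightarrow P\left(\left|\frac{X}{1000} - \frac{1}{6}\right| > 0.2\right) \le \frac{5}{36 \times 1000 \times 0.04}$$
$$\Rightarrow P\left(\left|\frac{X}{1000} - \frac{1}{6}\right| > 0.2\right) \le \frac{5}{1440}$$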

15. Consider the following PDF curves and match them with the correct distribution. [1 mark]

[Figure: four PDF curves, labelled Graph 1, Graph 2, Graph 3 and Graph 4]

(a) Graph 1 → Gamma, Graph 2 → Normal, Graph 3 → Gamma, Graph 4 → Beta.
(b) Graph 1 → Beta, Graph 2 → Gamma, Graph 3 → Normal, Graph 4 → Gamma.
(c) Graph 1 → Beta, Graph 2 → Normal, Graph 3 → Normal, Graph 4 → Gamma.
(d) Graph 1 → Gamma, Graph 2 → Normal, Graph 3 → Normal, Graph 4 → Beta.

Solution:
Graph 1: The range of the distribution is [0, 1] and the shape of the graph resembles the Beta distribution.

Graph 2: The PDF curve is not symmetric about the mean and the shape of the graph resembles the Gamma distribution.

Graph 3: The PDF curve is symmetric about the mean and the shape of the graph resembles the Normal distribution.

Graph 4: The PDF curve is not symmetric about the mean and the shape of the graph resembles the Gamma distribution.
Therefore, Graph 1 → Beta, Graph 2 → Gamma, Graph 3 → Normal, Graph 4 → Gamma.

16. Let $X_1, X_2$ and $X_3 \sim$ i.i.d. X, where X has the following probability mass function:

x        -1     2
f_X(x)   2/3    1/3

Table 7.1.P: PMF of X

Find the distribution of $Y = X_1 + X_2 + X_3$. [1 mark]

(a)
Y          -3     0      3      6
P(Y = y)   1/6    1/6    1/3    1/3

(b)
Y          -3     0      3      6
P(Y = y)   8/27   4/9    2/9    1/27

(c)
Y          -3     0      3      6
P(Y = y)   8/27   1/27   4/9    2/9

(d)
Y          -3     0      3      6
P(Y = y)   2/9    8/27   1/27   4/9

Solution:
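The PMF of X is given by

x        -1     2
f_X(x)   2/3    1/3

Given that $Y = X_1 + X_2 + X_3$, where $X_1, X_2$ and $X_3 \sim$ i.i.d. X.

To find: Distribution of Y.

We will find the distribution of Y by finding the MGF of Y.
$$\begin{aligned}
M_Y(\lambda) &= E[e^{\lambda Y}] \\
&= E[e^{\lambda(X_1 + X_2 + X_3)}] \\
&= E[e^{\lambda X_1} e^{\lambda X_2} e^{\lambda X_3}] \\
&= E[e^{\lambda X_1}]\,E[e^{\lambda X_2}]\,E[e^{\lambda X_3}] \quad \text{(since } X_1, X_2 \text{ and } X_3 \text{ are independent)} \\
&= E[e^{\lambda X}]\,E[e^{\lambda X}]\,E[e^{\lambda X}] \quad \text{(since } X_1, X_2 \text{ and } X_3 \sim \text{i.i.d. } X\text{)} \\
&= [M_X(\lambda)]^3 \quad ...(1)
\end{aligned}$$
Now,
$$\begin{aligned}
M_X(\lambda) &= E[e^{\lambda X}] \\
&= e^{-\lambda}\,P(X = -1) + e^{2\lambda}\,P(X = 2) \\
&= \frac{2e^{-\lambda}}{3} + \frac{e^{2\lambda}}{3} \quad ...(2)
\end{aligned}$$
From equations (1) and (2), we have
$$\begin{aligned}
M_Y(\lambda) &= \left(\frac{2e^{-\lambda}}{3} + \frac{e^{2\lambda}}{3}\right)^3 \\
&= \frac{1}{27}\left(2e^{-\lambda} + e^{2\lambda}\right)^3 \\
&= \frac{1}{27}\left(8e^{-3\lambda} + e^{6\lambda} + 12e^{-2\lambda}e^{2\lambda} + 6e^{-\lambda}e^{4\lambda}\right) \quad \text{(since } (a + b)^3 = a^3 + b^3 + 3a^2b + 3ab^2\text{)} \\
&= \frac{8}{27}e^{-3\lambda} + \frac{1}{27}e^{6\lambda} + \frac{4}{9} + \frac{2}{9}e^{3\lambda}
\end{aligned}$$
Therefore, the distribution of Y is given by

Y          -3     0      3      6
P(Y = y)   8/27   4/9    2/9    1/27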
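Cubing the MGF is the same operation as convolving the PMF with itself three times: multiplying two exponential terms adds their exponents (the values) and multiplies their coefficients (the probabilities). The sketch below illustrates this with exact fractions; the function name `pmf_of_sum` is an assumption made for this example.

```python
from collections import defaultdict
from fractions import Fraction


def pmf_of_sum(pmf, copies):
    """PMF of the sum of `copies` independent draws from `pmf` (dict: value -> probability)."""
    total = {0: Fraction(1)}                   # sum of zero variables is 0 with probability 1
    for _ in range(copies):
        nxt = defaultdict(Fraction)
        for s, ps in total.items():
            for x, px in pmf.items():
                nxt[s + x] += ps * px          # exponents add, coefficients multiply
        total = dict(nxt)
    return total


pmf_x = {-1: Fraction(2, 3), 2: Fraction(1, 3)}
pmf_y = pmf_of_sum(pmf_x, 3)
for y in sorted(pmf_y):
    print(y, pmf_y[y])
```

The result can be compared entry by entry with the table obtained from the MGF expansion.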

Statistics for Data Science - 2

Week 8 Graded Assignment Solution

1. Let $X_1, X_2, \ldots, X_n$ be i.i.d. samples from a distribution X with mean µ and standard deviation σ. Let $\hat{\mu} = 6\left(\frac{X_1 + X_2 + \ldots + X_n}{n}\right)$ be an estimator of µ.
i) Is the estimator unbiased?

a) Yes
b) No

Solution:
  
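$$\begin{aligned}
E[\hat{\mu}] &= E\left[6\left(\frac{X_1 + X_2 + \ldots + X_n}{n}\right)\right] \\
&= \frac{6}{n}(n\mu) \\
&= 6\mu
\end{aligned}$$

And
$\text{Bias}(\hat{\mu}, \mu) = E[\hat{\mu}] - \mu = 6\mu - \mu = 5\mu$
Since $\text{Bias}(\hat{\mu}, \mu) \ne 0$ (for µ ≠ 0), the estimator is not unbiased.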
ii) Find the risk of µ̂.
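(a) $\frac{36\sigma^2}{n} + 25\mu^2$
(b) $\frac{36\sigma^2}{n} + 5\mu$
(c) $\frac{6\sigma^2}{n} + 25\mu^2$
(d) $\frac{6\sigma^2}{n} + 5\mu$
Solution:
$$\begin{aligned}
\mathrm{Var}(\hat{\mu}) &= \mathrm{Var}\left[6\left(\frac{X_1 + X_2 + \ldots + X_n}{n}\right)\right] \\
&= \frac{36}{n^2}(n\sigma^2) \\
&= \frac{36\sigma^2}{n}
\end{aligned}$$

$$\begin{aligned}
\text{Risk}(\hat{\mu}) &= \text{Bias}(\hat{\mu}, \mu)^2 + \mathrm{Var}(\hat{\mu}) \\
&= (5\mu)^2 + \frac{36\sigma^2}{n} \\
&= 25\mu^2 + \frac{36\sigma^2}{n}
\end{aligned}$$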
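2. Consider a sample of i.i.d. random variables $X_1, X_2, \ldots, X_n$, where n > 20, $E[X_i] = \mu$, $\mathrm{Var}(X_i) = \sigma^2$ and the estimator of µ is $\hat{\mu}_n = \frac{1}{n - 20}\sum_{i=21}^{n} X_i$. Find the MSE of $\hat{\mu}_n$.
a) $\frac{\sigma}{n - 20}$
b) $\frac{\sigma^2}{n - 20}$
c) $\frac{\sigma^2}{n - 21}$
d) $\frac{\sigma}{n}$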

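Solution:
$$\begin{aligned}
E[\hat{\mu}_n] &= E\left[\frac{1}{n - 20}\sum_{i=21}^{n} X_i\right] \\
&= \frac{(n - 20)\mu}{n - 20} = \mu
\end{aligned}$$

This implies that
$\text{Bias}(\hat{\mu}_n, \mu) = E[\hat{\mu}_n] - \mu = \mu - \mu = 0$

$$\begin{aligned}
\mathrm{Var}(\hat{\mu}_n) &= \mathrm{Var}\left[\frac{1}{n - 20}\sum_{i=21}^{n} X_i\right] \\
&= \frac{1}{(n - 20)^2}\sum_{i=21}^{n} \mathrm{Var}(X_i) \\
&= \frac{1}{(n - 20)^2}\left[(n - 20)\sigma^2\right] \\
&= \frac{\sigma^2}{n - 20}
\end{aligned}$$

$$\begin{aligned}
\text{MSE}(\hat{\mu}_n) = \text{Risk}(\hat{\mu}_n) &= \text{Bias}(\hat{\mu}_n, \mu)^2 + \mathrm{Var}(\hat{\mu}_n) \\
&= 0 + \frac{\sigma^2}{n - 20} \\
&= \frac{\sigma^2}{n - 20}
\end{aligned}$$
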
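3. Let $X_1, X_2, \ldots, X_n \sim$ i.i.d. X, where X is a random variable with density function
$$f_X(x) = \begin{cases} \dfrac{\theta}{x^{\theta + 1}}, & x > 1, \\ 0, & \text{otherwise.} \end{cases}$$
The mean of the random variable X is $\frac{\theta}{\theta - 1}$. Find an estimator of θ using the method of moments.
(a) $\dfrac{X_1 + X_2 + \ldots + X_n}{X_1 + X_2 + \ldots + X_n - 1}$
(b) $\dfrac{X_1 + X_2 + \ldots + X_n}{1 - (X_1 + X_2 + \ldots + X_n)}$
(c) $\dfrac{X_1 + X_2 + \ldots + X_n}{X_1 + X_2 + \ldots + X_n - n}$
(d) $\dfrac{X_1 + X_2 + \ldots + X_n}{n - (X_1 + X_2 + \ldots + X_n)}$
Solution: The mean of the random variable X is $\frac{\theta}{\theta - 1}$.
So,
$$\begin{aligned}
M_1 &= \frac{\theta}{\theta - 1} \\
\Rightarrow M_1\theta - M_1 &= \theta \\
\Rightarrow \theta &= \frac{M_1}{M_1 - 1} \\
\Rightarrow \theta &= \frac{\frac{X_1 + X_2 + \ldots + X_n}{n}}{\frac{X_1 + X_2 + \ldots + X_n}{n} - 1} \\
\Rightarrow \theta &= \frac{X_1 + X_2 + \ldots + X_n}{X_1 + X_2 + \ldots + X_n - n}
\end{aligned}$$
Therefore the estimator of θ is $\dfrac{X_1 + X_2 + \ldots + X_n}{X_1 + X_2 + \ldots + X_n - n}$.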
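4. Let $X_1, X_2, X_3 \sim$ i.i.d. Binomial(4, θ). Given a random sample (1, 4, 2), find the maximum likelihood estimate of θ.
a) $\frac{2}{3}$
b) $\frac{7}{12}$
c) $\frac{1}{3}$
d) $\frac{5}{12}$

Solution: $X_i \sim \text{Binomial}(4, \theta)$
$\Rightarrow f_{X_i}(x) = {}^4C_x\, \theta^x (1 - \theta)^{4 - x}$

The likelihood function is given by
$$L(x_1, x_2, x_3) = \prod_{i=1}^{3} f_{X_i}(x_i)$$
$$\Rightarrow L(x_1, x_2, x_3) = {}^4C_{x_1}\theta^{x_1}(1 - \theta)^{4 - x_1} \times {}^4C_{x_2}\theta^{x_2}(1 - \theta)^{4 - x_2} \times {}^4C_{x_3}\theta^{x_3}(1 - \theta)^{4 - x_3}$$

$$\begin{aligned}
L(1, 4, 2) &= {}^4C_1\, {}^4C_4\, {}^4C_2\, \theta^{(1 + 4 + 2)} (1 - \theta)^{12 - (1 + 4 + 2)} \\
&= 24\,\theta^7 (1 - \theta)^5
\end{aligned}$$
$$\Rightarrow \log(L(1, 4, 2)) = \log(24) + 7\log(\theta) + 5\log(1 - \theta)$$

Therefore, the ML estimator for θ is given by
$$\hat{\theta} = \arg\max_{\theta}\,[\log(24) + 7\log(\theta) + 5\log(1 - \theta)]$$

Let $Y = \log(24) + 7\log(\theta) + 5\log(1 - \theta)$
$$\Rightarrow \frac{dY}{d\theta} = \frac{7}{\theta} - \frac{5}{1 - \theta}$$
Now we equate this derivative to zero and solve for θ:
$$\frac{7}{\theta} - \frac{5}{1 - \theta} = 0 \Rightarrow \theta = \frac{7}{12}$$
$$\Rightarrow \hat{\theta}_{ML} = \frac{7}{12}$$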
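The derivative condition above generalises: for i.i.d. Binomial(m, θ) data the log-likelihood is maximised at total successes divided by total trials. The following sketch checks this for the given sample with exact fractions; the names `binomial_mle` and `score` are illustrative assumptions.

```python
from fractions import Fraction


def binomial_mle(trials, sample):
    """ML estimate of theta for i.i.d. Binomial(trials, theta) observations."""
    return Fraction(sum(sample), trials * len(sample))


def score(theta, trials, sample):
    """Derivative of the log-likelihood with respect to theta."""
    successes = sum(sample)
    failures = trials * len(sample) - successes
    return successes / theta - failures / (1 - theta)


theta_hat = binomial_mle(4, [1, 4, 2])
print(theta_hat, score(theta_hat, 4, [1, 4, 2]))
```

The score evaluates to zero at the estimate, is positive to its left and negative to its right, which confirms a maximum rather than a minimum.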

5. Let $X_1, X_2, \ldots, X_n \sim$ i.i.d. X, where X is a random variable with density function
$$f_X(x) = \begin{cases} e^{-(x - \theta)}, & x > \theta, \\ 0, & \text{otherwise.} \end{cases}$$

i) The mean of the distribution is θ + 1. Find the estimator of θ using the method of moments. [1 mark]
(a) $\dfrac{X_1 + X_2 + \ldots + X_n}{n}$
(b) $\dfrac{X_1 + X_2 + \ldots + X_n - n}{n}$
(c) $\dfrac{n}{X_1 + X_2 + \ldots + X_n - n}$
(d) $\dfrac{1}{n - (X_1 + X_2 + \ldots + X_n)}$

Solution: The mean of the random variable X is θ + 1.
So,
$$\begin{aligned}
M_1 &= \theta + 1 \\
\Rightarrow \theta &= M_1 - 1 \\
\Rightarrow \theta &= \frac{X_1 + X_2 + \ldots + X_n}{n} - 1 \\
\Rightarrow \theta &= \frac{X_1 + X_2 + \ldots + X_n - n}{n}
\end{aligned}$$
Therefore the estimator of θ is $\dfrac{X_1 + X_2 + \ldots + X_n - n}{n}$.
ii) Is the method of moments estimator unbiased?

a) Yes
b) No

Solution:
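The estimator of θ is
$$\hat{\theta} = \frac{X_1 + X_2 + \ldots + X_n - n}{n}$$
$$\begin{aligned}
E[\hat{\theta}] &= E\left[\frac{X_1 + X_2 + \ldots + X_n - n}{n}\right] \\
&= \frac{1}{n}\left(E[X_1] + E[X_2] + \ldots + E[X_n] - n\right) \\
&= \frac{1}{n}(n\theta + n - n) \\
&= \theta
\end{aligned}$$

And
$\text{Bias}(\hat{\theta}, \theta) = E[\hat{\theta}] - \theta = \theta - \theta = 0$
Since $\text{Bias}(\hat{\theta}, \theta) = 0$, the estimator is unbiased.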

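6. Suppose it is known that a sample consisting of the values 10, 12, 15, 16.5, 18, 19, 20 and 21.5 comes from a population with the density function
$$f(x) = \begin{cases} \dfrac{1}{\theta} e^{-x/\theta}, & x > 0, \\ 0, & \text{otherwise.} \end{cases}$$

Find the maximum likelihood estimate of θ. Enter your answer correct to one decimal.

Solution:
$$\begin{aligned}
L(x_1, x_2, \ldots, x_n) &= \prod_{i=1}^{n} f_X(x_i) \\
&= \prod_{i=1}^{n} \frac{1}{\theta} e^{-x_i/\theta} \\
&= \frac{1}{\theta^n}\left(e^{-x_1/\theta}\, e^{-x_2/\theta} \cdots e^{-x_n/\theta}\right) \\
&= \frac{1}{\theta^n}\, e^{-(x_1 + x_2 + \ldots + x_n)/\theta}
\end{aligned}$$
$$\Rightarrow \log(L(x_1, x_2, \ldots, x_n)) = -n\log(\theta) - \frac{x_1 + x_2 + \ldots + x_n}{\theta}$$
Therefore, the ML estimator for θ is given by
$$\hat{\theta} = \arg\max_{\theta}\left[-n\log(\theta) - \frac{x_1 + x_2 + \ldots + x_n}{\theta}\right]$$
Let $Y = -n\log(\theta) - \dfrac{x_1 + x_2 + \ldots + x_n}{\theta}$
$$\Rightarrow \frac{dY}{d\theta} = -\frac{n}{\theta} + \frac{x_1 + x_2 + \ldots + x_n}{\theta^2}$$
Now we equate this derivative to zero and solve for θ.
$$\begin{aligned}
& -\frac{n}{\theta} + \frac{x_1 + x_2 + \ldots + x_n}{\theta^2} = 0 \\
\Rightarrow\ & \theta = \frac{x_1 + x_2 + \ldots + x_n}{n} \\
\Rightarrow\ & \hat{\theta} = \frac{x_1 + x_2 + \ldots + x_n}{n}
\end{aligned}$$
Therefore, the maximum likelihood estimate of θ for the given sample will be
$$\begin{aligned}
\hat{\theta} &= \frac{10 + 12 + 15 + 16.5 + 18 + 19 + 20 + 21.5}{8} \\
&= \frac{132}{8} \\
&= 16.5
\end{aligned}$$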
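A quick way to gain confidence in the closed-form estimate is to evaluate the log-likelihood at the sample mean and at nearby values of θ. The sketch below does this for the given data; the function name `log_likelihood` is an illustrative assumption.

```python
from math import log
from statistics import fmean

sample = [10, 12, 15, 16.5, 18, 19, 20, 21.5]


def log_likelihood(theta, xs):
    """Log-likelihood of the density (1/theta) * exp(-x/theta) for the observations xs."""
    return -len(xs) * log(theta) - sum(xs) / theta


theta_hat = fmean(sample)   # closed-form ML estimate: the sample mean
print(theta_hat, log_likelihood(theta_hat, sample))
```

Values of θ slightly below or above the sample mean give a smaller log-likelihood, in line with the derivative calculation.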

7. Let X be a discrete random variable with the following probability mass function

x        1           2       3           4
f_X(x)   (1 − p)/2   p/2     (1 − p)/2   p/2

Table 8.1.G: PMF of X

Suppose a sample consisting of the values 2, 2, 4, 3, 1, 3, 1 and 2 is taken from the random variable X. Find the estimate of p using the method of moments. Enter your answer correct to two decimal places.
Solution:
$$\begin{aligned}
E[X] &= 1 \times \frac{1 - p}{2} + 2 \times \frac{p}{2} + 3 \times \frac{1 - p}{2} + 4 \times \frac{p}{2} \\
&= \frac{(1 - p) + 2p + 3(1 - p) + 4p}{2} \\
&= p + 2
\end{aligned}$$

Now
$M_1 = E[X] = p + 2$
$\Rightarrow p = M_1 - 2$
Therefore, the estimator of p will be
$$\frac{X_1 + X_2 + \ldots + X_n}{n} - 2.$$
So, the estimate of p for the given sample will be
$$\begin{aligned}
\hat{p} &= \frac{2 + 2 + 4 + 3 + 1 + 3 + 1 + 2}{8} - 2 \\
&= \frac{18}{8} - 2 \\
&= 0.25
\end{aligned}$$

Use the following values of the CDF of the standard normal distribution to answer the questions:

$F_Z(1.64) = 0.95$, $F_Z(1.96) = 0.975$, i.e. $P(|Z| \le 1.64) = 0.90$ and $P(|Z| \le 1.96) = 0.95$

8. The weights (in grams) of mangoes grown in a certain area are normally distributed with
mean µ and standard deviation 40. The weights from a random sample of mangoes are
as follows:
220, 210, 240, 260, 235, 225, 270, 300, 200.
Find a 95% confidence interval for the mean weight of mangoes.

a) [203.87, 256.13]
b) [213.87, 266.13]
c) [230, 280]
d) [215.13, 235.87]

Solution:
n = 9, $\hat{\mu}$ = 240 and σ = 40.
β = 0.95; using the CDF of Normal(0, 1),
$$\frac{\alpha}{\sigma/\sqrt{n}} = 1.96$$
$$\alpha = 1.96 \times \frac{40}{\sqrt{9}} = 26.13$$
$$P(|\hat{\mu} - \mu| < 26.13) = 0.95$$

So, the 95% confidence interval is [240 − 26.13, 240 + 26.13], i.e. [213.87, 266.13].
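For a known population standard deviation, the interval above is the sample mean plus or minus $z\,\sigma/\sqrt{n}$. The sketch below reproduces it for the mango data; the helper name `z_interval` is an assumption, and the quantile is taken from `statistics.NormalDist` instead of a table.

```python
from math import sqrt
from statistics import NormalDist, fmean


def z_interval(sample, sigma, confidence):
    """Confidence interval for the mean when the population SD `sigma` is known."""
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    half_width = z * sigma / sqrt(len(sample))
    centre = fmean(sample)
    return centre - half_width, centre + half_width


weights = [220, 210, 240, 260, 235, 225, 270, 300, 200]
low, high = z_interval(weights, 40, 0.95)
print(round(low, 2), round(high, 2))
```

Raising the confidence level widens the interval, while a larger sample narrows it at the rate $1/\sqrt{n}$.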

9. From past experience it is known that the weights of seer fish grown at a commercial
hatchery are normal with a mean that varies from season to season but with a standard
deviation that remains fixed at 0.2 kilogram. If we want to be 90% certain that our
estimate of the present season’s mean weight of a seer fish is correct to within 0.01
kilograms, how large a sample is needed?
Solution:
Let X denote the weights of seer fish.
Given that σ = 0.2
To find the value of n such that P (|µ̂ − µ| ≤ 0.01) = 0.90

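$$\begin{aligned}
& P(|\hat{\mu} - \mu| \le 0.01) = 0.90 \\
\Rightarrow\ & P\left(\left|\frac{\hat{\mu} - \mu}{\sigma/\sqrt{n}}\right| \le \frac{0.01}{\sigma/\sqrt{n}}\right) = 0.90 \\
\Rightarrow\ & P\left(|Z| \le \frac{0.01}{\sigma/\sqrt{n}}\right) = 0.90
\end{aligned}$$

$$\begin{aligned}
& \frac{0.01}{\sigma/\sqrt{n}} = 1.64 \\
\Rightarrow\ & \sqrt{n} = 0.2 \times \frac{1.64}{0.01} \\
\Rightarrow\ & n = 1075.84
\end{aligned}$$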

Therefore the sample size should be 1076.

10. The distribution of heights of a certain population of women is normally distributed with µ unknown and σ unknown. We observe a random sample (in centimeters):
160, 155, 168, 167, 162, 150, 152, 148, 164.
Find a 95% confidence interval for µ. Use $P(-2.30 < T_8 < 2.30) = 0.95$, where $T_8$ follows the t-distribution with 8 degrees of freedom.

a) [152.73, 164.15]
b) [156.67, 160.2]
c) [160.28, 167.72]
d) [150.34, 165.66]

Solution:
n = 9, $\hat{\mu}$ = 158.44 and $S^2$ = 55.52 ⇒ S = 7.45
Using the t-distribution, $\dfrac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$
$$\frac{\alpha}{S/\sqrt{n}} = 2.30$$
$$\alpha = 2.30 \times \frac{7.45}{\sqrt{9}} = 5.71$$
$$P(|\hat{\mu} - \mu| < 5.71) = 0.95$$

So, the 95% confidence interval is [158.44 − 5.71, 158.44 + 5.71], i.e. [152.73, 164.15].
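When σ is unknown, the interval uses the sample standard deviation and a t critical value. Python's standard library has no t-distribution, so in this sketch the critical value is passed in as given by the question (2.30 for 8 degrees of freedom); the helper name `t_interval` is an assumption.

```python
from math import sqrt
from statistics import mean, stdev


def t_interval(sample, t_crit):
    """Confidence interval for the mean using the sample SD and a supplied t critical value."""
    centre = mean(sample)
    half_width = t_crit * stdev(sample) / sqrt(len(sample))   # stdev uses the n - 1 divisor
    return centre - half_width, centre + half_width


heights = [160, 155, 168, 167, 162, 150, 152, 148, 164]
low, high = t_interval(heights, 2.30)
print(low, high)
```

Small differences in the last decimal place relative to the worked solution come from rounding the mean and S before combining them.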

Statistics for Data Science - 2
Week 8 Practice assignment

1. Let $X_1, \ldots, X_n$ be n i.i.d. samples from a random variable X with mean µ and variance σ². Let $\bar{X}^2$ be an estimator of µ², where $\bar{X}$ (the sample mean) is an unbiased estimator of µ. Is the estimator $\bar{X}^2$ always unbiased?

(a) Yes
(b) No

Solution:
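$$\bar{X} = \frac{X_1 + \ldots + X_n}{n}$$
Given that $\bar{X}$ is an unbiased estimator of µ and $\bar{X}^2$ is an estimator of µ².
$\Rightarrow E[\bar{X}] = \mu$
Now,
$$\begin{aligned}
E[\bar{X}^2] &= \mathrm{Var}(\bar{X}) + (E[\bar{X}])^2 \\
&= \frac{\sigma^2}{n} + \mu^2 \\
&\ne \mu^2 \quad \text{(whenever } \sigma^2 > 0\text{)}
\end{aligned}$$

Therefore, the estimator $\bar{X}^2$ is not an unbiased estimator of µ².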

2. Let $X_1, X_2, \ldots, X_n$ be n i.i.d. samples from a distribution with PDF
$$f_X(x) = \frac{1 + \theta x}{2}, \quad -1 < x < 1$$

Let $\hat{\theta} = 3\bar{X}$ be an estimator of θ. Find the mean squared error of $\hat{\theta}$.

(a) $\dfrac{3 - \theta^2}{n}$
(b) $\dfrac{3 + \theta^2}{n}$
(c) $\dfrac{3 + \theta}{n}$
(d) $\dfrac{3 - \theta}{n}$

Solution:
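Given that $\hat{\theta} = 3\bar{X}$ is an estimator of θ.
The expectation of X is given by
$$\begin{aligned}
E[X] &= \int_{-1}^{1} x f_X(x)\,dx \\
&= \int_{-1}^{1} x\left(\frac{1 + \theta x}{2}\right)dx \\
&= \frac{1}{2}\int_{-1}^{1} (x + \theta x^2)\,dx \\
&= \left[\frac{x^2}{4} + \frac{\theta x^3}{6}\right]_{-1}^{1} = \frac{\theta}{3}
\end{aligned}$$

$$\begin{aligned}
\text{Bias}(\hat{\theta}, \theta) &= E[\hat{\theta} - \theta] \\
&= E\left[3\left(\frac{X_1 + \ldots + X_n}{n}\right)\right] - \theta \\
&= 3\left(\frac{n\theta}{3n}\right) - \theta = 0
\end{aligned}$$

Therefore, the estimator $\hat{\theta}$ is unbiased.

$$\begin{aligned}
E[X^2] &= \int_{-1}^{1} x^2 f_X(x)\,dx \\
&= \int_{-1}^{1} x^2\left(\frac{1 + \theta x}{2}\right)dx \\
&= \frac{1}{2}\int_{-1}^{1} (x^2 + \theta x^3)\,dx \\
&= \left[\frac{x^3}{6} + \frac{\theta x^4}{8}\right]_{-1}^{1} = \frac{1}{3}
\end{aligned}$$

Therefore, $\mathrm{Var}[X] = E[X^2] - (E[X])^2 = \dfrac{1}{3} - \dfrac{\theta^2}{9}$
$$\begin{aligned}
\mathrm{Var}(\hat{\theta}) &= \mathrm{Var}\left[3\left(\frac{X_1 + \ldots + X_n}{n}\right)\right] \\
&= \frac{9}{n^2}\left(n\,\mathrm{Var}[X]\right) \\
&= \frac{9}{n^2}\, n\left(\frac{1}{3} - \frac{\theta^2}{9}\right) \\
&= \frac{3 - \theta^2}{n}
\end{aligned}$$
$\text{MSE}(\hat{\theta}) = \text{Bias}(\hat{\theta})^2 + \mathrm{Var}[\hat{\theta}] = \dfrac{3 - \theta^2}{n}$.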

3. Consider 100 samples $X_1, X_2, \ldots, X_{100}$ from a random variable X whose distribution has mean µ and variance σ². Let $\sum_{i=1}^{100} X_i = 150$ and $\sum_{i=1}^{100} X_i^2 = 1999$. Find an unbiased estimate for Var(X).

(a) 17.74
(b) 17.91
(c) 1.5
(d) 2.25

Solution:
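Given that the distribution of X has mean µ and variance σ².
Also, $\sum_{i=1}^{100} X_i = 150$ and $\sum_{i=1}^{100} X_i^2 = 1999$.
We know that $S^2 = \dfrac{1}{n - 1}\sum_{i=1}^{n} (X_i - \bar{X})^2$ is an unbiased estimator of Var[X].

Therefore,
$$\begin{aligned}
S^2 &= \frac{1}{n - 1}\sum_{i=1}^{n} (X_i - \bar{X})^2 \\
&= \frac{1}{n - 1}\sum_{i=1}^{n} (X_i^2 + \bar{X}^2 - 2X_i\bar{X}) \\
&= \frac{1}{n - 1}\left(\sum_{i=1}^{n} X_i^2 + n\bar{X}^2 - 2\bar{X}\sum_{i=1}^{n} X_i\right) \\
&= \frac{1}{n - 1}\left(\sum_{i=1}^{n} X_i^2 + n\bar{X}^2 - 2n\bar{X}^2\right) \\
&= \frac{1}{n - 1}\left(\sum_{i=1}^{n} X_i^2 - n\bar{X}^2\right) \\
&= \frac{1}{n - 1}\left(\sum_{i=1}^{n} X_i^2 - n\left(\frac{\sum_{i=1}^{n} X_i}{n}\right)^2\right) = \frac{1}{n - 1}\left(\sum_{i=1}^{n} X_i^2 - \frac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}\right)
\end{aligned}$$
Therefore, $S^2 = \dfrac{1}{100 - 1}\left(1999 - \dfrac{150^2}{100}\right) = 17.91$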
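The final formula needs only n, the sum of the observations and the sum of their squares, so it is easy to evaluate exactly. A minimal sketch, with the hypothetical helper name `sample_variance_from_sums`:

```python
from fractions import Fraction


def sample_variance_from_sums(n, sum_x, sum_x2):
    """Unbiased sample variance S^2 = (sum of x_i^2 - (sum of x_i)^2 / n) / (n - 1)."""
    return (Fraction(sum_x2) - Fraction(sum_x) ** 2 / n) / (n - 1)


s2 = sample_variance_from_sums(100, 150, 1999)
print(s2, float(s2))
```

Working with `Fraction` keeps the result exact until the final conversion to a decimal, which avoids rounding in the intermediate step $150^2/100$.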
4. Let $X_1, X_2, \ldots, X_n \sim$ i.i.d. X. Let $a_1, \ldots, a_n \ge 0$ be such that $\sum_{i=1}^{n} a_i = 1$. Define the estimator for the mean as $\bar{X} = \sum_{i=1}^{n} a_i X_i$. Define the estimator for the variance as $S^2 = \sum_{i=1}^{n} a_i (X_i - \bar{X})^2$, with $E[X] = \mu$ and $\mathrm{Var}(X) = \sigma^2$. Choose the correct option(s) from the following:

(a) $\bar{X}$ is an unbiased estimator.
(b) $E[S^2] = \left(\dfrac{n - 1}{n}\right)\sigma^2$
(c) $E[S^2] = \left(1 - \sum_{i=1}^{n} a_i^2\right)\sigma^2$
(d) $E[S^2] = \sum_{i=1}^{n} a_i^2\, \sigma^2$
(e) $S^2$ is an unbiased estimator for Var(X).

Solution:

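Given $X_1, X_2, \ldots, X_n \sim$ i.i.d. X, $E[X] = \mu$, $\mathrm{Var}[X] = \sigma^2$.
$\bar{X} = \sum_{i=1}^{n} a_i X_i$ is an estimator of µ, where $\sum_{i=1}^{n} a_i = 1$.

(a) $E[\bar{X}] = E[a_1X_1 + \cdots + a_nX_n] = \sum_{i=1}^{n} a_i E[X] = \mu$ (since $\sum_{i=1}^{n} a_i = 1$)
$\text{Bias}(\bar{X}) = E[\bar{X}] - E[X] = \mu - \mu = 0$
Therefore, $\bar{X}$ is an unbiased estimator of µ.
(b) $\mathrm{Var}[\bar{X}] = \mathrm{Var}[a_1X_1 + \cdots + a_nX_n] = \sum_{i=1}^{n} a_i^2\, \mathrm{Var}[X] = \sigma^2 \sum_{i=1}^{n} a_i^2$

$$E[\bar{X}] = \mu \quad (1)$$
$$\mathrm{Var}[\bar{X}] = \sigma^2 \sum_{i=1}^{n} a_i^2 \quad (2)$$

$$\begin{aligned}
S^2 &= \sum_{i=1}^{n} a_i (X_i - \bar{X})^2 \\
&= \sum_{i=1}^{n} (a_iX_i^2 + a_i\bar{X}^2 - 2a_iX_i\bar{X}) \\
&= \sum_{i=1}^{n} a_iX_i^2 + \sum_{i=1}^{n} a_i\bar{X}^2 - \sum_{i=1}^{n} 2a_i\bar{X}X_i \\
&= \sum_{i=1}^{n} a_iX_i^2 + \bar{X}^2 - 2\bar{X}^2 = \sum_{i=1}^{n} a_iX_i^2 - \bar{X}^2
\end{aligned}$$

Now,
$$\begin{aligned}
E[S^2] &= E\left(\sum_{i=1}^{n} a_iX_i^2 - \bar{X}^2\right) = \sum_{i=1}^{n} E[a_iX_i^2] - E[\bar{X}^2] \\
&= \sum_{i=1}^{n} a_i E[X_i^2] - E[\bar{X}^2] \\
&= \sum_{i=1}^{n} a_i(\sigma^2 + \mu^2) - (\mathrm{Var}[\bar{X}] + \mu^2) \\
&= \sigma^2 + \mu^2 - \sigma^2\sum_{i=1}^{n} a_i^2 - \mu^2 \quad \text{[from (2)]} \\
&= \sigma^2 - \sigma^2\sum_{i=1}^{n} a_i^2 \\
&= \left(1 - \sum_{i=1}^{n} a_i^2\right)\sigma^2
\end{aligned}$$

Therefore, (b) is not true in general.

(c) Since $E[S^2] = \left(1 - \sum_{i=1}^{n} a_i^2\right)\sigma^2$, (c) is true.
(d) (d) is not the correct option.
(e) $\text{Bias}(S^2) = E[S^2] - \sigma^2 = -\sigma^2\sum_{i=1}^{n} a_i^2 \ne 0$.
Therefore, $S^2$ is not an unbiased estimator of Var[X].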
5. Let $X_1, \ldots, X_n \sim$ i.i.d. Uniform(−a, a). Find the ML estimator of a.
(a) $\hat{a}_{ML} = \max(|X_1|, \ldots, |X_n|)$
(b) $\hat{a}_{ML} = \max(X_1, \ldots, X_n)$
(c) $\hat{a}_{ML} = \min(X_1, \ldots, X_n)$
(d) $\hat{a}_{ML} = \frac{1}{2}\min(X_1, \ldots, X_n)$
Solution:
$X_1, \cdots, X_n \sim \text{Uniform}(-a, a)$.
$f_{X_i}(x_i)$ is given by
$$f_{X_i}(x_i) = \begin{cases} \dfrac{1}{2a} & \text{for } -a < x_i < a \\ 0 & \text{otherwise} \end{cases}$$
The likelihood function of a is given by
$$L(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} f_X(x_i) = \left(\frac{1}{2a}\right)^n$$

In order to maximise the likelihood function, we need to minimise a.
Since $-a < x_i < a$ for all i, i.e. $|x_i| < a$, the smallest admissible value is $a = \max(|x_1|, \ldots, |x_n|)$.
Therefore, the ML estimator of a is $\max(|X_1|, \ldots, |X_n|)$.

6. Let $X_1, X_2, X_3 \sim$ i.i.d. Normal(µ, σ²). Given a random sample (−1, 0, 1), find the maximum likelihood estimate of σ².
a) $\frac{2}{3}$
b) $\frac{7}{12}$
c) $\frac{1}{3}$
d) $\frac{5}{12}$

Solution:
The ML estimator of σ² is $\dfrac{\sum_{i=1}^{n} (X_i - \hat{\mu}_{ML})^2}{n}$, where $\hat{\mu}_{ML} = \bar{X}$.
Given the sample −1, 0, 1, $\bar{X} = \dfrac{-1 + 0 + 1}{3} = 0$.
Therefore, the ML estimate of σ² is $\dfrac{(-1)^2 + 0^2 + 1^2}{3} = \dfrac{2}{3}$.
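7. Let $X_1, \ldots, X_n$ be n i.i.d. samples of a random variable X. Let X have the PDF $f(x) = (\alpha + 1)x^{\alpha}$, where 0 < x < 1.

(a) Find the ML estimator of α.
i. $\hat{\alpha}_{ML} = 1 + \dfrac{n}{\sum_{i=1}^{n} \log X_i}$
ii. $\hat{\alpha}_{ML} = -1 - \dfrac{n}{\sum_{i=1}^{n} \log X_i}$
iii. $\hat{\alpha}_{ML} = 1 - \dfrac{n}{\sum_{i=1}^{n} \log X_i}$
iv. $\hat{\alpha}_{ML} = -1 + \dfrac{n}{\sum_{i=1}^{n} \log X_i}$
Solution:
Given,
$$f(x) = (\alpha + 1)x^{\alpha}, \quad 0 < x < 1$$
The likelihood function of a sample $X_1, X_2, \ldots, X_n$ is given by
$$\begin{aligned}
L(x_1, x_2, \ldots, x_n) &= \prod_{i=1}^{n} f_X(x_i) \\
&= (\alpha + 1)^n\, x_1^{\alpha} \cdots x_n^{\alpha}
\end{aligned}$$
$$\Rightarrow \log(L) = n\log(\alpha + 1) + \alpha(\log(x_1) + \cdots + \log(x_n))$$

Therefore, the ML estimator for α is given by
$$\hat{\alpha} = \arg\max_{\alpha}\,[n\log(\alpha + 1) + \alpha(\log(x_1) + \cdots + \log(x_n))]$$

Let $Y = n\log(\alpha + 1) + \alpha(\log(x_1) + \cdots + \log(x_n))$

Now,
$$\begin{aligned}
\frac{dY}{d\alpha} &= \frac{d}{d\alpha}[n\log(\alpha + 1) + \alpha(\log(x_1) + \cdots + \log(x_n))] \\
&= \frac{n}{\alpha + 1} + \log(x_1) + \cdots + \log(x_n)
\end{aligned}$$

Now,
$$\begin{aligned}
& \frac{dY}{d\alpha} = 0 \\
\Rightarrow\ & \frac{n}{\alpha + 1} = -[\log(x_1) + \cdots + \log(x_n)] \\
\Rightarrow\ & \hat{\alpha}_{ML} = -1 - \frac{n}{\sum_{i=1}^{n} \log X_i}
\end{aligned}$$

(b) The mean of the random variable X is $\frac{\alpha + 1}{\alpha + 2}$. Find the estimator of α using the method of moments.
i. $\hat{\alpha}_{MME} = \dfrac{1 + 2M_1}{M_1 - 1}$
ii. $\hat{\alpha}_{MME} = \dfrac{1 - M_1}{M_1 - 1}$
iii. $\hat{\alpha}_{MME} = \dfrac{1 + M_1}{M_1 - 1}$
iv. $\hat{\alpha}_{MME} = \dfrac{1 - 2M_1}{M_1 - 1}$
Solution:
The expected value of X, E(X), is given as $\frac{\alpha + 1}{\alpha + 2}$.
Using the method of moments,
$$\frac{\alpha + 1}{\alpha + 2} = m_1 \Rightarrow \alpha = \frac{1 - 2m_1}{m_1 - 1}$$
The estimator is
$$\hat{\alpha}_{MME} = \frac{1 - 2M_1}{M_1 - 1}$$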
8. Let X be a discrete random variable taking the values −1, 0, 1 with probabilities $P(X = -1) = \frac{p}{2}$, $P(X = 0) = \frac{p}{2}$, $P(X = 1) = 1 - p$. Let $X_1, \ldots, X_n \sim$ i.i.d. X. Find the estimator of p using the method of moments.
(a) $\dfrac{2 - 2M_1}{3}$
(b) $\dfrac{2 + 2M_1}{3}$
(c) $\dfrac{1 + 2M_1}{3}$
(d) $\dfrac{2 + M_1}{3}$
Solution:
The expected value of X, E(X), is given by
$$E[X] = \sum_{x} x\,p_X(x) = \left(-1 \times \frac{p}{2}\right) + \left(0 \times \frac{p}{2}\right) + (1 \times (1 - p)) = \frac{2 - 3p}{2}$$
Using the method of moments,
$$\frac{2 - 3p}{2} = m_1 \Rightarrow p = \frac{2 - 2m_1}{3}$$
The estimator is
$$\hat{p} = \frac{2 - 2M_1}{3}$$
9. Let X be a random variable with PDF
$$f_X(x) = (\lambda a)\,x^{\alpha - 1} e^{-\lambda x^{\alpha}}, \quad x > 0,$$
where α and a are constants. Find the maximum likelihood estimator of λ for n i.i.d. samples of X.

n
Xiα
P
i=1
(a)
n
n
(b) P
n
Xiα
i=1
n
(c) n
Xiα
P
α
i=1
n
Xiα
P
i=1
(d)

Solution:
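Given,
$$f_X(x) = (\lambda a)\,x^{\alpha - 1} e^{-\lambda x^{\alpha}}, \quad x > 0$$
The likelihood function of a sample $X_1, X_2, \ldots, X_n$ is given by
$$\begin{aligned}
L(x_1, x_2, \ldots, x_n) &= \prod_{i=1}^{n} f_X(x_i) \\
&= (\lambda a)^n (x_1 \cdots x_n)^{\alpha - 1} e^{-\lambda(x_1^{\alpha} + \cdots + x_n^{\alpha})}
\end{aligned}$$

The likelihood is maximised over the parameter λ, so factors that do not depend on λ can be ignored. Therefore,
$$L = \lambda^n e^{-\lambda(x_1^{\alpha} + \cdots + x_n^{\alpha})}$$
$$\Rightarrow \log(L) = n\log(\lambda) - \lambda(x_1^{\alpha} + \cdots + x_n^{\alpha})$$

Therefore, the ML estimator for λ is given by
$$\hat{\lambda} = \arg\max_{\lambda}\,[n\log(\lambda) - \lambda(x_1^{\alpha} + \cdots + x_n^{\alpha})]$$

Let $Y = n\log(\lambda) - \lambda(x_1^{\alpha} + \cdots + x_n^{\alpha})$

Now,
$$\begin{aligned}
\frac{dY}{d\lambda} &= \frac{d}{d\lambda}[n\log(\lambda) - \lambda(x_1^{\alpha} + \cdots + x_n^{\alpha})] \\
&= \frac{n}{\lambda} - \sum_{i=1}^{n} x_i^{\alpha}
\end{aligned}$$

Now,
$$\begin{aligned}
& \frac{dY}{d\lambda} = 0 \\
\Rightarrow\ & \frac{n}{\lambda} = \sum_{i=1}^{n} x_i^{\alpha} \\
\Rightarrow\ & \hat{\lambda} = \frac{n}{\sum_{i=1}^{n} X_i^{\alpha}}
\end{aligned}$$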

10. A random sample of 1000 television screens taken from the households of a city shows that the average running time of a television is 7 hours per day with a standard deviation of 2 hours. Assume the distribution of measurements to be approximately normal. Calculate a 99% confidence interval for the daily average television running hours.
Hint: Use P(−2.58 < Z < 2.58) = 0.99.

(a) [6.02, 6.98]
(b) [7.02, 8.19]
(c) [6.12, 7.98]
(d) [6.83, 7.17]

Solution:
Given β = 0.99, n = 1000, $\bar{X}$ = 7 and σ = 2.
To find: α such that $P(|\bar{X} - \mu| \le \alpha) = 0.99$
$$\begin{aligned}
& P\left(\left|\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}\right| \le \frac{\alpha}{\sigma/\sqrt{n}}\right) = 0.99 \\
\Rightarrow\ & P\left(|Z| \le \frac{\alpha}{\sigma/\sqrt{n}}\right) = 0.99 \quad \text{where } Z \sim \text{Normal}(0, 1) \\
\Rightarrow\ & P\left(-\frac{\alpha}{\sigma/\sqrt{n}} \le Z \le \frac{\alpha}{\sigma/\sqrt{n}}\right) = 0.99
\end{aligned}$$

It is given that P(−2.58 < Z < 2.58) = 0.99, therefore,
$$\frac{\alpha}{\sigma/\sqrt{n}} = 2.58 \Rightarrow \alpha = 2.58 \times \frac{\sigma}{\sqrt{n}} = 2.58 \times \frac{2}{\sqrt{1000}} = 0.163$$

The confidence interval for µ is $[\bar{X} - \alpha, \bar{X} + \alpha]$.

Therefore, the 99% confidence interval for µ is [6.83, 7.17].

11. The distribution of the diameter of screws produced by a certain machine is normally
distributed with µ and σ unknown. We observe a random sample
9.8, 10.2, 10.4, 9.8, 10.0, 10.2 and 9.6 (in cm).
Find a 95% confidence interval for the mean diameter of screws.
Hint: Use P (−2.447 < t6 < 2.447) = 0.95 and S(sample standard deviation) = 0.283.
(a) [10.74, 11.26]
(b) [9.74, 10.26]
(c) [7.47, 8.26]
(d) [7.98, 8.75]
Solution:
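Given that S = 0.283, n = 7, β = 0.95.
Now,
$$\bar{X} = \frac{9.8 + 10.2 + 10.4 + 9.8 + 10.0 + 10.2 + 9.6}{7} = 10$$
Using the t-distribution, $\dfrac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$, so
$$\frac{\alpha}{S/\sqrt{n}} = 2.447 \implies \alpha = 2.447 \times \frac{0.283}{\sqrt{7}} = 0.26$$
$$P(|\hat{\mu} - \mu| < 0.26) = 0.95$$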
So, 95% confidence interval is [10 − 0.26, 10 + 0.26] = [9.74, 10.26].
12. A data scientist wishes to determine the average time it takes to run one epoch of a
machine learning model in her machine. How large a sample will she need to be 95%
confident that her sample mean will be within 15 seconds of the true mean? Assume
that it is known from previous studies that σ = 40 seconds.
Hint: Use P (−1.96 < Z < 1.96) = 0.95.
Answer: 28
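Solution:
Let X denote the time taken to run one epoch of the machine learning model.
Given that σ = 40.
We need the value of n such that P(|µ̂ − µ| ≤ 15) = 0.95.
$$P(|\hat{\mu} - \mu| \le 15) = 0.95$$
$$\Rightarrow P\left( \left| \frac{\hat{\mu} - \mu}{\sigma/\sqrt{n}} \right| \le \frac{15}{\sigma/\sqrt{n}} \right) = 0.95$$
$$\Rightarrow P\left( |Z| \le \frac{15}{\sigma/\sqrt{n}} \right) = 0.95$$
Now,
$$\frac{15}{\sigma/\sqrt{n}} = 1.96 \Rightarrow \sqrt{n} = 40 \times \frac{1.96}{15} \Rightarrow n = 27.31$$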

Therefore, the sample size should be 28.
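
The rounding-up step can be made explicit in code. The following sketch assumes the same known σ and z-value as the solution; `required_sample_size` is a hypothetical helper name.

```python
import math


def required_sample_size(sigma, margin, z):
    """Smallest integer n such that z * sigma / sqrt(n) <= margin."""
    return math.ceil((z * sigma / margin) ** 2)


n = required_sample_size(sigma=40, margin=15, z=1.96)
print(n)
```

Taking the ceiling rather than rounding to the nearest integer guarantees the margin of 15 seconds is not exceeded.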

Statistics for Data Science - 2
Week 9 graded Assignment
Bayesian estimation

1. Suppose that the number of buses reaching a particular stop in a one-hour time period
follows the Poisson distribution with an unknown parameter λ. Previous records suggest
that the prior probabilities of λ are P (λ = 0.25) = 0.3 and P (λ = 0.20) = 0.7. If in a
particular one-hour time period seven buses reach the bus stop, find the posterior mode
of λ. Write your answer correct to two decimal places.
Solution:
Prior probabilities of λ are P (λ = 0.25) = 0.3 and P (λ = 0.20) = 0.7.

The posterior probabilities of λ will be
$$P(\lambda = 0.25 \mid X = 7) = \frac{P(X = 7 \mid \lambda = 0.25)\, P(\lambda = 0.25)}{P(X = 7)} = \frac{e^{-0.25} (0.25)^7 (0.3)}{7!\, P(X = 7)} \quad \ldots (1)$$
$$P(\lambda = 0.20 \mid X = 7) = \frac{P(X = 7 \mid \lambda = 0.20)\, P(\lambda = 0.20)}{P(X = 7)} = \frac{e^{-0.20} (0.20)^7 (0.7)}{7!\, P(X = 7)} \quad \ldots (2)$$

Dividing equation (2) by (1), we get
$$\frac{P(\lambda = 0.20 \mid X = 7)}{P(\lambda = 0.25 \mid X = 7)} = \frac{e^{-0.20} (0.20)^7 (0.7)}{e^{-0.25} (0.25)^7 (0.3)} = \frac{7 e^{0.05}\, 4^7}{3 (5^7)} < 1$$

It implies that P(λ = 0.20|X = 7) < P(λ = 0.25|X = 7)

Therefore, λ = 0.25 is the posterior mode.
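
For a prior supported on finitely many values, the whole posterior can be computed directly. The sketch below is an illustration only; the helper names `poisson_pmf` and `discrete_posterior` are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def discrete_posterior(prior, k):
    """prior maps each candidate lambda to its prior probability.

    Returns the normalised posterior after observing the count k.
    """
    weights = {lam: poisson_pmf(k, lam) * p for lam, p in prior.items()}
    total = sum(weights.values())
    return {lam: w / total for lam, w in weights.items()}


posterior = discrete_posterior({0.25: 0.3, 0.20: 0.7}, k=7)
mode = max(posterior, key=posterior.get)
ratio = posterior[0.20] / posterior[0.25]
print(posterior, mode, ratio)
```

The ratio is below 1, matching the comparison made by dividing equation (2) by (1).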


2. Outcomes on rolling a die ten times are:
1, 3, 4, 3, 2, 5, 4, 6, 4, 1
Use the Uniform[0, 1] prior to find the posterior mean of p, which denotes the probability
of getting an even number.

Solution:
Let p denote the probability of getting an even number.
The prior distribution of p is Uniform[0, 1].
It implies that $f_p(p) = 1$ for 0 ≤ p ≤ 1.

Now, posterior density ∝ P(X1 = x1, . . . , Xn = xn | p = p) f_p(p)

Five of the ten outcomes (4, 2, 4, 6, 4) are even, so
⇒ posterior density ∝ $p^5 (1 - p)^5 (1)$
⇒ posterior density ∝ $p^5 (1 - p)^5$
⇒ posterior distribution = Beta(6, 6)
⇒ posterior mean = $\dfrac{6}{6 + 6} = \dfrac{6}{12} = 0.5$

3. Let X1, X2, . . . , Xn ∼ i.i.d. Exp(λ), where λ is an unknown parameter. Find the posterior mean of λ assuming the prior distribution of λ to be Exp(µ).
(a) $\dfrac{n}{X_1 + X_2 + \ldots + X_n}$
(b) $\dfrac{n}{\mu + X_1 + X_2 + \ldots + X_n}$
(c) $\dfrac{n + 1}{X_1 + X_2 + \ldots + X_n}$
(d) $\dfrac{n + 1}{\mu + X_1 + X_2 + \ldots + X_n}$
Solution:
Let Λ denote the parameter treated as a random variable, with prior density $f_\Lambda$.
From the given information, Λ ∼ Exp(µ).

It implies that $f_\Lambda(\lambda) = \mu e^{-\mu\lambda}$.

Now, posterior density ∝ P(X1 = x1, . . . , Xn = xn | Λ = λ) f_Λ(λ)

⇒ posterior density ∝ $\lambda^n e^{-\lambda (X_1 + X_2 + \ldots + X_n)} (\mu e^{-\mu\lambda})$
⇒ posterior density ∝ $\lambda^n e^{-\lambda (X_1 + X_2 + \ldots + X_n + \mu)}$
⇒ posterior distribution = Gamma(n + 1, X1 + X2 + . . . + Xn + µ)
⇒ posterior mean = $\dfrac{n + 1}{X_1 + X_2 + \ldots + X_n + \mu}$

4. The call duration of daily stand-up meetings of employees of a certain company follows the
exponential distribution with an unknown parameter λ. The durations (in minutes) of the last
ten meetings are 20, 30, 35, 30, 25, 25, 20, 28, 34, 30. Find the Bayesian estimate
(posterior mean) of λ using the prior distribution Exp(1/15) for λ. Write your answer
correct to two decimal places.

Solution:
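Let Λ denote the parameter treated as a random variable, with prior density $f_\Lambda$.
From the given information, Λ ∼ Exp(1/15).
It implies that $f_\Lambda(\lambda) = \dfrac{1}{15} e^{-\lambda/15}$.

Now, posterior density ∝ P(X1 = x1, . . . , Xn = xn | Λ = λ) f_Λ(λ)

⇒ posterior density ∝ $\lambda^n e^{-\lambda (X_1 + X_2 + \ldots + X_n)} \left( \dfrac{1}{15} e^{-\lambda/15} \right)$
⇒ posterior density ∝ $\lambda^n e^{-\lambda \left( X_1 + X_2 + \ldots + X_n + \frac{1}{15} \right)}$
⇒ posterior distribution = Gamma(n + 1, X1 + X2 + . . . + Xn + 1/15)
⇒ posterior mean = $\dfrac{n + 1}{X_1 + X_2 + \ldots + X_n + \frac{1}{15}} = \dfrac{11}{277 + \frac{1}{15}}$
⇒ posterior mean = $\dfrac{11 \times 15}{15 \times 277 + 1} = \dfrac{165}{4156} = 0.0397 \approx 0.04$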
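
The conjugate update used here is mechanical enough to script. A minimal sketch, assuming the Gamma(shape, rate) parameterisation used in the solution; `exponential_gamma_posterior` is a hypothetical helper name.

```python
def exponential_gamma_posterior(samples, prior_shape, prior_rate):
    """Gamma(shape, rate) prior on lambda with Exp(lambda) data.

    The posterior is Gamma(shape + n, rate + sum of samples).
    """
    return prior_shape + len(samples), prior_rate + sum(samples)


durations = [20, 30, 35, 30, 25, 25, 20, 28, 34, 30]

# An Exp(1/15) prior is the same distribution as Gamma(shape=1, rate=1/15).
shape, rate = exponential_gamma_posterior(durations, 1, 1 / 15)
posterior_mean = shape / rate
print(shape, rate, posterior_mean)
```

With the data of this question the posterior is Gamma(11, 277 + 1/15), and the posterior mean equals 165/4156.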

5. Marks of tenth class students of a school follow the normal distribution with an un-
known mean µ and variance 25. Marks of 10 students of the tenth class are 50, 45, 70,
60, 75, 90, 45, 60, 80, 75. Find the Bayesian estimate (posterior mean) of µ assuming
the Normal(50, 25) prior distribution. Write your answer correct to two decimal places.

Solution:
We know that the normal distribution is conjugate to the normal distribution. That is, if the
prior distribution of µ is Normal(µ0, σ0²) and the sample is taken from Normal(µ, σ²), then the
posterior distribution of µ will be normal with mean
$$\bar{X}\, \frac{n\sigma_0^2}{n\sigma_0^2 + \sigma^2} + \mu_0\, \frac{\sigma^2}{n\sigma_0^2 + \sigma^2}$$

Here, X̄ = 65, n = 10, µ0 = 50, σ0² = 25, σ² = 25

Therefore,
$$\text{Posterior mean} = \frac{65 \times 10 \times 25}{10(25) + 25} + \frac{50 \times 25}{10(25) + 25} = \frac{16250}{275} + \frac{1250}{275} = 59.09 + 4.545 = 63.64$$
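
The weighted-average form of the normal posterior mean can be checked with a short function. This is a sketch only; `normal_posterior_mean` is a hypothetical name, and the arguments are variances, not standard deviations.

```python
def normal_posterior_mean(samples, sigma2, mu0, sigma0_2):
    """Posterior mean of mu for Normal(mu, sigma2) data and a Normal(mu0, sigma0_2) prior."""
    n = len(samples)
    xbar = sum(samples) / n
    denom = n * sigma0_2 + sigma2
    # Weight on the sample mean grows with n; the remaining weight goes to the prior mean.
    return xbar * (n * sigma0_2 / denom) + mu0 * (sigma2 / denom)


marks = [50, 45, 70, 60, 75, 90, 45, 60, 80, 75]
post_mean = normal_posterior_mean(marks, sigma2=25, mu0=50, sigma0_2=25)
print(post_mean)
```

The exact value is 17500/275 = 700/11, so the two weights here are 10/11 on the sample mean and 1/11 on the prior mean.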

6. The outcomes on tossing a coin ten times are: H T T H T H H H T H. Let p be the
probability of heads. Previous records show that heads appear on an average 40% of
the time. Find the posterior mean of p using the Beta(2, β) prior. Write your answer
correct to two decimal places.

Solution:
Let p denote the probability of heads.
Given that the prior of p is Beta(2, β) with a mean of 0.4.
It implies that E[Beta(2, β)] = 0.4
⇒ $\dfrac{2}{2 + \beta} = 0.4$
⇒ β = 3

Therefore, the prior distribution of p is Beta(2, 3).

It implies that $f_p(p) \propto p^1 (1 - p)^2$

Now, posterior density ∝ P(X1 = x1, . . . , Xn = xn | p = p) f_p(p)

The sample contains 6 heads and 4 tails, so
⇒ posterior density ∝ $p^6 (1 - p)^4 (p^1 (1 - p)^2)$
⇒ posterior density ∝ $p^7 (1 - p)^6$
⇒ posterior distribution = Beta(8, 7)
⇒ posterior mean = $\dfrac{8}{8 + 7} = \dfrac{8}{15} = 0.53$
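
The two steps above (fixing β from the prior mean, then updating with the counts) can be written as a short script. A minimal sketch; the variable and function names are hypothetical.

```python
def beta_posterior(successes, failures, a, b):
    """Beta(a, b) prior with Bernoulli data gives Beta(a + successes, b + failures)."""
    return a + successes, b + failures


tosses = "HTTHTHHHTH"
heads = tosses.count("H")
tails = tosses.count("T")

a, prior_mean = 2, 0.4
b = a / prior_mean - a  # solves a / (a + b) = prior_mean

post_a, post_b = beta_posterior(heads, tails, a, b)
post_mean = post_a / (post_a + post_b)
print(heads, tails, b, post_a, post_b, post_mean)
```

The same helper applies to the other Beta-prior questions in this assignment by changing the counts and the prior parameters.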

7. One out of the last ten candidates wins a treasure hunt game. Previous record shows
fraction of winners follows the Beta(20, b) distribution with an average of 20%. Estimate
the long-term fraction of winners of the treasure hunt game. Write your answer correct
to two decimal places.

Solution:
Let the long-term fraction of winners (probability of winning) be denoted by p.
Previous data shows that fraction of winners follows the Beta(20, b) distribution with an
average of 20%.
It implies that E[Beta(20, b)] = 0.2
⇒ $\dfrac{20}{20 + b} = 0.2$
⇒ b = 80

Therefore, the prior distribution of p is Beta(20, 80).

It implies that $f_p(p) \propto p^{19} (1 - p)^{79}$

Now, posterior density ∝ P(X1 = x1, . . . , Xn = xn | p = p) f_p(p)

⇒ posterior density ∝ $p^1 (1 - p)^9 (p^{19} (1 - p)^{79})$
⇒ posterior density ∝ $p^{20} (1 - p)^{88}$
⇒ posterior distribution = Beta(21, 89)
⇒ posterior mean = $\dfrac{21}{21 + 89} = \dfrac{21}{110} = 0.19$

8. Rainfall in the monsoon season in Delhi follows the normal distribution with mean µ and
variance 200 mm². Rainfall amounts (in mm) registered in the 2021 monsoon are 600, 300, 450,
700, 850, 150, 200, 750. Prior information about the average rainfall is that it has mean
600 mm and variance 225 mm². Use the normal prior that matches your prior information
and find the posterior mean.

Solution:
Prior distribution of µ is given Normal with mean 600 and variance 225.
We know that the normal distribution is conjugate to the normal distribution. That is, if the
prior distribution of µ is Normal(µ0, σ0²) and the sample is taken from Normal(µ, σ²), then the
posterior distribution of µ will be normal with mean
$$\bar{X}\, \frac{n\sigma_0^2}{n\sigma_0^2 + \sigma^2} + \mu_0\, \frac{\sigma^2}{n\sigma_0^2 + \sigma^2}$$

Here, X̄ = 500, n = 8, µ0 = 600, σ0² = 225, σ² = 200

Therefore,
$$\text{Posterior mean} = \frac{500 \times 8 \times 225}{8(225) + 200} + \frac{600 \times 200}{8(225) + 200} = \frac{900000}{2000} + \frac{120000}{2000} = 450 + 60 = 510$$

9. Following frequency data shows the number of patients (n) arriving in an emergency
room between 12:00 AM and 6:00 AM.

n      frequency
0      1
1      4
2      17
3      17
4      17
5      21
6      14
7      4
8      4
9      1
10+    0

(i) Fit the data into Poisson distribution (Find the parameter). Write your answer
correct to two decimal places.
Solution:
We know that λ̂ = X̄ is an estimate of λ.
$$\text{Sample mean, } \bar{X} = \frac{\sum_i f_i n_i}{\sum_i f_i} = \frac{0 + 4 + 34 + 51 + 68 + 105 + 84 + 28 + 32 + 9}{1 + 4 + 17 + 17 + 17 + 21 + 14 + 4 + 4 + 1} = \frac{415}{100} = 4.15$$
Therefore, λ̂ = 4.15

(ii) Find an approximate 95% confidence interval using a normal approximation for the
error distribution.
(Use the following information:
sample variance S 2 = 3.40 and P (−0.36 < N(0, 0.034) < 0.36) = 0.95)
(a) [3.87, 5.134]
(b) [3.79, 4.51]
(c) [3.12, 5.21]
(d) [4.01, 5.23]
Solution:
The error e is given by
e = λ̂ − λ
Now, E[λ̂ − λ] = E[λ̂] − λ = λ − λ = 0
$$\text{Var}(\hat{\lambda} - \lambda) = \text{Var}(\hat{\lambda}) = \frac{\sigma^2}{n} \approx \frac{s^2}{n} = \frac{3.4}{100} = 0.034$$

It implies that the error approximately follows Normal(0, 0.034).

Let the 95% confidence interval be [λ̂ − δ, λ̂ + δ]. Now,

P(|error| < δ) = 0.95
⇒ P(|Normal(0, 0.034)| < δ) = 0.95

It is given that P(−0.36 < N(0, 0.034) < 0.36) = 0.95
Therefore δ = 0.36

So, 95% confidence interval will be [3.79, 4.51].
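
Both parts of this question can be reproduced from the frequency table. The sketch below is an illustration: the grouped-data variance uses the n − 1 divisor, and the multiplier 1.96 is the standard normal 97.5% point, which is an assumption here since the question supplies δ directly. The helper name `fit_poisson` is hypothetical.

```python
counts = {0: 1, 1: 4, 2: 17, 3: 17, 4: 17, 5: 21, 6: 14, 7: 4, 8: 4, 9: 1}


def fit_poisson(freq):
    """Return (number of observations, sample mean, sample variance) for grouped counts."""
    total = sum(freq.values())
    mean = sum(n * f for n, f in freq.items()) / total
    var = sum(f * (n - mean) ** 2 for n, f in freq.items()) / (total - 1)
    return total, mean, var


total, lam_hat, s2 = fit_poisson(counts)
delta = 1.96 * (s2 / total) ** 0.5  # half-width under the normal approximation
interval = (lam_hat - delta, lam_hat + delta)
print(total, lam_hat, s2, interval)
```

The computed sample variance and half-width agree with the values S² = 3.40 and δ = 0.36 quoted in the question to two decimals.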

10. Following frequency table shows the number of bankruptcies (n) filed by customers in a
time period of one month. The data consists of last 200 months.

n      frequency
0      13
1      26
2      48
3      44
4      39
5      16
6      8
7      4
8      1
9      1
10+    0

(i) Fit the data into Poisson distribution (Find the parameter). Write your answer
correct to two decimal places.
Solution:
We know that λ̂ = X̄ is an estimate of λ.
$$\text{Sample mean, } \bar{X} = \frac{\sum_i f_i n_i}{\sum_i f_i} = \frac{0 + 26 + 96 + 132 + 156 + 80 + 48 + 28 + 8 + 9}{13 + 26 + 48 + 44 + 39 + 16 + 8 + 4 + 1 + 1} = \frac{583}{200} = 2.91$$

Therefore, λ̂ = 2.91

(ii) Find an approximate 95% confidence interval using a normal approximation for the
error distribution.
(Use the following information:
sample variance S 2 = 2.852 and P (−0.23 < N(0, 0.0142) < 0.23) = 0.95)
(a) [1.97, 4.14]
(b) [2.08, 3.34]
(c) [2.68, 3.14]
(d) [2.01, 4.232]

Solution:
The error e is given by
e = λ̂ − λ
Now, E[λ̂ − λ] = E[λ̂] − λ = λ − λ = 0
$$\text{Var}(\hat{\lambda} - \lambda) = \text{Var}(\hat{\lambda}) = \frac{\sigma^2}{n} \approx \frac{s^2}{n} = \frac{2.852}{200} = 0.0142$$

It implies that the error approximately follows Normal(0, 0.0142).

Let the 95% confidence interval be [λ̂ − δ, λ̂ + δ]. Now,

P(|error| < δ) = 0.95
⇒ P(|Normal(0, 0.0142)| < δ) = 0.95

It is given that P(−0.23 < N(0, 0.0142) < 0.23) = 0.95

Therefore δ = 0.23

So, 95% confidence interval will be [2.68, 3.14]

Statistics for Data Science - 2

Week 9 Practice Assignment Solution

1. Let p be the proportion of students in IITM online degree programme who approve the
online proctored exams. The students’ committee is going to take a random sample of
n = 40 students from IITM online degree programme and ask if they approve the online
proctored exams. Suppose 10 out of the 40 students answered yes.
i) Calculate the posterior distribution if we use a continuous Uniform[0, 1] prior.
a) Beta(10, 30)
b) Beta(11, 31)
c) Beta(10, 40)
d) Beta(11, 40)
Solution:
Let fp (p) denote the prior distribution of p.
Then, by given information fp (p) = 1, since, p ∼ Uniform[0, 1].
If X1 , X2 , . . . , Xn ∼ iid Bernoulli(p)
⇒ posterior density = Beta(w + 1, n − w + 1) where w is the number of success.
Here n = 40, w = 10 ⇒ posterior density = Beta(11, 31)
ii) Find the Bayesian estimate (posterior mean) of p. Enter your answer correct to two
decimals accuracy.
Solution:
posterior mean = $\dfrac{11}{11 + 31} = 0.26$

iii) Find the Bayesian estimate (posterior mean) with Beta(5, 5) prior. Enter your an-
swer correct to two decimals accuracy.
Solution:
Given the prior distribution is Beta(α, β)
X1 , X2 , . . . , Xn ∼ iid Bernoulli(p)
⇒ posterior density = Beta(w + α, n − w + β)
Here n = 40, w = 10, α = 5, β = 5
⇒ posterior density = Beta(15, 35)
posterior mean = $\dfrac{15}{15 + 35} = 0.30$

iv) Find the Bayesian estimate (posterior mean) with Beta(10, 10) prior. Enter your
answer correct to two decimals accuracy.
Solution:
Here n = 40, w = 10, α = 10, β = 10

⇒ posterior density = Beta(20, 40)
posterior mean = $\dfrac{20}{20 + 40} = 0.33$
2. The new method of screening for a disease fails to detect the presence of the disease in
20% of the patients from prior experience. A new random sample of n = 100 patients
who are known to have the disease is screened using the new method. Out of these 100
patients, the new method failed to detect the disease in 20 cases. Use a Beta(2, β) with
a suitable β to estimate the failure fraction. Enter your answer correct to two decimals
accuracy.
Solution:
The prior is given as Beta(2, β) with the information that, from prior experience, the
method fails to detect the disease in 20% of the patients, so the prior mean is 0.20.
⇒ $\dfrac{2}{2 + \beta} = 0.20$
⇒ β = 8.
If X1, X2, . . . , Xn ∼ iid Bernoulli(p)
⇒ posterior density = Beta(w + α, n − w + β)
Here n = 100, w = 20, α = 2, β = 8
⇒ posterior density = Beta(22, 88)
posterior mean = $\dfrac{22}{22 + 88} = 0.20$

3. Suppose that the number of customers arriving in a restaurant in a one day time period
follows the Poisson distribution with unknown parameter λ. Previous records suggest
that the prior probabilities of λ are P (λ = 10) = 0.4 and P (λ = 8) = 0.6. If on a
particular day 15 people arrive at the restaurant, find the posterior mode of λ.
Solution:

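$$P(X = 15) = P(X = 15 \mid \lambda = 10) P(\lambda = 10) + P(X = 15 \mid \lambda = 8) P(\lambda = 8) = \frac{e^{-10} 10^{15}}{15!} \times 0.4 + \frac{e^{-8} 8^{15}}{15!} \times 0.6$$
Now,
$$P(\lambda = 10 \mid X = 15) = \frac{P(X = 15 \mid \lambda = 10) P(\lambda = 10)}{P(X = 15)} = \frac{e^{-10} 10^{15} \times 0.4}{e^{-10} 10^{15} \times 0.4 + e^{-8} 8^{15} \times 0.6}$$
And
$$P(\lambda = 8 \mid X = 15) = \frac{P(X = 15 \mid \lambda = 8) P(\lambda = 8)}{P(X = 15)} = \frac{e^{-8} 8^{15} \times 0.6}{e^{-10} 10^{15} \times 0.4 + e^{-8} 8^{15} \times 0.6}$$
$$\Rightarrow \frac{P(\lambda = 10 \mid X = 15)}{P(\lambda = 8 \mid X = 15)} = \frac{e^{-10} 10^{15} \times 0.4}{e^{-8} 8^{15} \times 0.6} = 2.56$$
⇒ P(λ = 10 | X = 15) > P(λ = 8 | X = 15)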

Hence, the posterior mode of λ is 10.

4. Consider a Bayesian estimation problem, with X1, . . . , Xn ∼ iid Normal(µ, 1), and a
Normal(0, 1) prior. Letting $Y_n = \sum_{i=1}^{n} X_i$, the posterior mean is
a) $\dfrac{Y_n}{n}$
b) $\dfrac{Y_n}{n + 1}$
c) $\dfrac{n Y_n}{n + 1}$
d) $\dfrac{n Y_n}{n + 2}$
Solution:
If X1, . . . , Xn ∼ iid Normal(µ, σ²), and the prior is Normal(µ0, σ0²),
then posterior mean = $\bar{X}\, \dfrac{n\sigma_0^2}{n\sigma_0^2 + \sigma^2} + \mu_0\, \dfrac{\sigma^2}{n\sigma_0^2 + \sigma^2}$
Here σ² = 1, µ0 = 0, σ0² = 1
⇒ posterior mean = $\left( \dfrac{X_1 + \ldots + X_n}{n} \right) \left( \dfrac{n \times 1}{n \times 1 + 1} \right)$
⇒ posterior mean = $\dfrac{Y_n}{n + 1}$
5. The marks distribution of IITM students in the end semester exam follows normal distri-
bution with unknown mean µ and variance 20. A random sample of marks of 8 students
are:
60, 60, 65, 65, 70, 70, 72, 75.
i) Assume that the prior distribution is Normal(50, 5). Find the posterior mean of µ .
Enter your answer correct to two decimals accuracy.

Solution:
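$$\bar{X} = \frac{60 + 60 + 65 + 65 + 70 + 70 + 72 + 75}{8} = 67.125$$
Here X̄ = 67.125, σ² = 20, n = 8, µ0 = 50, σ0² = 5
$$\text{posterior mean} = 67.125 \left( \frac{8 \times 5}{8 \times 5 + 20} \right) + 50 \left( \frac{20}{8 \times 5 + 20} \right) = 67.125 \left( \frac{40}{60} \right) + 50 \left( \frac{20}{60} \right) = 61.416$$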

ii) Assume that the prior distribution is Normal(50, 25). Find the posterior mean of µ .
Enter your answer correct to two decimals accuracy.
Solution:
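$$\bar{X} = \frac{60 + 60 + 65 + 65 + 70 + 70 + 72 + 75}{8} = 67.125$$
Here X̄ = 67.125, σ² = 20, n = 8, µ0 = 50, σ0² = 25
$$\text{posterior mean} = 67.125 \left( \frac{8 \times 25}{8 \times 25 + 20} \right) + 50 \left( \frac{20}{8 \times 25 + 20} \right) = 67.125 \left( \frac{200}{220} \right) + 50 \left( \frac{20}{220} \right) = 65.57$$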

6. Suppose X is a discrete random variable taking values {1, 2, 3} with respective proba-
bilities {p, 2(1 − p)/3, (1 − p)/3}, where 0 ≤ p ≤ 1 is a parameter. Consider the samples
1, 1, 3, 1, 3, 2, 1, 2, 3, 2 taken from X.
Use a Uniform[0, 1] prior on p to find the posterior mean. Enter your answer correct to
two decimals accuracy.
Solution:
Let fp (p) denote the prior distribution of p.
Then, by given information fp (p) = 1, since, p ∼ Uniform[0, 1].

Now, posterior density ∝ P (X1 = x1 , . . . , Xn = xn |p = p)fp (p)


⇒ posterior density ∝ pn1 (1 − p)n2 +n3 where ni denotes the number of i in the samples.
Here n1 = 4, n2 = 3, n3 = 3
⇒ posterior density ∝ p4 (1 − p)3+3
⇒ posterior density = Beta(5, 7)
⇒ posterior mean = $\dfrac{5}{5 + 7} = 0.416$

7. The following ten samples are taken from the Geometric(p):
2, 4, 10, 8, 12, 6, 14, 6, 3, 5.
Find the posterior mean of p using Uniform[0, 1] prior. Enter your answer correct to two
decimals accuracy.
Solution:
If X1 , . . . , Xn ∼ iid Geometric(p), and the prior is Uniform[0, 1],
then posterior density = Beta(n + 1, x1 + x2 + . . . + xn − n + 1)
Here n = 10, x1 + x2 + . . . + xn = 2 + 4 + 10 + 8 + 12 + 6 + 14 + 6 + 3 + 5 = 70
⇒ posterior density = Beta(11, 61).
⇒ posterior mean = $\dfrac{11}{11 + 61} = 0.15$
8. Consider the samples 7, 5, 0, 2, 10, 4, 9, 8, 3 taken from Poisson(λ), where λ is unknown.
Using a Gamma(4, 11) prior, find the posterior mean of λ. Enter your answer correct to
one decimal accuracy.
Solution:
If X1 , . . . , Xn ∼ iid Poisson(λ), and the prior is Gamma(α, β)
then posterior density = Gamma(x1 + x2 + . . . + xn + α, β + n)
Here n = 9, x1 + x2 + . . . + xn = 7 + 5 + 0 + 2 + 10 + 4 + 9 + 8 + 3 = 48, α = 4, β = 11
⇒ posterior density = Gamma(52, 20)
⇒ posterior mean = $\dfrac{52}{20} = 2.6$
9. The number of defects per 10 meters of cloth produced by a weaving machine has the
Poisson distribution with mean λ. You examine 100 meters of cloth produced by the
machine and observe 61 defects. Your prior belief about λ is that it has mean 6 and
standard deviation 2. Use a Gamma(α, β) prior that matches your prior belief and find
the posterior distribution.
a) Gamma(70, 111.5)
b) Gamma(70, 11.5)
c) Gamma(61, 11.5)
d) Gamma(61, 13)
Solution:
Prior is Gamma(α, β) with mean 6 and standard deviation 2.
⇒ $\dfrac{\alpha}{\beta} = 6$ and $\dfrac{\alpha}{\beta^2} = 4$
⇒ α = 6β and α = 4β²

Solving these two equations we will get


α = 9 and β = 1.5.
Also n = 10 and x1 + x2 + . . . + xn = 61
⇒ posterior distribution = Gamma(70, 11.5)
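
Matching a Gamma(α, β) prior to a stated mean and standard deviation is a two-equation solve that is easy to script. A minimal sketch, assuming the shape-rate parameterisation (mean α/β, variance α/β²) used in the solution; the helper name is hypothetical.

```python
def gamma_from_mean_sd(mean, sd):
    """Solve shape/rate = mean and shape/rate**2 = sd**2 for (shape, rate)."""
    rate = mean / sd ** 2
    shape = mean * rate
    return shape, rate


alpha, beta = gamma_from_mean_sd(mean=6, sd=2)

# Poisson data: 61 defects observed over n = 10 units of 10 meters each.
posterior = (alpha + 61, beta + 10)
print(alpha, beta, posterior)
```

The update rule applied in the last step is the Poisson-Gamma one stated in the previous question: add the total count to the shape and the number of observations to the rate.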

10. Assume that the time that elapses from one call to the next at a 911 call center has
the exponential distribution with parameter λ. The times elapsed between ten calls (in
minutes) are: 3, 4, 6, 1, 7, 8, 2, 5, 1. Your prior belief about λ is that it has mean 3.5 and
standard deviation 1. Use a Gamma(α, β) prior that matches your prior belief and find
the posterior mean. Enter your answer correct to two decimals accuracy.
Solution:
Prior is Gamma(α, β) with mean 3.5 and standard deviation 1.
⇒ $\dfrac{\alpha}{\beta} = 3.5$ and $\dfrac{\alpha}{\beta^2} = 1$
⇒ α = 3.5β and α = β²

Solving these two equations we will get

α = 12.25 and β = 3.5.
If X1, . . . , Xn ∼ iid Exponential(λ), and the prior is Gamma(α, β)
then posterior density = Gamma(n + α, β + x1 + x2 + . . . + xn)
Here n = 9, x1 + x2 + . . . + xn = 3 + 4 + 6 + 1 + 7 + 8 + 2 + 5 + 1 = 37, α = 12.25, β = 3.5
⇒ posterior density = Gamma(21.25, 40.5)
⇒ posterior mean = $\dfrac{21.25}{40.5} = 0.52$

11. The frequency data on number of deaths per month due to a certain disease is given
below:

No. of deaths per month    Frequency
0                          224
1                          102
2                          23
3                          5
4                          1
5+                         0

Table 9.1.P
(i) Fit a Poisson distribution to the given frequency table and find the parameter.
Write your answer correct to two decimal places.
Solution:
Let λ̂ = X̄ be an estimate of λ.
$$\text{Sample mean } (\bar{X}) = \frac{\sum n_i f_i}{\sum f_i} = \frac{(0 \times 224) + (1 \times 102) + (2 \times 23) + (3 \times 5) + (4 \times 1)}{224 + 102 + 23 + 5 + 1} = \frac{167}{355} = 0.47$$

Therefore, λ̂ = 0.47.
Therefore, the distribution is Poisson(0.47).

n     Frequency    Poisson fit
0     224          $(e^{-0.47}) \times 355 = 221.87$
1     102          $(e^{-0.47} (0.47)/1!) \times 355 = 104.28$
2     23           $(e^{-0.47} (0.47)^2/2!) \times 355 = 24.05$
3     5            $(e^{-0.47} (0.47)^3/3!) \times 355 = 3.83$
4     1            $(e^{-0.47} (0.47)^4/4!) \times 355 = 0.45$
5+    0            $(e^{-0.47} (0.47)^5/5!) \times 355 = 0.04$

As we can observe from the table that the actual count is close to the expected
count, therefore, Poisson(0.47) is a reasonable fit for the given data.
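
The expected counts in the table can be regenerated as follows. This is a sketch only: it rounds λ̂ to two decimals, as the solution does, before computing the fitted frequencies, and the variable names are hypothetical.

```python
import math

observed = {0: 224, 1: 102, 2: 23, 3: 5, 4: 1}

total = sum(observed.values())
lam_hat = round(sum(n * f for n, f in observed.items()) / total, 2)

# Expected count for each n is total * P(X = n) under Poisson(lam_hat).
expected = {
    n: total * math.exp(-lam_hat) * lam_hat ** n / math.factorial(n)
    for n in observed
}

for n in observed:
    print(n, observed[n], round(expected[n], 2))
```

Printing observed and expected counts side by side makes the visual goodness-of-fit check described above easy to repeat for other tables.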
(ii) Find an approximate 95% confidence interval using a normal approximation for the
sampling distribution.
(Use the following information:
sample variance S 2 = 0.498 and P (−0.07 < N(0, 0.0014) < 0.07) = 0.95)
(a) [0.40, 0.47]
(b) [0.40, 0.54]
(c) [0.44, 0.54]
(d) [0.44, 0.52]

Solution:
Error: λ̂ − λ
E[λ̂ − λ] = 0
$$\text{Var}(\hat{\lambda} - \lambda) = \text{Var}(\hat{\lambda}) = \frac{\sigma^2}{n} \approx \frac{S^2}{n}$$
Therefore, we will assume the sampling distribution of the error to be $\text{Normal}\left(0, \dfrac{s^2}{n}\right)$.
Given that the sample variance is s² = 0.498, and n = 355.
Therefore, the sampling distribution is Normal(0, 0.0014).
Now, the 95% confidence interval for λ is [λ̂ − δ, λ̂ + δ].
It is given that P(−0.07 < N(0, 0.0014) < 0.07) = 0.95, therefore,

δ = 0.07

Hence the 95% confidence interval for λ is [0.40, 0.54].

12. The number of emails received by Neeti in intervals of one hour is given in Table 9.2.P.

No. of emails per hour    Frequency
0                         5
1                         15
2                         22
3                         22
4                         17
5                         10
6                         5
7                         3
8                         1
9+                        0

Table 9.2.P: Emails received by Neeti in one-hour interval for the last 100 hours.

(i) Fit a Poisson distribution to the given frequency table and find the parameter.
Write your answer correct to two decimal places.

Solution:
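Let λ̂ = X̄ be an estimate of λ.
$$\text{Sample mean } (\bar{X}) = \frac{\sum n_i f_i}{\sum f_i}$$
$$\Rightarrow \bar{X} = \frac{(1 \times 15) + (2 \times 22) + (3 \times 22) + (4 \times 17) + (5 \times 10) + (6 \times 5) + (7 \times 3) + (8 \times 1)}{5 + 15 + 22 + 22 + 17 + 10 + 5 + 3 + 1}$$
$$\Rightarrow \bar{X} = \frac{302}{100} = 3.02$$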
Therefore, λ̂ = 3.02.
Therefore, the distribution is Poisson(3.02).
We can check the fit, the same way we did in the previous question.

(ii) Find an approximate 95% confidence interval using a normal approximation for the
sampling distribution.
(Use the following information:
sample variance S 2 = 3.05 and P (−0.34 < N(0, 0.0305) < 0.34) = 0.95)
(a) [1.89, 4.15]
(b) [2.08, 4.34]
(c) [2.68, 3.36]
(d) [1.89, 3.35]

Solution:
Error: λ̂ − λ
E[λ̂ − λ] = 0
$$\text{Var}(\hat{\lambda} - \lambda) = \text{Var}(\hat{\lambda}) = \frac{\sigma^2}{n} \approx \frac{S^2}{n}$$
Therefore, we will assume the sampling distribution of the error to be $\text{Normal}\left(0, \dfrac{s^2}{n}\right)$.
Given that the sample variance is s² = 3.05, and n = 100.
Therefore, the sampling distribution is Normal(0, 0.0305).
Now, the 95% confidence interval for λ is [λ̂ − δ, λ̂ + δ].
It is given that P(−0.34 < N(0, 0.0305) < 0.34) = 0.95, therefore,

δ = 0.34

Hence the 95% confidence interval for λ is [2.68, 3.36].

Statistics for Data Science - 2

Graded assignment week 10

Use the following values of standard normal distribution if needed.

FZ (0.15) = 0.55962, FZ (−0.04) = 0.48405, FZ (−1.28) = 0.10027, FZ (1.96) = 0.975, FZ (−1.64) =


0.05, FZ (1.64) = 0.95, FZ (−1.28) = 0.01, FZ (2.74) = .99693, FZ (−1.36) = .08691, FZ (1.28) =
.89973, FZ (2.54) = .99446, FZ (−2.8) = .00256

1. The average marks scored by students of a school in their board exams is reported to
be 400 with a standard deviation of 5. You suspect that the average may be lower,
possibly 390, and decide to sample students to find their marks.
(a) What sample size do you need for a test at the significance level 0.05 and power
0.95?
Answer: 3

Solution:
Let the random variable X represent the marks obtained by students in their
board exams with expected value µ and standard deviation σ.
Given µ = 400 and σ = 5.
Consider the null and alternative hypotheses:
H0 : µ = 400
HA : µ < 400
Test statistic: X̄
Test: Reject H0 if X̄ < c, at α = 0.05
$$\alpha = P(\text{reject } H_0 \mid H_0 \text{ is true}) = P\left( \frac{\bar{X} - 400}{5/\sqrt{n}} < \frac{c - 400}{5/\sqrt{n}} \right)$$
$$\implies 0.05 = F_Z\left( \frac{c - 400}{5/\sqrt{n}} \right)$$
$$\implies F_Z^{-1}(0.05) = \frac{c - 400}{5/\sqrt{n}}$$
$$\implies -1.64 = \frac{c - 400}{5/\sqrt{n}}$$
$$\implies c = -1.64 \times \frac{5}{\sqrt{n}} + 400 \quad \cdots (1)$$
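Again, when the alternative hypothesis is true (with µ = 390), we have
$$\frac{\bar{X} - 390}{5/\sqrt{n}} \sim \text{Normal}(0, 1)$$
$$\beta = P(\text{Accept } H_0 \mid H_A \text{ is true}) = P(\bar{X} \ge c \mid \mu = 390) = P\left( \frac{\bar{X} - 390}{5/\sqrt{n}} \ge \frac{c - 390}{5/\sqrt{n}} \right)$$
$$\implies 0.05 = 1 - F_Z\left( \frac{c - 390}{5/\sqrt{n}} \right)$$
$$\implies F_Z^{-1}(0.95) = \frac{c - 390}{5/\sqrt{n}}$$
$$\implies 1.64 = \frac{c - 390}{5/\sqrt{n}}$$
$$\implies c = 1.64 \times \frac{5}{\sqrt{n}} + 390 \quad \cdots (2)$$

From equations (1) and (2), we have
$$-1.64 \times \frac{5}{\sqrt{n}} + 400 = 1.64 \times \frac{5}{\sqrt{n}} + 390$$
$$\implies 2 \times 1.64 \times \frac{5}{\sqrt{n}} = 10$$
$$\implies \sqrt{n} = 1.64 \implies n = 1.64^2 = 2.6896 \approx 3$$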

(b) Find the critical value c. Enter the answer correct to two decimal places.
394.73, [394.70, 395.30]
Solution: Substituting n = 3 in (2), we get c = 394.73.
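
The two conditions (significance level and power) can be solved for n programmatically. The sketch below uses `statistics.NormalDist` from the Python standard library for the normal quantiles, so it works with 1.6449 rather than the rounded 1.64; the function name is hypothetical. Because n is rounded up to 3, equations (1) and (2) no longer give exactly the same c, and both values are computed.

```python
import math
from statistics import NormalDist


def one_sided_sample_size(mu0, mu1, sigma, alpha, power):
    """Smallest n for a one-sided z-test of mu0 against mu1 with the given level and power."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) * sigma / abs(mu0 - mu1)) ** 2)


n = one_sided_sample_size(mu0=400, mu1=390, sigma=5, alpha=0.05, power=0.95)

z = NormalDist().inv_cdf(0.95)
c_from_level = 400 - z * 5 / math.sqrt(n)  # equation (1): fixes the significance level
c_from_power = 390 + z * 5 / math.sqrt(n)  # equation (2): fixes the power
print(n, c_from_level, c_from_power)
```

Both critical values fall inside the accepted answer range [394.70, 395.30] quoted for part (b).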

2. Suppose X ∼ Normal(µ, 9). For n = 100 iid samples of X, the observed sample mean
is 11.8. What conclusion would a z-test reach if the null hypothesis assumes µ = 10.5
(against an alternative hypothesis µ ≠ 10.5)?

(a) Accept H0 at a significance level of 0.10.


(b) Reject H0 at a significance level of 0.10.
(c) Accept H0 at a significance level of 0.05.
(d) Reject H0 at a significance level of 0.05.

Solution:
Given, X ∼ Normal(µ, 9).
X1, . . . , X100 ∼ iid X. For 100 iid samples of X, X̄ ∼ Normal(µ, 9/100)
Sample mean, X̄ = 11.8.
Consider the null and alternative hypotheses:

H0 : µ = 10.5
HA : µ ≠ 10.5

Test for α = 0.05
Test: Reject H0 if |X̄ − 10.5| > c
$$\alpha = P(|\bar{X} - 10.5| > c \mid \mu = 10.5)$$
$$\Rightarrow \alpha = P\left( \left| \frac{\bar{X} - 10.5}{\sqrt{9/100}} \right| > \frac{c}{\sqrt{9/100}} \right)$$
$$\Rightarrow \alpha = P\left( |Z| > \frac{c}{\sqrt{9/100}} \right)$$
$$\Rightarrow 0.05 = 2 F_Z\left( \frac{-c}{\sqrt{9/100}} \right)$$
$$\implies c = -\frac{3}{10} F_Z^{-1}(0.025) = 0.5879 \quad (1)$$
Since |11.8 − 10.5| = 1.3 > c, the z-test at significance level α = 0.05 will reject H0.

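Test for α = 0.1

Test: Reject H0 if |X̄ − 10.5| > c
$$\alpha = P(|\bar{X} - 10.5| > c \mid \mu = 10.5)$$
$$\alpha = P\left( \left| \frac{\bar{X} - 10.5}{\sqrt{9/100}} \right| > \frac{c}{\sqrt{9/100}} \right)$$
$$\alpha = P\left( |Z| > \frac{c}{\sqrt{9/100}} \right)$$
$$0.1 = 2 F_Z\left( \frac{-c}{\sqrt{9/100}} \right)$$
$$c = -\frac{3}{10} F_Z^{-1}(0.05) = 0.4934 \quad (2)$$
Since |11.8 − 10.5| = 1.3 > c, the z-test at significance level α = 0.10 will reject H0.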
Hence, options (b) and (d) are correct.

3. Let X1 , . . . , X100 be a sample from a normal distribution having a variance of 25. We
wish to test the hypothesis H0 : µ = 0 versus HA : µ = 1.5. Consider a test that rejects
H0 for X̄ > c.
(a) Find the value of c at a significance level α = 0.05. Enter the answer correct to
two decimal places.
0.82
Solution:
Given, X ∼ Normal(µ, 25).
X1, . . . , X100 ∼ iid X. For 100 iid samples of X, X̄ ∼ Normal(µ, 25/100)
The null and alternative hypotheses are
H0 : µ = 0
HA : µ = 1.5
Test for α = 0.05
Test: Reject H0 if X̄ > c
$$\alpha = P(\bar{X} > c \mid \mu = 0)$$
$$\Rightarrow \alpha = P\left( \frac{\bar{X}}{\sqrt{25/100}} > \frac{c}{\sqrt{25/100}} \right)$$
$$\Rightarrow \alpha = P(Z > 2c)$$
$$\Rightarrow 0.05 = 1 - F_Z(2c)$$
$$\Rightarrow c = \frac{1}{2} F_Z^{-1}(0.95) = 0.8224$$
(b) Find the power of the test. Enter the answer correct to two decimal places.
0.91309, [0.90, 0.93]
Solution:
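$$\text{Power} = 1 - \beta = P(\bar{X} > c \mid \mu = 1.5)$$
$$\Rightarrow 1 - \beta = P\left( \frac{\bar{X} - 1.5}{\sqrt{25/100}} > \frac{c - 1.5}{\sqrt{25/100}} \right)$$
$$\Rightarrow 1 - \beta = P\left( Z > \frac{c - 1.5}{\sqrt{25/100}} \right)$$
$$\Rightarrow 1 - \beta = 1 - F_Z(2(c - 1.5))$$
Substituting the value of c from part (a),
$$1 - \beta = 1 - F_Z(2(0.8224 - 1.5))$$
Therefore,
Power = 1 − β = 0.9123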
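
Both the critical value and the power follow from the sampling distribution of X̄ under each hypothesis, which `statistics.NormalDist` in the Python standard library can represent directly. A minimal sketch; the variable names are hypothetical.

```python
from statistics import NormalDist

n, variance = 100, 25
se = (variance / n) ** 0.5  # standard deviation of the sample mean

xbar_under_h0 = NormalDist(mu=0, sigma=se)
xbar_under_ha = NormalDist(mu=1.5, sigma=se)

c = xbar_under_h0.inv_cdf(0.95)     # reject H0 when the sample mean exceeds c
power = 1 - xbar_under_ha.cdf(c)    # P(reject H0 | mu = 1.5)
print(c, power)
```

Working with the distribution of X̄ itself, rather than standardising by hand, avoids sign slips in expressions such as 2(c − 1.5).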

4. A manufacturer supplies fuses, approximately 90% of which function properly. A new
process is initiated whose purpose is to increase the proportion of properly functioning
fuses. We obtain a random sample of 100 such fuses manufactured by the new process
and found out that 8 of them are not functioning properly. Let p denotes the proportion
of properly functioning fuses. (Use normal approximation to binomial)

(a) Define null hypothesis and alternative hypothesis.


i. H0 : p = 0.90, HA : p ≠ 0.90
ii. H0 : p = 0.90, HA : p < 0.90
iii. H0 : p = 0.90, HA : p > 0.90
iv. H0 : X = 0.90, HA : X > 0.90
(b) Choose the correct options from the following:
i. Accept H0 at a significance level of 0.05.
ii. Reject H0 at a significance level of 0.10.
iii. Accept H0 at a significance level of 0.10.
iv. Reject H0 at a significance level of 0.05.

Solution:
X1, . . . , X100 ∼ iid Bernoulli(p).
The null and alternative hypotheses are:

H0 : p = 0.9
HA : p > 0.9

Sample mean (X̄) = $\dfrac{92}{100} = 0.92$
Test for α = 0.05
Test statistic, T = X1 + . . . + X100 ∼ Binomial(n, p), which can be normally approximated as
$$T \approx \text{Normal}(100p, 100p(1 - p))$$
$$\bar{X} \approx \text{Normal}\left( p, \frac{p(1 - p)}{100} \right)$$
Test: Reject H0 if X̄ > c
$$\alpha = P(\bar{X} > c \mid p = 0.9)$$
$$\alpha = P\left( \frac{\bar{X} - p}{\sqrt{p(1 - p)/n}} > \frac{c - p}{\sqrt{p(1 - p)/n}} \right)$$
$$\alpha = P\left( Z > \frac{c - 0.9}{\sqrt{0.9 \times 0.1/100}} \right)$$
$$\Rightarrow 0.05 = 1 - F_Z\left( \frac{c - 0.9}{0.03} \right)$$
$$\Rightarrow c = 0.9 + 0.03 \times F_Z^{-1}(0.95)$$
c = 0.9493
Since X̄ = 0.92 < c, the z-test at significance level α = 0.05 will accept H0.
Test for α = 0.10
Test statistic, T = X1 + . . . + X100 ∼ Binomial(n, p), which can be normally approximated as
$$T \approx \text{Normal}(100p, 100p(1 - p))$$
$$\bar{X} \approx \text{Normal}\left( p, \frac{p(1 - p)}{100} \right)$$
Test: Reject H0 if X̄ > c
$$\alpha = P(\bar{X} > c \mid p = 0.9)$$
$$\alpha = P\left( \frac{\bar{X} - p}{\sqrt{p(1 - p)/n}} > \frac{c - p}{\sqrt{p(1 - p)/n}} \right)$$
$$\alpha = P\left( Z > \frac{c - 0.9}{\sqrt{0.9 \times 0.1/100}} \right)$$
$$\Rightarrow 0.10 = 1 - F_Z\left( \frac{c - 0.9}{0.03} \right)$$
$$\Rightarrow c = 0.9 + 0.03 \times F_Z^{-1}(0.90)$$
⇒ c = 0.9384
Since X̄ = 0.92 < c, the z-test at significance level α = 0.10 will accept H0.
Hence, options (i) and (iii) are correct.

5. A commonly prescribed drug for relieving nervous tension is believed to be only 25%
effective. To determine if a new drug is superior in providing relief, suppose that 100
people who were suffering with nervous tension are chosen at random and inoculated.
If more than 36 of them are found to be relieved, we reject the null hypothesis that
p = 1/4 and the new drug will be considered superior to the one presently in use. (Use
normal approximation to binomial)

(a) Find the critical value c.


Answer: 36

Solution: Since, we will reject the null hypothesis if more than 36 out of 100
patients is found to be relieved, 36 is the critical value.
(b) Find P (Type I error). Enter the answer correct to four decimal places.
[0.0038, 0.0060]
Solution:

The null and alternative hypotheses are:

H0 : p = 0.25
HA : p > 0.25

Given, critical value (c) = 36
Test statistic, T = X1 + . . . + X100 ∼ Binomial(100, p), which can be normally
approximated as
$$T \approx \text{Normal}(100p, 100p(1 - p))$$
Test: Reject H0 if T > c, that is, T > 36
$$\alpha = P(T > c \mid p = 0.25) = P(T > 36 \mid p = 0.25)$$
$$\alpha = P\left( Z > \frac{36 - 100p}{\sqrt{100p(1 - p)}} \right) = P\left( Z > \frac{36 - 100(0.25)}{\sqrt{100 \times 0.25(0.75)}} \right)$$
$$\alpha = P\left( Z > \frac{11}{\sqrt{18.75}} \right) = 1 - F_Z\left( \frac{11}{\sqrt{18.75}} \right)$$
$$\alpha = 1 - F_Z(2.54)$$
P(Type I error) = α = 0.0055
(c) Find P (Type II error) for p = 1/2. Enter the answer correct to four decimal
places.
[0.0024, 0.0035]

Solution:
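$$P(\text{Type II error}) = \beta = P(T \le c \mid p = 0.5) = P(T \le 36 \mid p = 0.5)$$
$$\beta = P\left( Z \le \frac{36 - 100p}{\sqrt{100p(1 - p)}} \right) = P\left( Z \le \frac{36 - 100(0.5)}{\sqrt{100 \times 0.5(0.5)}} \right)$$
$$\beta = P\left( Z \le \frac{-14}{\sqrt{25}} \right) = F_Z\left( \frac{-14}{5} \right)$$
$$\beta = F_Z(-2.8)$$
P(Type II error) = β = 0.00256 ≈ 0.0026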

6. The proportion of adults living in a small town who are college graduates is estimated
to be p = 0.6. To test this hypothesis against the alternative p < 0.6, you decide to
take a sample of adults from the town.

(a) What sample size do you need for a test (against the alternative hypothesis that
p = 0.4) at a significance level of 0.10 and power of 0.90?
40
Solution:
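Null hypothesis, H0 : p = 0.6
Alternate hypothesis, HA : p < 0.6
Given, α = 0.10 and power 1 − β = 0.90
Test statistic, T ∼ Binomial(n, p), which can be normally approximated as
$$T \approx \text{Normal}(np, np(1 - p))$$
$$\bar{X} \approx \text{Normal}\left( p, \frac{p(1 - p)}{n} \right)$$
Test: Reject H0 if X̄ < c
The significance level is computed under H0, that is, with p = 0.6:
$$\alpha = P(\bar{X} < c \mid p = 0.6)$$
$$\alpha = P\left( \frac{\bar{X} - p}{\sqrt{p(1 - p)/n}} < \frac{c - p}{\sqrt{p(1 - p)/n}} \right)$$
$$\alpha = P\left( Z < \frac{c - 0.6}{\sqrt{0.6 \times 0.4/n}} \right)$$
$$\Rightarrow 0.10 = F_Z\left( \frac{c - 0.6}{\sqrt{0.24/n}} \right)$$
$$\Rightarrow c = 0.6 + \sqrt{\frac{0.24}{n}}\, F_Z^{-1}(0.10)$$
$$c = 0.6 - \frac{0.6278}{\sqrt{n}} \quad (3)$$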
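Now, the power is computed under the alternative p = 0.4:
$$1 - \beta = P(\bar{X} < c \mid p = 0.4)$$
$$1 - \beta = P\left( \frac{\bar{X} - p}{\sqrt{p(1 - p)/n}} < \frac{c - p}{\sqrt{p(1 - p)/n}} \right)$$
$$1 - \beta = P\left( Z < \frac{c - 0.4}{\sqrt{0.4 \times 0.6/n}} \right)$$
$$\Rightarrow 0.90 = F_Z\left( \frac{c - 0.4}{\sqrt{0.24/n}} \right)$$
$$\Rightarrow c = 0.4 + \sqrt{\frac{0.24}{n}}\, F_Z^{-1}(0.90)$$
$$c = 0.4 + \frac{0.6278}{\sqrt{n}} \quad (4)$$
Solving equations (3) and (4),

n ≈ 39.41
n = 40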
(b) Find the critical value at a significance level of 0.10. Enter the answer correct to
two decimal places.
[0.48, 0.52]
Solution:
Substitute n = 40 in equation (3),

c = 0.5
7. A random sample of 36 packets of marshmallow weighs, on average, 145 grams with
a standard deviation of 5 grams. Test the hypothesis that µ = 150 grams against the
alternative hypothesis, µ < 150 grams, at the 0.05 level of significance.
(a) On average, it weighs less than 150 grams.
(b) On average, it weighs 150 grams.
Solution:
The null and the alternative hypotheses are:
H0 : µ = 150
HA : µ < 150
Test: Reject H0 if X̄ < c.
Given α = 0.05, we have
$$\alpha = P(\bar{X} < c \mid \mu = 150)$$
$$\implies 0.05 = P\left( \frac{\bar{X} - 150}{5/\sqrt{36}} < \frac{c - 150}{5/\sqrt{36}} \right)$$
$$\implies 0.05 = F_Z\left( \frac{c - 150}{5/\sqrt{36}} \right)$$
$$\implies -1.64 = \frac{c - 150}{5/\sqrt{36}} \implies c = 148.63$$

Since X̄ = 145 < c, reject H0.

8. A survey of 225 randomly selected students from a city revealed that 89.4% of them
have participated in extra curricular activities in their schools. Can we conclude at
1% level of significance that 90% of the students have participated in extra curricular
activities?

(a) Yes
(b) No

Solution:
Null hypothesis, H0 : p = 0.9
Alternate hypothesis, HA : p ≠ 0.9
Given, α = 0.1 and n = 225
$$\alpha = P(|\bar{X} - p| > c \mid p = 0.9)$$
$$\alpha = P\left( \left| \frac{\bar{X} - 0.9}{\sqrt{0.9 \times 0.1/225}} \right| > \frac{c}{\sqrt{0.9 \times 0.1/225}} \right)$$
$$\alpha = P\left( |Z| > \frac{c}{\sqrt{0.9 \times 0.1/225}} \right)$$
$$0.1 = 2 F_Z\left( \frac{-15c}{\sqrt{0.9 \times 0.1}} \right)$$
$$c = -\frac{\sqrt{0.9 \times 0.1}}{15} \times F_Z^{-1}(0.05) = 0.03289$$
Since |X̄ − p| = |0.894 − 0.9| = 0.006 < c, the z-test at significance level α = 0.10 will accept H0.

9. A box of a certain brand of washing powder advertises that it weighs 2.5 kg, but the
actual weight is 2.4 kg with a standard deviation of 0.1 kg. The company wants to
test if the mean has changed. They take a random sample of 100 boxes and find that
the average weight is 2.35 kg.

(a) Define null hypothesis and alternative hypothesis.


i. H0 : µ = 2.4, HA : µ ≠ 2.4
ii. H0 : µ = 2.4, HA : µ > 2.4
iii. H0 : µ = 2.4, HA : µ < 2.4
iv. H0 : µ = 2.5, HA : µ ≠ 2.5
(b) What conclusion should be made using a significance level of α = 0.05?
i. Accept H0 .
ii. Reject H0 and accept HA .
Solution:
The company wants to check if the mean has changed. So, the null and alternative
hypotheses are given by
H0 : µ = 2.4, HA : µ ≠ 2.4
Define a test statistic T as T = X̄.

Test: reject the null hypothesis if |X̄ − 2.4| > c.

By CLT, we can say that
$$\frac{\bar{X} - 2.4}{0.1/\sqrt{100}} = \frac{\bar{X} - 2.4}{1/100} \sim \text{Normal}(0, 1).$$
Now,
$$\begin{aligned}
\alpha &= P(|\bar{X} - 2.4| > c) \\
\Rightarrow 0.05 &= P\left(\left|\frac{\bar{X} - 2.4}{1/100}\right| > \frac{c}{1/100}\right) \\
\Rightarrow 0.05 &= P(|Z| > 100c) \\
\Rightarrow 0.05/2 &= P(Z < -100c) \\
\Rightarrow -1.96 &= -100c \;\Rightarrow\; c = 0.0196
\end{aligned}$$

Since |X̄ − 2.4| = |2.35 − 2.4| = 0.05 > c, reject H0.

10. It is claimed that the lifetimes of light bulbs are normally distributed with a mean of
800 hours and a standard deviation of 40 hours. We wish to test the hypothesis that
µ = 800 hours against the alternative that µ ≠ 800 hours with a sample size of 30.

(a) If the acceptance region is defined as 780 ≤ X̄ ≤ 820, find the significance level.
Enter the answer correct to three decimal places.
0.006, [0.005, 0.008]
Solution:
Let the random variable X denote the lifetime of electric bulbs.
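Given, X ∼ Normal(µ, 40²).
X1 , . . . , X30 ∼ iid X.
For 30 iid samples of X, X̄ ∼ Normal(µ, 40²/30)
Null hypothesis, H0 : µ = 800
Alternate hypothesis, HA : µ ≠ 800
$$\begin{aligned}
\alpha &= P(\text{Reject } H_0 \mid H_0 \text{ is true}) \\
&= P(\bar{X} > 820 \text{ or } \bar{X} < 780 \mid \mu = 800) \\
&= P(|\bar{X} - 800| > 20) \\
&= P\left(\left|\frac{\bar{X} - 800}{\sqrt{1600/30}}\right| > \frac{20}{\sqrt{1600/30}}\right) \\
&= P\left(|Z| > \frac{20}{\sqrt{1600/30}}\right) \\
&= 2F_Z\left(\frac{-20}{\sqrt{1600/30}}\right) \\
&= 0.006
\end{aligned}$$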
(b) Find the power of the test against the alternative that the true mean life is 788
hours. Enter the answer correct to two decimal places.
[0.91, 0.94]
Solution:

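Power = 1 − β = P (Reject H0 | HA is true)
$$\begin{aligned}
1 - \beta &= P(\bar{X} > 820 \text{ or } \bar{X} < 780 \mid \mu = 788) \\
&= P(\bar{X} > 820 \mid \mu = 788) + P(\bar{X} < 780 \mid \mu = 788) \\
&= P\left(\frac{\bar{X} - 788}{\sqrt{1600/30}} > \frac{820 - 788}{\sqrt{1600/30}}\right) + P\left(\frac{\bar{X} - 788}{\sqrt{1600/30}} < \frac{780 - 788}{\sqrt{1600/30}}\right) \\
&= P\left(Z > \frac{32}{\sqrt{1600/30}}\right) + P\left(Z < \frac{-8}{\sqrt{1600/30}}\right) \\
&= 1 - F_Z\left(\frac{32}{\sqrt{1600/30}}\right) + F_Z\left(\frac{-8}{\sqrt{1600/30}}\right) \\
&= 0.1366
\end{aligned}$$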
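Both the significance level and the power are the probability that X̄ falls outside [780, 820], evaluated at different true means. A minimal sketch with the Python standard library follows; the function name `rejection_probability` is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist


def rejection_probability(mu, sigma, n, low, high):
    """P(X-bar < low or X-bar > high) when X-bar ~ Normal(mu, sigma^2 / n)."""
    xbar = NormalDist(mu, sigma / sqrt(n))
    return xbar.cdf(low) + (1 - xbar.cdf(high))


# Same rejection region, evaluated under H0 (mu = 800) and under mu = 788.
alpha_bulb = rejection_probability(mu=800, sigma=40, n=30, low=780, high=820)
power_bulb = rejection_probability(mu=788, sigma=40, n=30, low=780, high=820)
print(round(alpha_bulb, 4), round(power_bulb, 4))
```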

Statistics for Data Science - 2
Week 10 practice Assignment
Hypothesis testing

1. Consider nine samples from Normal(µ, 2²). Suppose we wish to test H0 : µ = 100 against
HA : µ ≠ 100.

(i) If the acceptance region is defined as 98.5 ≤ X ≤ 101.5, find the significance level.
Write your answer correct to two decimal places.
(Use P (−2.25 < Z < 2.25) = 0.975)

Solution:
Given that

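H0 : µ = 100, HA : µ ≠ 100
The acceptance region is defined as 98.5 ≤ X̄ ≤ 101.5.
Now,
$$\begin{aligned}
\alpha &= P(\text{reject } H_0 \mid H_0 \text{ is true}) \\
&= P(\bar{X} > 101.5 \text{ or } \bar{X} < 98.5 \mid \mu = 100) \\
&= P(|\bar{X} - 100| > 1.5) \\
&= P\left(\left|\frac{\bar{X} - 100}{2/3}\right| > \frac{1.5}{2/3}\right) \\
&= P(|Z| > 2.25) \\
&= 1 - P(-2.25 < Z < 2.25) \\
&= 1 - 0.975 = 0.025 \approx 0.02
\end{aligned}$$
(The unrounded value is about 0.0244, which is why the answer to two decimal places is 0.02.)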

(ii) Find the power of the test against an alternative that the mean is 103. Write your
answer correct to two decimal places.
(Use P (−6.75 < Z < −2.25) = 0.012)

Solution:
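$$\begin{aligned}
1 - \beta &= P(\text{reject } H_0 \mid H_A \text{ is true}) \\
&= P(\bar{X} > 101.5 \text{ or } \bar{X} < 98.5 \mid \mu = 103) \\
&= P(\bar{X} < 98.5) + P(\bar{X} > 101.5) \\
&= P(\bar{X} - 103 < -4.5) + P(\bar{X} - 103 > -1.5) \\
&= P\left(\frac{\bar{X} - 103}{2/3} < \frac{-4.5}{2/3}\right) + P\left(\frac{\bar{X} - 103}{2/3} > \frac{-1.5}{2/3}\right) \\
&= P(Z < -6.75) + P(Z > -2.25) \\
&= 1 - P(-6.75 < Z < -2.25) \\
&= 1 - 0.012 = 0.988 \approx 0.99
\end{aligned}$$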

2. Air crew escape systems are powered by a solid propellant. The mean burning rate of
this propellant must be 50 centimeters per second. We know that the standard deviation
of burning rate is σ = 2 centimeters per second. An engineer suspects that the mean
burning rate is greater than 50. The engineer decides to test at a significance level of
0.05 and selects a random sample of n = 25 and obtains a sample average burning rate
of 51.3 centimeters per second.

(i) Define null hypothesis and alternative hypothesis.


(a) H0 : µ = 50, HA : µ ≠ 50
(b) H0 : µ = 50, HA : µ < 50
(c) H0 : µ = 50, HA : µ > 50
(d) H0 : X̄ = 50, HA : X̄ > 50
Solution:
Since, the mean burning rate of the propellant must be 50 centimeters per second
and engineer suspects that the mean burning rate is greater than 50. Therefore,
null and alternative hypothesis will be

H0 : µ = 50, HA : µ > 50

(ii) What is the critical value (c) if the acceptance region is X ≤ c? Write your answer
correct to two decimal places.
(use: FZ (1.64) = 0.95)

Solution:
If the significance level of the test is 0.05, then

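$$\begin{aligned}
& P(\text{reject } H_0 \mid H_0 \text{ is true}) = 0.05 \\
\Rightarrow\; & P(\bar{X} > c \mid \mu = 50) = 0.05 \\
\Rightarrow\; & P(\bar{X} - 50 > c - 50) = 0.05 \\
\Rightarrow\; & P\left(\frac{\bar{X} - 50}{2/5} > \frac{c - 50}{2/5}\right) = 0.05 \\
\Rightarrow\; & P\left(Z > \frac{c - 50}{2/5}\right) = 0.05 \\
\Rightarrow\; & 1 - F_Z\left(\frac{c - 50}{2/5}\right) = 0.05 \\
\Rightarrow\; & F_Z\left(\frac{c - 50}{2/5}\right) = 0.95 \\
\Rightarrow\; & \frac{c - 50}{2/5} = 1.64 \\
\Rightarrow\; & c = 50 + \frac{2}{5}(1.64) \\
\Rightarrow\; & c = 50.65
\end{aligned}$$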

(iii) What conclusions should be drawn from the selected sample?


(a) The mean burning rate of the propellant is 50.
(b) The mean burning rate of the propellant is greater than 50.
(c) The mean burning rate of the propellant is lesser than 50.
(d) No conclusion can be drawn from the given sample.
Solution:
Given that X̄ = 51.3.
We reject H0 if X̄ > 50.65. Since X̄ = 51.3 > 50.65, we reject the null
hypothesis.
It implies that the mean burning rate of the propellant is greater than 50.

3. Suppose a manufacturer of memory chips observes that the probability of chip failure
is p = 0.05. A new procedure is introduced to improve the design of chips and lower
the probability of chip failure. To test this new procedure, 200 chips are produced using
this new procedure and tested. We would accept the new procedure if the total number
of failed chips is less than 5 out of 200. Find the significance level of the test. Use the
normal approximation. Write your answer correct to three decimal places.
(Use P (Z < −1.62) = 0.052)
Solution:
A new procedure is introduced to improve the design of chips and lower the probability
of chip failure. Therefore, null and alternative hypothesis will be

H0 : p = 0.05, HA : p < 0.05

Define a test statistic T as T = number of failed chips out of 200.

Given that: We would accept the new procedure if the total number of failed chips is
less than 5 out of 200.

It implies that we will reject the null hypothesis if T < 5.

Notice that T ∼ Binomial(200, p).


When the null hypothesis is true, E[T ] = 200p = 200(0.05) = 10 and
Var(T ) = 200p(1 − p) = 200(0.05)(0.95) = 9.5

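By CLT, we can say that
$$\frac{T - 10}{\sqrt{9.5}} \sim \text{Normal}(0, 1)$$
Now, the significance level is given by
$$\begin{aligned}
\alpha &= P(\text{reject } H_0 \mid H_0 \text{ is true}) \\
&= P(T < 5) \\
&= P\left(\frac{T - 10}{\sqrt{9.5}} < \frac{5 - 10}{\sqrt{9.5}}\right) \\
&= P(Z < -1.62) \\
&= F_Z(-1.62) \\
&= 0.052
\end{aligned}$$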
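The normal approximation above can be reproduced in a few lines. This sketch uses only the Python standard library; it also evaluates the exact Binomial(200, 0.05) tail for comparison, which is an addition and not part of the original solution (the question asks specifically for the normal approximation without a continuity correction).

```python
from math import comb, sqrt
from statistics import NormalDist

n_chips, p0, cutoff = 200, 0.05, 5           # reject H0 when T < 5
mean_t = n_chips * p0                        # E[T] under H0 = 10
sd_t = sqrt(n_chips * p0 * (1 - p0))         # sqrt(9.5)

# Normal approximation used in the solution.
alpha_normal = NormalDist().cdf((cutoff - mean_t) / sd_t)

# Exact binomial tail P(T <= 4), shown only for comparison.
alpha_exact = sum(
    comb(n_chips, k) * p0**k * (1 - p0) ** (n_chips - k) for k in range(cutoff)
)
print(round(alpha_normal, 3), alpha_exact)
```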

4. The mean lifetime of a sample of 100 light bulbs produced by a company is computed
to be 1570 hours with a standard deviation of 120 hours. Let µ be the mean lifetime of all
the bulbs produced by the company.

(i) Test the hypothesis µ = 1600 against the alternative hypothesis µ ≠ 1600 at a level
of significance of 0.05.
(a) Reject the null hypothesis
(b) Accept the null hypothesis
Solution:
Given that
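H0 : µ = 1600, HA : µ ≠ 1600
Define a test statistic T as T = X̄.
Test: reject H0 if |X̄ − 1600| > c. Notice that when the null hypothesis is true, we have
$$\frac{\bar{X} - 1600}{120/10} \sim \text{Normal}(0, 1)$$
Now,
$$\begin{aligned}
& \alpha = P(\text{reject } H_0 \mid H_0 \text{ is true}) \\
\Rightarrow\; & P(|\bar{X} - 1600| > c) = 0.05 \\
\Rightarrow\; & P\left(\left|\frac{\bar{X} - 1600}{120/10}\right| > \frac{c}{120/10}\right) = 0.05 \\
\Rightarrow\; & P\left(|Z| > \frac{c}{12}\right) = 0.05 \\
\Rightarrow\; & 2P\left(Z < \frac{-c}{12}\right) = 0.05 \\
\Rightarrow\; & F_Z\left(\frac{-c}{12}\right) = 0.025 \\
\Rightarrow\; & \frac{-c}{12} = -1.96 \\
\Rightarrow\; & c = 12(1.96) = 23.52
\end{aligned}$$

It implies that we will reject the null hypothesis if |X̄ − 1600| > 23.52.

Given that X̄ = 1570,

⇒ |X̄ − 1600| = |1570 − 1600| = 30 > 23.52
Therefore, we will reject the null hypothesis.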

(ii) Find the P -value. Write your answer correct to three decimal places.
Solution:
P -value is the minimum significance level at which null hypothesis is rejected for
the observed test statistic value.
Therefore, P -value is given by

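$$\begin{aligned}
\alpha &= P(|\bar{X} - 1600| > |1570 - 1600|) \\
&= P(|\bar{X} - 1600| > 30) \\
&= P\left(\left|\frac{\bar{X} - 1600}{120/10}\right| > \frac{30}{120/10}\right) \\
&= P(|Z| > 2.5) \\
&= 2P(Z < -2.5) \\
&= 2(0.0062) = 0.012
\end{aligned}$$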
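The P-value calculation generalises to any two-sided z-test on a mean. A minimal standard-library sketch, with the helper name `two_sided_p_value` chosen here for illustration:

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(xbar, mu0, s, n):
    """P(|Z| > |z_obs|) where z_obs = (xbar - mu0) / (s / sqrt(n))."""
    z_obs = (xbar - mu0) / (s / sqrt(n))
    return 2 * NormalDist().cdf(-abs(z_obs))


p_bulbs = two_sided_p_value(xbar=1570, mu0=1600, s=120, n=100)
print(round(p_bulbs, 3))
```

Because the P-value is below 0.05, this agrees with the rejection decision reached in part (i).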

5. The average IQ of the students of a school is reported to be 107 with a standard deviation
of 4. You suspect that the average may be higher, possibly 110, and decide to sample
students to find their IQs. What sample size do you need for a test at the significance
level 0.05 and power 0.95?
(Use: FZ (1.64) = 0.95 and FZ (−1.64) = 0.05)

Solution:
According to the question, we have

H0 : µ = 107, HA : µ > 107

Define a test statistic T as T = X.


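Test: reject H0 if X̄ > c.
Notice that when the null hypothesis is true, we have
$$\frac{\bar{X} - 107}{4/\sqrt{n}} \sim \text{Normal}(0, 1)$$

Now, the significance level of the test is given to be 0.05. It implies that
$$\begin{aligned}
& P(\text{reject } H_0 \mid H_0 \text{ is true}) = 0.05 \\
\Rightarrow\; & P(\bar{X} > c) = 0.05 \\
\Rightarrow\; & P\left(\frac{\bar{X} - 107}{4/\sqrt{n}} > \frac{c - 107}{4/\sqrt{n}}\right) = 0.05 \\
\Rightarrow\; & P\left(Z > \frac{c - 107}{4/\sqrt{n}}\right) = 0.05 \\
\Rightarrow\; & 1 - P\left(Z \le \frac{c - 107}{4/\sqrt{n}}\right) = 0.05 \\
\Rightarrow\; & P\left(Z \le \frac{c - 107}{4/\sqrt{n}}\right) = 0.95 \\
\Rightarrow\; & \frac{c - 107}{4/\sqrt{n}} = 1.64 \\
\Rightarrow\; & c = 107 + (1.64)\frac{4}{\sqrt{n}} \qquad (1)
\end{aligned}$$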

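Again, when the alternative hypothesis is true (µ = 110), we have
$$\frac{\bar{X} - 110}{4/\sqrt{n}} \sim \text{Normal}(0, 1)$$

Now, the power of the test is given to be 0.95. It implies that
$$\begin{aligned}
& 1 - \beta = P(\text{reject } H_0 \mid H_A \text{ is true}) = 0.95 \\
\Rightarrow\; & P(\bar{X} > c) = 0.95 \\
\Rightarrow\; & P\left(\frac{\bar{X} - 110}{4/\sqrt{n}} > \frac{c - 110}{4/\sqrt{n}}\right) = 0.95 \\
\Rightarrow\; & P\left(Z > \frac{c - 110}{4/\sqrt{n}}\right) = 0.95 \\
\Rightarrow\; & 1 - P\left(Z \le \frac{c - 110}{4/\sqrt{n}}\right) = 0.95 \\
\Rightarrow\; & P\left(Z \le \frac{c - 110}{4/\sqrt{n}}\right) = 0.05 \\
\Rightarrow\; & \frac{c - 110}{4/\sqrt{n}} = -1.64 \\
\Rightarrow\; & c = 110 - (1.64)\frac{4}{\sqrt{n}} \qquad (2)
\end{aligned}$$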

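From equations (1) and (2), we have
$$\begin{aligned}
& 107 + (1.64)\frac{4}{\sqrt{n}} = 110 - (1.64)\frac{4}{\sqrt{n}} \\
\Rightarrow\; & 2(1.64)\frac{4}{\sqrt{n}} = 3 \\
\Rightarrow\; & \sqrt{n} = \frac{2 \times 1.64 \times 4}{3} = 4.37 \\
\Rightarrow\; & n = 19.13 \\
\Rightarrow\; & n = 20 \quad \text{(rounding up to the next integer)}
\end{aligned}$$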
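Equating (1) and (2) gives a closed form for the sample size, which can be sketched as follows with the Python standard library. The rounded quantile 1.64 from the hint is used so the numbers match the solution; the function name `sample_size_for_mean` is illustrative.

```python
from math import ceil


def sample_size_for_mean(mu0, mu1, sigma, z_alpha, z_beta):
    """Solve mu0 + z_alpha*sigma/sqrt(n) = mu1 - z_beta*sigma/sqrt(n) for n (mu1 > mu0)."""
    root_n = (z_alpha + z_beta) * sigma / (mu1 - mu0)
    return root_n ** 2


n_iq_exact = sample_size_for_mean(mu0=107, mu1=110, sigma=4, z_alpha=1.64, z_beta=1.64)
n_iq = ceil(n_iq_exact)   # round up so both error requirements hold
print(round(n_iq_exact, 2), n_iq)
```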

6. An instructor gives a quiz involving 10 true-false questions. To test the hypothesis that
the student is guessing, the following decision rule is decided: (i) If 7 or more are correct,
the student is not guessing; (ii) if fewer than 7 are correct, the student is guessing. Find
the significance level of the test. Write your answer correct to two decimal places.
(Hint: If student is guessing then, probability of getting a question correct is p = 0.5)
Solution:
If a student is guessing, each answer is equally likely to be right or wrong,
that is, p = 0.5. If the student is not guessing, the probability of getting a
question correct is more than 0.5, that is, p > 0.5.
It implies that
H0 : p = 0.5, HA : p > 0.5
Define a test statistic T as T = number of correct answers out of ten.

As per the given information, we will reject the null hypothesis if T ≥ 7

Notice that if null hypothesis is true then, T ∼ Binomial(10, 0.5).

Now,

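$$\begin{aligned}
\alpha &= P(\text{reject } H_0 \mid H_0 \text{ is true}) \\
&= P(T \ge 7) \\
&= \sum_{i=7}^{10} {}^{10}C_i\,(0.5)^{10} \\
&= \left({}^{10}C_7 + {}^{10}C_8 + {}^{10}C_9 + {}^{10}C_{10}\right)(0.5)^{10} \\
&= (120 + 45 + 10 + 1)(0.00097) \\
&= 0.17
\end{aligned}$$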
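Since the test statistic here is exactly binomial, the tail sum can be computed directly. A short sketch with the Python standard library (variable names are illustrative):

```python
from math import comb

n_questions, threshold = 10, 7

# P(T >= 7) for T ~ Binomial(10, 0.5): each outcome has probability 0.5**10.
alpha_quiz = sum(comb(n_questions, i) for i in range(threshold, n_questions + 1)) / 2**n_questions
print(round(alpha_quiz, 2))
```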

7. A cricket ball production line must produce balls weighing 163 g with a standard
deviation of 4 g in order to get top rating. To test the hypothesis that the mean weight of the
balls is 163 g, a sample of 16 balls is considered. If we want a 0.01 level of significance,
what will be the acceptance region?
(Use FZ (2.57) = 0.995)

(a) [162.43, 164.57]


(b) [158.13, 166.57]
(c) [160.43, 165.57]
(d) [162.13, 164.98]

Solution:
Since a cricket ball production line must produce balls weighing 163 g, the null and
alternative hypotheses are given by

H0 : µ = 163, HA : µ ≠ 163

Define test statistic T as T = X̄.

Test: reject the null hypothesis if |X̄ − 163| > c.

Notice that when the null hypothesis is true, the standard deviation of X̄ is 4/√16 = 1, so
$$\frac{\bar{X} - 163}{4/4} = \bar{X} - 163 \sim \text{Normal}(0, 1)$$

Now, the significance level of the test is given to be 0.01. It implies that

P (reject H0 | H0 is true) = 0.01
⇒ P (|X̄ − 163| > c) = 0.01
⇒ P (|Z| > c) = 0.01
⇒ 2P (Z < −c) = 0.01
⇒ FZ (−c) = 0.005
⇒ −c = −2.57
⇒ c = 2.57

Therefore, acceptance region will be [163 − 2.57, 163 + 2.57] = [160.43, 165.57].

8. A researcher has recently come into contact with a number of left-handed artists and
wonders whether artists are more likely to be left-handed than people in the general
population. She selects a random sample of 150 artists and asks each
whether they are left-handed or not. The sample proportion (who are left-handed) is
0.15. Suppose that 10% of people are left-handed in the general population.

(i) Does the data provide strong evidence that artists are more likely than the general
public to be left-handed if she decides a significance level of 0.05?
(a) Yes
(b) No
Solution:
In the general population, 10% of people are left-handed, and the researcher wonders
whether artists are more likely to be left-handed, that is, whether the probability of an
artist being left-handed is more than 0.1. Therefore, the null and alternative hypotheses are
given by
H0 : p = 0.1, HA : p > 0.1
Define a test statistic T as T = X̄ = (X1 + X2 + . . . + X150)/150, where each
Xi ∼ Bernoulli(0.1) (if the null hypothesis is true).
Therefore, E[X̄] = p = 0.1 and Var(X̄) = p(1 − p)/n = (0.1)(0.9)/150 = 0.09/150.
Then, by CLT,
$$\frac{\bar{X} - 0.1}{\sqrt{0.09/150}} \sim \text{Normal}(0, 1).$$

Test: reject H0 if X̄ > c.


Now, the significance level of the test is given to be 0.05. It implies that

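$$\begin{aligned}
& P(\text{reject } H_0 \mid H_0 \text{ is true}) = 0.05 \\
\Rightarrow\; & P(\bar{X} > c) = 0.05 \\
\Rightarrow\; & P\left(\frac{\bar{X} - 0.1}{\sqrt{0.09/150}} > \frac{c - 0.1}{\sqrt{0.09/150}}\right) = 0.05 \\
\Rightarrow\; & P\left(Z > \frac{c - 0.1}{\sqrt{0.09/150}}\right) = 0.05 \\
\Rightarrow\; & 1 - P\left(Z \le \frac{c - 0.1}{\sqrt{0.09/150}}\right) = 0.05 \\
\Rightarrow\; & F_Z\left(\frac{c - 0.1}{\sqrt{0.09/150}}\right) = 0.95 \\
\Rightarrow\; & c = 0.1 + (1.64)\frac{0.3}{\sqrt{150}} \\
\Rightarrow\; & c = 0.14
\end{aligned}$$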

Since X̄ = 0.15 > 0.14, we will reject the null hypothesis. It implies that, at a
significance level of 0.05, the data provide evidence that artists are more likely than
the general public to be left-handed.

(ii) Find the P -value. Write your answer correct to three decimal places.

Solution:
P -value is the minimum significance level at which null hypothesis is rejected for
the observed test statistic value.
Therefore, P -value is given by

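$$\begin{aligned}
\alpha &= P(\bar{X} > 0.15) \\
&= P(\bar{X} - 0.1 > 0.15 - 0.1) \\
&= P\left(\frac{\bar{X} - 0.1}{\sqrt{0.09/150}} > \frac{0.05}{\sqrt{0.09/150}}\right) \\
&= P(Z > 2.04) \\
&= P(Z < -2.04) \\
&= 0.021
\end{aligned}$$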

9. A cereal manufacturer tests its equipment weekly to be assured that the correct weight of
cereal is in each box. The company wants to test if the weight differs from the expected
weight. The weight of each box is expected to be 500g with a standard deviation of
100g. The manufacturer takes a random sample of 100 boxes and finds that the average
weight is 520g. What is the sample’s P -value? Write your answer correct to two decimal
places.
(Use FZ (−2) = 0.022)

Solution:
The company wants to test if the weight differs from the expected weight and the weight
of each box is expected to be 500g. So, null and alternative hypothesis are given by

H0 : µ = 500, HA : µ ≠ 500

Define a test statistic T as T = X̄.

Test: reject the null hypothesis if |X̄ − 500| > c.

By CLT, we can say that
$$\frac{\bar{X} - 500}{100/\sqrt{100}} = \frac{\bar{X} - 500}{10} \sim \text{Normal}(0, 1).$$
P -value is the minimum significance level at which the null hypothesis is rejected for the
observed test statistic value.
Therefore, P -value is given by
$$\begin{aligned}
\alpha &= P(|\bar{X} - 500| > |520 - 500|) \\
&= P(|\bar{X} - 500| > 20) \\
&= P\left(\left|\frac{\bar{X} - 500}{10}\right| > 2\right) \\
&= P(|Z| > 2) \\
&= 2P(Z < -2) \\
&= 2(0.022) = 0.04
\end{aligned}$$

10. A machine produces iron rods of mean weight 12kg with a standard deviation of 2kg. An
engineer suspects that average weight is less than 12kg, probably 10kg. So, he collects
the weights of n iron rods. He wants the significance level to be less than 10⁻⁴ and the
probability of type two error to be less than 10⁻⁸.
(Use FZ (−3.74) = 10⁻⁴ and FZ (5.61) = 1 − 10⁻⁸)

(i) Find the required sample size.

Solution:
According to the question, we have

H0 : µ = 12, HA : µ < 12

Define a test statistic T as T = X.

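Test: reject H0 if X̄ < c.
Notice that when the null hypothesis is true, we have
$$\frac{\bar{X} - 12}{2/\sqrt{n}} \sim \text{Normal}(0, 1)$$

Now, the significance level of the test is required to be less than 10⁻⁴. It implies that
$$\begin{aligned}
& P(\text{reject } H_0 \mid H_0 \text{ is true}) \le 10^{-4} \\
\Rightarrow\; & P(\bar{X} < c) \le 10^{-4} \\
\Rightarrow\; & P\left(\frac{\bar{X} - 12}{2/\sqrt{n}} < \frac{c - 12}{2/\sqrt{n}}\right) \le 10^{-4} \\
\Rightarrow\; & P\left(Z < \frac{c - 12}{2/\sqrt{n}}\right) \le 10^{-4} \\
\Rightarrow\; & \frac{c - 12}{2/\sqrt{n}} \le -3.74 \\
\Rightarrow\; & c \le 12 - (3.74)\frac{2}{\sqrt{n}} \qquad (1)
\end{aligned}$$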

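Again, when the alternative hypothesis is true (µ = 10), we have
$$\frac{\bar{X} - 10}{2/\sqrt{n}} \sim \text{Normal}(0, 1)$$

Now, the probability of type two error is required to be less than 10⁻⁸. It implies that
$$\begin{aligned}
& \beta = P(\text{accept } H_0 \mid H_A \text{ is true}) \le 10^{-8} \\
\Rightarrow\; & P(\bar{X} \ge c) \le 10^{-8} \\
\Rightarrow\; & P\left(\frac{\bar{X} - 10}{2/\sqrt{n}} \ge \frac{c - 10}{2/\sqrt{n}}\right) \le 10^{-8} \\
\Rightarrow\; & P\left(Z \ge \frac{c - 10}{2/\sqrt{n}}\right) \le 10^{-8} \\
\Rightarrow\; & 1 - P\left(Z < \frac{c - 10}{2/\sqrt{n}}\right) \le 10^{-8} \\
\Rightarrow\; & P\left(Z < \frac{c - 10}{2/\sqrt{n}}\right) \ge 1 - 10^{-8} \\
\Rightarrow\; & \frac{c - 10}{2/\sqrt{n}} \ge 5.61 \\
\Rightarrow\; & c \ge 10 + (5.61)\frac{2}{\sqrt{n}} \qquad (2)
\end{aligned}$$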

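From equations (1) and (2), a value of c satisfying both conditions exists only if
$$\begin{aligned}
& 10 + (5.61)\frac{2}{\sqrt{n}} \le 12 - (3.74)\frac{2}{\sqrt{n}} \\
\Rightarrow\; & (5.61 + 3.74)\frac{2}{\sqrt{n}} \le 2 \\
\Rightarrow\; & \sqrt{n} \ge 9.35 \\
\Rightarrow\; & n \ge 87.42 \\
\Rightarrow\; & n = 88
\end{aligned}$$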

(ii) Find the critical value (for the acceptance region to be defined as X ≥ c, where
X is the mean weight of the rods). Write your answer correct to two decimal places.

Solution:
Putting the value of n in equation (1), we have
$$c \le 12 - (3.74)\frac{2}{\sqrt{88}} \;\Rightarrow\; c \le 11.203 \qquad (3)$$

Putting the value of n in equation (2), we have
$$c \ge 10 + (5.61)\frac{2}{\sqrt{88}} \;\Rightarrow\; c \ge 11.196 \qquad (4)$$

From equations (3) and (4), we have c = 11.20 correct to two decimal places.

Statistics for Data Science - 2

Week 11 Graded Assignment Solution

1. The IQs (intelligence quotients) of 25 students from one batch of IITM students showed
a mean of 110 with a standard deviation of 8, while the IQs of 25 students from another
batch of IITM students showed a mean of 115 with a standard deviation of 7. Is there
a significant difference between the IQs of the two groups at a 0.05 level of significance?

a) Yes
b) No

Hint: Use FZ−1 (0.025) = −1.96


Solution:
Let Xi and Yi represent the IQs of the students in the first and second batch, respectively.
X1 , X2 , . . . , X25 ∼ N(µ1 , 8²) and Y1 , Y2 , . . . , Y25 ∼ N(µ2 , 7²)
X̄ = 110 and Ȳ = 115
Consider, H0 : µ1 = µ2 , HA : µ1 ≠ µ2
T = X̄ − Ȳ ∼ N(µ1 − µ2 , 64/25 + 49/25), i.e. N(µ1 − µ2 , 113/25)
Test: Reject H0 if |T | > c.
$$\begin{aligned}
\alpha = P(|T| > c \mid H_0) &= P\left(\left|\frac{T}{\sqrt{113/25}}\right| > \frac{c}{\sqrt{113/25}}\right) \\
&= P\left(|Z| > \frac{c}{\sqrt{113/25}}\right) = 2F_Z\left(\frac{-c}{\sqrt{113/25}}\right) \\
\Rightarrow c &= -\sqrt{\frac{113}{25}}\, F_Z^{-1}(\alpha/2) \\
\Rightarrow c &= -\sqrt{\frac{113}{25}}\, F_Z^{-1}(0.025) \\
\Rightarrow c &= -\sqrt{\frac{113}{25}} \times (-1.96) = 4.167
\end{aligned}$$
Since |X̄ − Ȳ| = |110 − 115| = 5 > 4.167,
we will reject H0 .
This implies that there is a significant difference between the IQs of the two groups at a
0.05 level of significance.
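The two-sample cutoff used above depends only on the two standard deviations, the sample sizes, and α. A minimal standard-library sketch follows; the function name `two_sample_z_cutoff` is illustrative, and the second call uses the figures from the social media question that follows (standard deviations 5 and 6, 64 individuals per sample, observed difference 1.5).

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_cutoff(sd1, n1, sd2, n2, alpha):
    """Cutoff c for 'reject H0: mu1 = mu2 if |X-bar - Y-bar| > c' with known variances."""
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return -se * NormalDist().inv_cdf(alpha / 2)


c_iq = two_sample_z_cutoff(sd1=8, n1=25, sd2=7, n2=25, alpha=0.05)
reject_iq = abs(110 - 115) > c_iq

c_media = two_sample_z_cutoff(sd1=5, n1=64, sd2=6, n2=64, alpha=0.05)
reject_media = 1.5 > c_media
print(round(c_iq, 3), reject_iq, round(c_media, 3), reject_media)
```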

2. A sociologist focusing on popular culture and media believes that the average number
of hours per week (hrs/week) spent on social media is different for men and women.
The researcher knows that the standard deviations of amount of time spent on social
media are 5 hrs/week and 6 hrs/week for men and women, respectively. Examining two
independent random samples of 64 individuals each, if the average number of hrs/week
spent on social media for the sample of men is 1.5 hours greater than that for the sample
of women, what conclusion can be made from a hypothesis test where H0 : µM = µW
and HA : µM ≠ µW ? Take α = 0.05.

a) Reject H0
b) Accept H0

Solution:
Let Xi and Yi represent the number of hrs/week spent on social media by men
and women respectively.
X1 , X2 , . . . , X64 ∼ N(µ1 , 5²) and Y1 , Y2 , . . . , Y64 ∼ N(µ2 , 6²)
|X̄ − Ȳ| = 1.5
Consider, H0 : µ1 = µ2 , HA : µ1 ≠ µ2
T = X̄ − Ȳ ∼ N(µ1 − µ2 , 25/64 + 36/64), i.e. N(µ1 − µ2 , 61/64)
Test: Reject H0 if |T | > c.
$$\begin{aligned}
\alpha = P(|T| > c \mid H_0) &= P\left(\left|\frac{T}{\sqrt{61/64}}\right| > \frac{c}{\sqrt{61/64}}\right) \\
&= P\left(|Z| > \frac{c}{\sqrt{61/64}}\right) = 2F_Z\left(\frac{-c}{\sqrt{61/64}}\right) \\
\Rightarrow c &= -\sqrt{\frac{61}{64}}\, F_Z^{-1}(\alpha/2) \\
\Rightarrow c &= -\sqrt{\frac{61}{64}}\, F_Z^{-1}(0.025) \\
\Rightarrow c &= -\sqrt{\frac{61}{64}} \times (-1.96) = 1.913
\end{aligned}$$
Since |X̄ − Ȳ| = 1.5 < 1.913,
we will accept H0 .

3. An IITM instructor conducts two live sessions for two different classes, call it A and
B, in Statistics. Session A had 25 students attending while session B had 36 students.
The instructor conducted a test for the two sessions. Although there was no significant
difference in mean grades, session A had a standard deviation of 10 while session B had
a standard deviation of 14. Can we conclude at the 0.01 level of significance that the
variability in marks of class B is greater than that of A?

a) Yes
b) No

Hint: Use $F^{-1}_{F(35,24)}(0.99) = 2.529$
Solution:
H0 : σ1 = σ2 , HA : σ1 < σ2 (σ1 for class A, σ2 for class B)

Test: Reject H0 if $S_B^2 / S_A^2 > 1 + c_R$
We know that, under H0, $S_B^2 / S_A^2 \sim F(n_2 - 1, n_1 - 1)$
n1 = 25, n2 = 36
⇒ $S_B^2 / S_A^2 \sim F(35, 24)$
Therefore,
$$\begin{aligned}
\alpha &= 1 - F_{F(35,24)}(1 + c_R) \\
\Rightarrow 1 + c_R &= F^{-1}_{F(35,24)}(1 - \alpha) = F^{-1}_{F(35,24)}(0.99) \\
\Rightarrow 1 + c_R &= 2.529
\end{aligned}$$

Since $S_B^2 / S_A^2 = 14^2 / 10^2 = 1.96 < 2.529$,
we will accept H0 .
This implies that at the 0.01 level of significance the variability in marks of class B is
not greater than that of A.

4. The manufacturer of a new car claims that a typical car gets a mileage of 40 kilometres
per litre. We think that the mileage is less. To test our suspicion, we perform the
hypothesis test with H0 : µ = 40 and HA : µ < 40. Suppose we take a random sample of
900 new cars and find that their average mileage is 39.8 kilometres per litre and sample
standard deviation is 2, what does a t-test say about a null hypothesis with a significance
level of 0.05?

a) Reject H0
b) Accept H0

Hint: Use $F^{-1}_{t_{899}}(0.05) = -1.646$
Solution:
Null hypothesis, H0 : µ = 40
Alternate hypothesis, HA : µ < 40
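Test: Reject H0 if X̄ < c
Given, α = 0.05 and X̄ = 39.8
In this problem, we do not know the population variance, σ².
The sample variance S² = 2²
$$\begin{aligned}
\alpha &= P(\bar{X} < c \mid \mu = 40) \\
&= P\left(\frac{\bar{X} - 40}{\sqrt{S^2/n}} < \frac{c - 40}{\sqrt{S^2/n}}\right) \\
&= P\left(\frac{\bar{X} - 40}{\sqrt{4/900}} < \frac{c - 40}{\sqrt{4/900}}\right) \\
&= F_{t_{899}}\left(\frac{c - 40}{\sqrt{4/900}}\right) \\
\Rightarrow 0.05 &= F_{t_{899}}\left(\frac{c - 40}{\sqrt{4/900}}\right) \\
\Rightarrow c &= 40 + \sqrt{\frac{4}{900}}\, F^{-1}_{t_{899}}(0.05) \\
c &= 39.89
\end{aligned}$$
Since X̄ = 39.8 < c, reject H0 .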

5. The standard deviation of weights of 70 gram bags of white cheddar popcorn is expected
to be 2.5 grams. A random sample of 20 packages showed a standard deviation of 3
grams. Is the apparent increase in variability significant at the 0.05 level?

a) Yes
b) No

Hint: Use $F^{-1}_{\chi^2_{19}}(0.95) = 30.14$
Solution:
As per given information, the null and alternative hypothesis are given by

H0 : σ = 2.5, HA : σ > 2.5

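Define a test statistic T as T = S².

We know that, under H0,
$$\frac{(n-1)S^2}{\sigma^2} = \frac{19S^2}{2.5^2} \sim \chi^2_{19}.$$
Test: reject the null hypothesis if S² > c².
If the significance level of the test is 0.05, then
$$\begin{aligned}
& P(S^2 > c^2) = 0.05 \\
\Rightarrow\; & P\left(\frac{19S^2}{2.5^2} > \frac{19c^2}{2.5^2}\right) = 0.05 \\
\Rightarrow\; & P\left(\chi^2_{19} > \frac{19c^2}{2.5^2}\right) = 0.05 \\
\Rightarrow\; & 1 - P\left(\chi^2_{19} < \frac{19c^2}{2.5^2}\right) = 0.05 \\
\Rightarrow\; & \frac{19c^2}{2.5^2} = 30.14 \\
\Rightarrow\; & c^2 = \frac{6.25 \times 30.14}{19} = 9.91
\end{aligned}$$
Since S² = 3² = 9 < 9.91, we will not reject the null hypothesis.
Therefore, the apparent increase in variability is not significant at the 0.05 level.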

6. Independent random samples of ceramic produced by two different processes were tested
for hardness. The results are:

Process 1 Process 2
8.5 9.0
9.5 9.5
8.0 10.5
9.0 9.5
10.0 10.0
9.5 9.0
10.5 9.0
10.0 9.5

Table 11.1.G

Can we conclude at 5% level of significance that the variances in hardness are equal?

a) Yes
b) No

Hint: Use $F^{-1}_{F(7,7)}(0.025) = 0.2$
Solution:
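Let the Process 1 and Process 2 values be denoted by Xi and Yi respectively.
H0 : σ1 = σ2 , HA : σ1 ≠ σ2
Test: Reject H0 if $S_X^2 / S_Y^2 > 1 + c_R$ or $S_X^2 / S_Y^2 < 1 - c_L$
We know that, under H0, $S_X^2 / S_Y^2 \sim F(n_1 - 1, n_2 - 1)$
n1 = 8, n2 = 8
⇒ $S_X^2 / S_Y^2 \sim F(7, 7)$
Therefore,
$$\begin{aligned}
\alpha/2 &= F_{F(7,7)}(1 - c_L) \\
\Rightarrow 1 - c_L &= F^{-1}_{F(7,7)}(\alpha/2) = F^{-1}_{F(7,7)}(0.025) \\
\Rightarrow 1 - c_L &= 0.2
\end{aligned}$$
From the data, $S_X^2 = 0.6964$ and $S_Y^2 = 0.2857$, so
$S_X^2 / S_Y^2 = 0.6964 / 0.2857 = 2.437 > 0.2$.
Similarly, we can check the other condition.
$$\begin{aligned}
\alpha/2 &= 1 - F_{F(7,7)}(1 + c_R) \\
\Rightarrow 1 + c_R &= F^{-1}_{F(7,7)}(1 - \alpha/2) = F^{-1}_{F(7,7)}(0.975) \\
\Rightarrow 1 + c_R &= 4.99
\end{aligned}$$
Since $S_X^2 / S_Y^2 = 2.437 < 4.99$, the ratio lies inside the acceptance region.
Therefore, we will accept H0 .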
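The sample variances behind the ratio 2.437 can be recomputed from Table 11.1.G. The sketch below uses only the Python standard library, which has no F distribution, so the two cutoffs 0.2 and 4.99 are taken from the solution text rather than computed.

```python
from statistics import variance

process_1 = [8.5, 9.5, 8.0, 9.0, 10.0, 9.5, 10.5, 10.0]
process_2 = [9.0, 9.5, 10.5, 9.5, 10.0, 9.0, 9.0, 9.5]

s2_x = variance(process_1)     # sample variance with divisor n - 1
s2_y = variance(process_2)
ratio = s2_x / s2_y

# Cutoffs quoted in the solution for F(7, 7) at alpha = 0.05 (two-sided).
lower_cut, upper_cut = 0.2, 4.99
accept_h0 = lower_cut <= ratio <= upper_cut
print(round(s2_x, 4), round(s2_y, 4), round(ratio, 3), accept_h0)
```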

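7. Let X ∼ Normal(0, σ²). Consider the test H0 : σ = 4 against HA : σ = 5. A sample
X1 , X2 , . . . , X10 is observed. What is the likelihood ratio function for the observed
sample?

a) $\left(\frac{5}{4}\right)^{10} \exp\left(\frac{9}{800}\sum_{i=1}^{10} X_i^2\right)$
b) $\left(\frac{4}{5}\right)^{10} \exp\left(-\frac{9}{800}\sum_{i=1}^{10} X_i^2\right)$
c) $\left(\frac{5}{4}\right)^{10} \exp\left(-\frac{9}{800}\sum_{i=1}^{10} X_i^2\right)$
d) $\left(\frac{4}{5}\right)^{10} \exp\left(\frac{9}{800}\sum_{i=1}^{10} X_i^2\right)$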
Solution:
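Given X ∼ Normal(0, σ²)
The null and alternative hypotheses are

H0 : σ = 4

HA : σ = 5

Samples X1 , . . . , X10 are observed.
Now, the likelihood ratio is
$$\begin{aligned}
L &= \frac{\prod_{i=1}^{10} \text{Normal}(0, 5^2)}{\prod_{i=1}^{10} \text{Normal}(0, 4^2)} \\
&= \frac{\left(\frac{1}{5}\right)^{10} \exp\left(-\frac{\sum_{i=1}^{10} X_i^2}{50}\right)}{\left(\frac{1}{4}\right)^{10} \exp\left(-\frac{\sum_{i=1}^{10} X_i^2}{32}\right)} \\
&= \left(\frac{4}{5}\right)^{10} \exp\left(\sum_{i=1}^{10} X_i^2 \left(-\frac{1}{50} + \frac{1}{32}\right)\right) \\
&= \left(\frac{4}{5}\right)^{10} \exp\left(\frac{9}{800}\sum_{i=1}^{10} X_i^2\right)
\end{aligned}$$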
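The closed form can be sanity-checked against a direct product of normal densities. In this sketch the ten sample values are hypothetical (the question does not give an observed sample), and both function names are illustrative.

```python
from math import exp, isclose, prod
from statistics import NormalDist


def likelihood_ratio_direct(xs, sigma0=4.0, sigma1=5.0):
    """Ratio of joint densities: Normal(0, sigma1^2) over Normal(0, sigma0^2)."""
    numerator = prod(NormalDist(0, sigma1).pdf(x) for x in xs)
    denominator = prod(NormalDist(0, sigma0).pdf(x) for x in xs)
    return numerator / denominator


def likelihood_ratio_closed_form(xs):
    """(4/5)^n * exp((9/800) * sum of squares), as derived above for sigma 4 vs 5."""
    return (4 / 5) ** len(xs) * exp(9 / 800 * sum(x * x for x in xs))


# Hypothetical sample of size 10, used only to compare the two expressions.
sample = [1.0, -2.0, 0.5, 3.0, -1.5, 2.5, 0.0, -0.5, 4.0, -3.0]
lr_direct = likelihood_ratio_direct(sample)
lr_closed = likelihood_ratio_closed_form(sample)
print(isclose(lr_direct, lr_closed, rel_tol=1e-9))
```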

8. A random number generator is expected to produce digits 0, 1, 2, . . . , 9 uniformly at


random. In a sample of 250 digits, the observed frequencies are given below.

Digit Observed frequency Expected frequency


0 17 25
1 31 25
2 29 25
3 18 25
4 14 25
5 20 25
6 35 25
7 30 25
8 20 25
9 36 25

Table 11.2.G

Is the above a good-enough fit at a significance level of 0.05?

a) Yes
b) No

Hint: Use $F^{-1}_{\chi^2_9}(0.95) = 16.9$
Solution:
Value of the test statistic T is given by
$$\begin{aligned}
T &= \frac{(17-25)^2}{25} + \frac{(31-25)^2}{25} + \frac{(29-25)^2}{25} + \frac{(18-25)^2}{25} + \frac{(14-25)^2}{25} \\
&\quad + \frac{(20-25)^2}{25} + \frac{(35-25)^2}{25} + \frac{(30-25)^2}{25} + \frac{(20-25)^2}{25} + \frac{(36-25)^2}{25} \\
&= \frac{582}{25} \\
&= 23.28
\end{aligned}$$

Test: Reject H0 if T > c.

We know that $\alpha = P(T > c \mid H_0) \approx 1 - F_{\chi^2_{k-1}}(c)$
⇒ $c = F^{-1}_{\chi^2_{k-1}}(1 - \alpha)$
Here k = 10, α = 0.05
⇒ $c = F^{-1}_{\chi^2_9}(1 - 0.05) = F^{-1}_{\chi^2_9}(0.95) = 16.9$
Since T > c, we will reject the null hypothesis.
This implies that the above data is not a good-enough fit at a significance level of 0.05.
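The goodness-of-fit statistic is a plain sum over the ten digits and is easy to recompute. A minimal standard-library sketch follows; the critical value 16.9 is taken from the hint, since the standard library has no chi-squared quantile function.

```python
observed = [17, 31, 29, 18, 14, 20, 35, 30, 20, 36]   # digits 0..9 from Table 11.2.G
expected = [sum(observed) / len(observed)] * len(observed)   # uniform: 250 / 10 = 25 each

# Pearson chi-squared statistic: sum of (O - E)^2 / E.
t_gof = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

critical_value = 16.9          # hint: 0.95 quantile of chi-squared with 9 df
reject_uniform = t_gof > critical_value
print(round(t_gof, 2), reject_uniform)
```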

9. On two major e-commerce websites A and B the sales on a particular day is given as a
contingency table [Table 11.3.G].

A B Total
Bought item 1000 1500 2500
Didn’t buy item 1200 2000 3200
Total 2200 3500 5700

Table 11.3.G: E-commerce sales data

Can we say that the sales is independent of websites at a significance level of 0.05?

a) Yes
b) No

Hint: Use $F^{-1}_{\chi^2_1}(0.95) = 3.84$
Solution:
H0 : Joint PMF is product of marginals
HA : Joint PMF is not the product of marginals
Under H0, the expected counts are:
Number of people who bought items via website A = (2500 × 2200)/5700 = 964.91
Number of people who bought items via website B = (2500 × 3500)/5700 = 1535.09
Number of people who did not buy items via website A = (3200 × 2200)/5700 = 1235.09
Number of people who did not buy items via website B = (3200 × 3500)/5700 = 1964.91

Therefore, value of the test statistic T is given by
$$\begin{aligned}
T &= \frac{(1000 - 964.91)^2}{964.91} + \frac{(1500 - 1535.09)^2}{1535.09} + \frac{(1200 - 1235.09)^2}{1235.09} + \frac{(2000 - 1964.91)^2}{1964.91} \\
&= 1.276 + 0.802 + 0.997 + 0.627 \\
&= 3.70
\end{aligned}$$

We will reject the null hypothesis if T > c.

At a significance level of 0.05, we have

0.05 = P (T > c)
⇒ 0.05 = 1 − P (T ≤ c)
⇒ P (T ≤ c) = 0.95
⇒ $F_{\chi^2_1}(c) = 0.95$
⇒ c = 3.84

Since T = 3.70 < 3.84, we will accept the null hypothesis.

This implies that sales are independent of the website at a significance level of 0.05.
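The expected counts and the test statistic for an independence test follow mechanically from the row and column totals. A minimal standard-library sketch, with the function name `chi_square_independence` chosen for illustration and the critical value 3.84 taken from the hint:

```python
def chi_square_independence(table):
    """Pearson statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total   # product of marginals
            stat += (obs - expected) ** 2 / expected
    return stat


# Rows: bought / did not buy; columns: website A / website B (Table 11.3.G).
t_sales = chi_square_independence([[1000, 1500], [1200, 2000]])
accept_independence = t_sales < 3.84
print(round(t_sales, 2), accept_independence)
```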

Statistics for Data Science - 2

Practice assignment week 11

1. 46 people are divided into two groups - experimental and control. The experimental
group is inoculated against a disease while the control group is not. Both the groups
are then exposed to the disease and the data obtained is recorded in the contingency
table given below:

Control group Experimental group Total


Contracted 13 8 21
Not Contracted 10 15 25
Total 23 23 46

Use the Chi-squared test to check if inoculation and contraction of the disease are
independent at a significance level of 5%.

(a) Yes
(b) No

2. Consider the following cross tabulation of status of completion of courses by learners
across three different websites:

Website A Website B Website C


Completed 182 213 203
Not completed 154 138 110

Are status of completion of course independent of the websites? (Use α = 0.05)

(a) Yes
(b) No

3. Suppose that a sample of 150 electron tubes are tested and the following summary of
their life length (in hours), T is reported:

Life length 0 ≤ T < 100 100 ≤ T < 200 200 ≤ T < 300 T > 300
Number of electron tubes 47 40 35 28

The sample mean is recorded to be 200 hours. Test the hypothesis that T is exponen-
tially distributed at a significance level of 0.01 using Chi-squared test.

(a) Accept the hypothesis that T is exponentially distributed.


(b) Reject the hypothesis that T is exponentially distributed.

4. The number of accidental deaths in a country are tabulated each day for a specified
period of 400 days, along with expected frequencies according to a Poisson fit.

Number of deaths 0 1 2 3 4+
Observed frequency 1448 805 206 34 7
Expected frequency 1450 775 200 25 50

Use Chi-squared test to check if the above fit is acceptable at a significance level of
5%.

(a) Yes
(b) No

5. Let X1 , . . . , Xn ∼ Normal(0, σ 2 ), and consider testing H0 : σ = 1 versus HA : σ = 2.
Find the likelihood ratio of the observed samples 2, 3, 1, 4.4, 5, 5, 3.6, 6, 4, 6.

(a) 1.02 × 10⁻²⁷
(b) 1.02 × 10²⁷
(c) 0.02 × 10²⁷
(d) 0.02 × 10⁻²⁷

6. A random sample of 40 fibres manufactured using process A has a mean length of 16.7
cm, and standard deviation of 0.5 cm. A random sample of 60 fibres manufactured
using process B has mean length of 16.4 cm and standard deviation of 0.6 cm. Test
the hypothesis that the mean length of fibres manufactured using process A and B are
the same.

(a) Identify the null and alternative hypothesis.


i. H0 : µ1 = µ2 , HA : µ1 < µ2
ii. H0 : µ1 = µ2 , HA : µ1 > µ2
iii. H0 : µ1 = µ2 , HA : µ1 ≠ µ2
iv. H0 : µ1 − µ2 = 0.3 cm, HA : µ1 − µ2 ≠ 0.3 cm
(b) Choose the correct options from the following:
i. Reject H0 at a significance level of 0.05.
ii. Accept H0 at a significance level of 0.05.

7. One wants to check if the average IQ of girls and boys are the same. It is known
that the IQs of both boys and girls have a standard deviation of 10. Mean IQ of 200
randomly selected boys is 99 and mean IQ of 300 randomly selected girls is 97. Using
1% level of significance, comment on the IQs of girls and boys.
Hint: FZ (−2.58) = 0.005

(a) The average IQ of both boys and girls are the same.
(b) The average IQ of boys are more as compared to the IQ’s of girls.
(c) The average IQ of boys are less as compared to the IQ’s of girls.

8. The amount of saturated fat present in 100 grams of cheese of two different brands are
measured. The data are as follows:

Brand A Brand B
21 20
19 39
20 24
23 33
22 30
28 28
32 30
19 22
13 33
18 24

Can we conclude at 5% level of significance that the two variances are equal?
Hint: sA = 5.3177, sB = 5.8699, FF (9,9) (0.248) = 0.025

(a) Yes
(b) No

9. A study is conducted to compare the duration of time a certain dose of pain reliever
works when administered to men and women. Standard deviation for a random sample
of 11 men is found to be 6.1 and, for a random sample of 14 women, it is found to be
5.3. Use a significance level of α = 0.05 to check the hypothesis that the variation in
time of relief is equal for both genders against the alternative that it is larger for men.
Hint: Use FF−1(10,13) (0.95) = 2.67.

(a) Reject H0 .
(b) Fail to reject H0 .

10. Past experience indicates that the time required for athletes to complete a 200 m race
is a normal random variable with a mean µ = 35 seconds. If a random sample of 20
athletes took an average of 33.1 seconds to complete the race with a standard deviation
of 4.3 seconds, test the hypothesis, at the 0.05 level of significance, that µ = 35 seconds
against the alternative that µ < 35 seconds.

(a) Accept the null hypothesis.


(b) Reject the null hypothesis.

11. A company manufactures mobile chargers with an output voltage of 5 V and variance
0.5 V². The company wants to test the variance. They take a random sample of 12
chargers and the following voltages are obtained:
5.34, 5.65, 4.76, 5.00, 5.55, 5.54, 5.07, 5.35, 5.44, 5.25, 5.35, 4.61
Test the hypothesis that σ² = 0.5 at 0.05 level of significance.

(a) Accept H0
(b) Reject H0

12. The standard deviation of a component in a drug is expected to be 0.00002 kg. A
pharmacist suspecting the variability to be higher obtains a sample of 8 drugs and
found the sample standard deviation to be 0.00005 kg.

(a) Identify the null and alternative hypothesis:


i. H0 : σ = 0.00002, HA : σ > 0.00002
ii. H0 : σ = 0.00005, HA : σ ≠ 0.00005
iii. H0 : σ = 0.00002, HA : σ < 0.00002
iv. H0 : σ = 0.00002, HA : σ ≠ 0.00002
(b) What conclusion would a χ² test reach at a significance level of 0.01?
i. Accept the null hypothesis.
ii. Accept the alternative hypothesis.
