sample_questions2

The document is a test paper for the STM4PSD course, consisting of various questions related to probability, statistics, and market basket analysis. It includes instructions for completion, a total of 50 marks available, and specific requirements for answering questions. The test covers topics such as probability mass functions, machine learning model evaluation, and random variable distributions.

STM4PSD Test (sample questions)

Name:

Student Number:

Instructions

• You have 90 minutes to complete the test.

• A total of 50 marks is available.

• Write all answers in the provided space.

• Unless otherwise stated, give all answers to 3 decimal places.

• Answers without working will be given zero marks.

• If you finish early, raise your hand and the supervisor will come to you.

• Write your name and student number at the top of the page.

Do not open the booklet until instructed to do so.

Question 1. 9 marks
A random variable X has the following probability mass function:

x            −8      −6      0       4       6       9
P (X = x)    0.259   0.055   0.225   0.082   0.197   0.182

Let A denote the event “X = 0”, let B denote the event “−6 ≤ X < 9”, and let C denote the event
“X > 0”.
Determine each of the following:

(a) (1 mark) E(X)

(b) (1 mark) P (A)

(c) (2 marks) P (B ∪ C)

(d) (2 marks) P (A^c ∩ C^c)

(e) (3 marks) P (A | C^c)
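
Answers to Question 1 can be checked numerically by summing the probability mass function over the values that belong to each event. The following is a minimal Python sketch rather than a model solution; the helper `prob` and the predicates `in_A`, `in_B` and `in_C` are illustrative names that do not appear in the test.

```python
# Probability mass function copied from the table in Question 1.
pmf = {-8: 0.259, -6: 0.055, 0: 0.225, 4: 0.082, 6: 0.197, 9: 0.182}


def prob(event):
    """Sum P(X = x) over the values x for which event(x) is true."""
    return sum(p for x, p in pmf.items() if event(x))


# Events A, B and C written as predicates on a value x of X.
def in_A(x):
    return x == 0


def in_B(x):
    return -6 <= x < 9


def in_C(x):
    return x > 0


expected_value = sum(x * p for x, p in pmf.items())
p_A = prob(in_A)
p_B_or_C = prob(lambda x: in_B(x) or in_C(x))
p_notA_and_notC = prob(lambda x: not in_A(x) and not in_C(x))
# Conditional probability: P(A | C^c) = P(A and C^c) / P(C^c).
p_A_given_notC = prob(lambda x: in_A(x) and not in_C(x)) / prob(
    lambda x: not in_C(x)
)

for name, value in [
    ("E(X)", expected_value),
    ("P(A)", p_A),
    ("P(B or C)", p_B_or_C),
    ("P(A^c and C^c)", p_notA_and_notC),
    ("P(A | C^c)", p_A_given_notC),
]:
    print(f"{name} = {value:.3f}")
```

Writing each event as a predicate keeps the code close to the set notation used in the question, and the same `prob` helper covers unions, intersections and complements.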

Extra writing space for Question 1

Question 2. 14 marks
A medical professional is using a machine learning model to assist in diagnosing an unusual but relatively
harmless disease. The model may produce a positive result or a negative result. If a patient tests positive
for the illness, then they are referred to a specialist for further investigation; otherwise, the patient takes
no further action.
It is known that approximately 4.1% of the population has the illness. After training the model and
running it on test data, the true positive rate for the test is estimated as 75.1%, and the false positive
rate for the test is estimated as 0.8%.
Let T denote the event that a patient tests positive for the illness, and let I denote the event that a
patient has the illness.

(a) (1 mark) From the information above, state the probabilities P (T | I), P (T | I^c) and P (I).

(b) (2 marks) Use the Law of Total Probability to determine P (T).

(c) (2 marks) Determine P (I | T) and P (I^c | T^c).

(d) (1 mark) What names are given to the probabilities you calculated in (c)?

(e) (3 marks) Determine the false omission rate and negative predictive value of the classifier.

(f) (3 marks) A patient tests positive for the illness, and based on the true positive rate for the test,
comes to believe that there is a 75.1% chance that they have the illness. Is this line of reasoning
justified? If so, explain. If not, explain a more suitable line of reasoning, and give the correct
probability.

(g) (2 marks) To further study this illness, a researcher tests people at random until they find 20
people who test positive for the illness. Let X denote the number of people who test negative
before 20 people who test positive are found.
What probability distribution does X follow? Give your answer in the form X ∼ ____.
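
The calculations in parts (b), (c) and (e) of Question 2 follow from the Law of Total Probability and Bayes' theorem. The Python sketch below shows one way to organise them; it is illustrative only, the variable names are not part of the test, and the false omission rate is taken to be P(I | T^c), the complement of P(I^c | T^c).

```python
# Quantities stated in Question 2.
p_I = 0.041             # prevalence, P(I)
p_T_given_I = 0.751     # true positive rate, P(T | I)
p_T_given_notI = 0.008  # false positive rate, P(T | I^c)

# Law of Total Probability: P(T) = P(T | I) P(I) + P(T | I^c) P(I^c).
p_T = p_T_given_I * p_I + p_T_given_notI * (1 - p_I)

# Bayes' theorem for the probability of illness given a positive test.
p_I_given_T = p_T_given_I * p_I / p_T

# Bayes' theorem for the probability of no illness given a negative test.
p_notI_given_notT = (1 - p_T_given_notI) * (1 - p_I) / (1 - p_T)

# P(I | T^c) is the complement of P(I^c | T^c).
p_I_given_notT = 1 - p_notI_given_notT

print(f"P(T)         = {p_T:.3f}")
print(f"P(I | T)     = {p_I_given_T:.3f}")
print(f"P(I^c | T^c) = {p_notI_given_notT:.3f}")
print(f"P(I | T^c)   = {p_I_given_notT:.3f}")
```

Comparing `p_I_given_T` with `p_T_given_I` makes the distinction raised in part (f) concrete: the two conditional probabilities condition on different events and need not be equal.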

Extra writing space for Question 2

Question 3. 6 marks
In this question, you will perform market basket analysis on the following table of transaction data.

Transaction ID Items
1 {carrots}
2 {apples, donuts}
3 {bananas, carrots, donuts, eggs}
4 {apples, donuts, eggs}
5 {bananas}
6 {bananas}
7 {bananas, carrots, donuts, eggs}
8 {apples, bananas, donuts, eggs}
9 {bananas}
10 {bananas, carrots, eggs}
11 {bananas, carrots, donuts, eggs}
12 {apples, bananas, carrots, eggs}

In this question, to save writing time, you may use abbreviations for each item.

(a) (2 marks) Complete the table below to calculate the count and support for the one-item itemsets.

Item       Count    Support
apples     _____    _____
bananas    _____    _____
carrots    _____    _____
donuts     _____    _____
eggs       _____    _____

(b) (2 marks) The next step of the Apriori algorithm would need to calculate the support of some
two-item itemsets. For a minimum support of 0.4, list the two-item itemsets that the Apriori
algorithm would calculate the support of. Justify your answer.
Read the question statement carefully. You are not being asked to calculate the
support of those itemsets.

(c) (2 marks) Calculate the lift of the association rule R given by {donuts} ⇒ {bananas, carrots}.
Indicate if the items have negative or positive impacts on each other.
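
The support, candidate-generation and lift calculations in Question 3 can be reproduced with a few lines of Python. The sketch below is a hedged illustration rather than a full Apriori implementation: the function names `support` and `lift` are chosen here for convenience, and the candidate step covers only the move from one-item to two-item itemsets.

```python
from fractions import Fraction
from itertools import combinations

# Transactions 1 to 12 from the table in Question 3.
transactions = [
    {"carrots"},
    {"apples", "donuts"},
    {"bananas", "carrots", "donuts", "eggs"},
    {"apples", "donuts", "eggs"},
    {"bananas"},
    {"bananas"},
    {"bananas", "carrots", "donuts", "eggs"},
    {"apples", "bananas", "donuts", "eggs"},
    {"bananas"},
    {"bananas", "carrots", "eggs"},
    {"bananas", "carrots", "donuts", "eggs"},
    {"apples", "bananas", "carrots", "eggs"},
]


def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))


def lift(antecedent, consequent):
    """Lift of the rule antecedent => consequent."""
    both = set(antecedent) | set(consequent)
    return support(both) / (support(antecedent) * support(consequent))


items = sorted(set().union(*transactions))
one_item_support = {item: support({item}) for item in items}

# Apriori principle: a two-item itemset can only reach the minimum support
# if both of its items do, so candidates use frequent single items only.
min_support = Fraction(2, 5)
frequent_items = [i for i in items if one_item_support[i] >= min_support]
candidate_pairs = list(combinations(frequent_items, 2))

rule_lift = lift({"donuts"}, {"bananas", "carrots"})

for item in items:
    print(item, float(one_item_support[item]))
print("candidate pairs:", candidate_pairs)
print("lift:", float(rule_lift))
```

Using `Fraction` keeps the supports exact, so the comparison with the minimum support and the final lift are not affected by rounding.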

Extra writing space for Question 3

Question 4. 6 marks
Imagine that you have been asked to monitor a machine that manufactures small parts. The machine
builds parts one at a time, but on average, 4% of the parts are faulty and they cannot be used. Assume
that each attempt at building a part is independent of any other attempt.

(a) Suppose another company has asked you to manufacture 50 of these small parts. These 50 parts
must be working (i.e., not faulty).

(i) (1 mark) In the context of this problem, write a sentence (in plain language) to define a
relevant random variable X with X ∼ NB(50, 0.04). An appropriate sentence might start
with, “Let X be the number of. . . ”.

(ii) (2 marks) What is the expected number of parts in total (faulty plus working) that will be
manufactured after completing the company’s request?

(b) (3 marks) Some time later, you will use the machine to build exactly 120 parts (regardless of
whether they are faulty or not). Write a sentence to define a suitable random variable in the context
of this problem, and then describe the probability distribution that random variable follows using
correct notation for random variables.
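
For part (a)(ii), the expected total number of attempts needed to obtain a fixed number of working parts equals that number divided by the probability that a single part works. This holds regardless of how the negative binomial distribution is parameterised. The Python sketch below illustrates the arithmetic; the variable names are assumptions made for this example.

```python
# Values stated in Question 4.
p_faulty = 0.04
required_working = 50

# Each attempt produces a working part with this probability.
p_working = 1 - p_faulty

# Expected attempts until the required number of working parts is reached.
expected_total = required_working / p_working

# The remainder of the expected total consists of faulty parts.
expected_faulty = expected_total - required_working

print(f"expected total parts  = {expected_total:.3f}")
print(f"expected faulty parts = {expected_faulty:.3f}")
```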

Extra writing space for Question 4

Question 5. 7 marks
Let X ∼ Exp(7). Calculate each of the following.

(a) (2 marks) P (X ≤ 3)

(b) (2 marks) P (X ≥ 2)

(c) (2 marks) P (1 ≤ X < 7)
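
The probabilities in Question 5 all come from the exponential survival function P(X > x) = exp(−λx) for x ≥ 0. The Python sketch below assumes that the 7 in Exp(7) is the rate λ; if the course parameterises the distribution by its mean instead, `rate` should be set to 1/7. The function name `exp_sf` is illustrative.

```python
import math


def exp_sf(x, rate):
    """Survival function P(X > x) of an exponential distribution."""
    return math.exp(-rate * x) if x > 0 else 1.0


rate = 7.0  # assumption: Exp(7) is parameterised by the rate

# X is continuous, so strict and non-strict inequalities give the same value.
p_a = 1 - exp_sf(3, rate)               # P(X <= 3)
p_b = exp_sf(2, rate)                   # P(X >= 2)
p_c = exp_sf(1, rate) - exp_sf(7, rate)  # P(1 <= X < 7)

print(f"P(X <= 3)     = {p_a:.3f}")
print(f"P(X >= 2)     = {p_b:.3f}")
print(f"P(1 <= X < 7) = {p_c:.3f}")
```

Working with the survival function rather than 1 minus the cumulative distribution function avoids losing precision when the tail probability is very small.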

Extra writing space for Question 5

Question 6. 8 marks
A random variable X has its probability density function given by

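\[
f_X(x) =
\begin{cases}
\dfrac{9 - 3x}{16} & \text{if } 1 \le x < 3, \\[4pt]
\dfrac{5x}{4} - 5 & \text{if } 4 \le x \le 5, \\[4pt]
0 & \text{otherwise.}
\end{cases}
\]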

Its graph is shown below on the interval [0, 6].

[Figure: graph of f_X(x) on the interval [0, 6]; horizontal axis marked at 1, 2, 3, 4, 5.]

You must use the formula to determine the vertical values.

(a) (1 mark) On the diagram above, shade in the area corresponding to P (X ≤ 4/5).

(b) (2 marks) Based on your answer to (a), determine P (X ≤ 5/4).

(c) (3 marks) Explain why f_X(x) is a valid probability density function. Show details of any relevant
calculations.

(d) (3 marks) Determine a formula for P (X ≤ x) for values of x such that 1 ≤ x < 3.
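
The integrals needed in Question 6 can be checked exactly by evaluating antiderivatives of the two non-zero pieces of the density, (9 − 3x)/16 on 1 ≤ x < 3 and 5x/4 − 5 on 4 ≤ x ≤ 5. The Python sketch below is one hedged way to do this with exact fractions; the function names are illustrative, and the antiderivatives were worked out by hand for this example.

```python
from fractions import Fraction


def antiderivative_first(x):
    """An antiderivative of (9 - 3x)/16, the piece on 1 <= x < 3."""
    x = Fraction(x)
    return (9 * x - Fraction(3, 2) * x**2) / 16


def antiderivative_second(x):
    """An antiderivative of 5x/4 - 5, the piece on 4 <= x <= 5."""
    x = Fraction(x)
    return Fraction(5, 8) * x**2 - 5 * x


# Area under each non-zero piece of the density.
area_first = antiderivative_first(3) - antiderivative_first(1)
area_second = antiderivative_second(5) - antiderivative_second(4)
total_area = area_first + area_second


def cdf_first_piece(x):
    """P(X <= x) for 1 <= x < 3; the density is zero to the left of 1."""
    return antiderivative_first(x) - antiderivative_first(1)


print("area on [1, 3):", area_first)
print("area on [4, 5]:", area_second)
print("total area:", total_area)
print("P(X <= 5/4):", cdf_first_piece(Fraction(5, 4)))
```

A total area of 1 is only part of the argument for part (c); the density must also be non-negative everywhere, which can be read off each piece on its own interval.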

Extra writing space for Question 6
