Topic4 Revision Session Worksheet Markscheme

The document contains a revision session worksheet with various mathematical problems and their solutions, covering topics such as temperature conversion, statistics, probability, and hypothesis testing. It includes specific questions related to calculating means, medians, and standard deviations, as well as determining probabilities in different scenarios. Each section provides a maximum mark and a detailed mark scheme for grading purposes.

Uploaded by

blisspilatesbaku
Copyright
© © All Rights Reserved

Topic4 (Revision Session Worksheet) [591 marks]

1. [Maximum mark: 6]
The formula F = 1.8C + 32 is used to convert a temperature in degrees Celsius, C, to degrees Fahrenheit, F.

(a.i) Find a formula for converting a temperature in degrees Fahrenheit to degrees Celsius. [2]

Markscheme

attempt to rearrange to isolate C (M1)

e.g., subtracting 32 or dividing the equation by 1.8

C = (5/9)(F − 32) (C = (F − 32)/1.8, C = 0.556F − 17.8) A1

Note: If the answer is not written as an equation, award at most M1A0.

[2 marks]

(a.ii) Find the temperature in degrees Celsius that is recorded as 77 degrees Fahrenheit. [1]

Markscheme

C = ((77 − 32)/1.8 =) 25 (°C) A1

[1 mark]

Over one year, the mean daily temperature in Mexico City was calculated to be 17 degrees Celsius with a standard deviation of 9
degrees Celsius.

(b) For the same year, find in degrees Fahrenheit

(b.i) the mean daily temperature in Mexico City. [1]

Markscheme

(1.8 × 17 + 32 =) 62.6 (°F) A1

[1 mark]

(b.ii) the standard deviation of the daily temperature in Mexico City. [2]

Markscheme

recognizing that the "+32" does not affect the SD (M1)

(1.8 × 9 =) 16.2 (°F) A1

Note: Award M0A0 for 1.8 × 9 + 32 (= 48.2).

[2 marks]
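
The reasoning in part (b) can be checked numerically. The following is a minimal Python sketch, not part of the markscheme: the two-value data set `sample_celsius` is a hypothetical example chosen only because it has mean 17 and population standard deviation 9.

```python
from statistics import mean, pstdev

def to_fahrenheit(celsius):
    """Apply the conversion F = 1.8C + 32 from the question."""
    return 1.8 * celsius + 32

# Hypothetical data (an assumption, not from the question): mean 17, SD 9.
sample_celsius = [8, 26]
sample_fahrenheit = [to_fahrenheit(c) for c in sample_celsius]

# The mean is scaled and shifted; the SD is only scaled by 1.8.
mean_f = mean(sample_fahrenheit)
sd_f = pstdev(sample_fahrenheit)

# Shortcut used in the markscheme, without any raw data.
mean_shortcut = 1.8 * 17 + 32
sd_shortcut = 1.8 * 9
```

Both routes should agree, which illustrates why adding 32 to 1.8 × 9 is awarded M0A0.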

2. [Maximum mark: 7]
The following data show the heights, in metres, of six players in a basketball team.

(a) For these six players, find

(a.i) the mean height. [2]

Markscheme

1.96 (m) A2

Note: Award A1 for substitution into the formula for the mean,

e.g. (1.67 + 1.60 + 1.68 + …)/6.

[2 marks]

(a.ii) the median height. [1]

Markscheme

1.94 (m) (1.935) A1

[1 mark]

(a.iii) the modal height. [1]

Markscheme

2.31 (m) A1

[1 mark]

(a.iv) the range of the heights. [2]

Markscheme

2.31 − 1.60 (M1)

Note: Award M1 for recognizing 2.31 and 1.60 as the critical values.

0.71 (m) A1
[2 marks]

A new player, Gheorghe, joins the team. Their height is measured as 1.98 metres to the nearest centimetre.

(b) Write down the shortest possible height of Gheorghe. [1]

Markscheme

1.975 (m) OR 197.5 (cm) A1

[1 mark]

3. [Maximum mark: 6]
A teacher surveys their students to find out if they have eaten at the local Thai and Indian cafés. The results of the survey are shown
in the following Venn diagram.

(a) Write down the number of students surveyed. [1]

Markscheme

33 A1

[1 mark]

(b) Write down the number of students who have not eaten at the Indian café. [1]

Markscheme

12 A1

[1 mark]

A student is chosen at random from those surveyed.

(c) Find the probability this student has eaten at both the Thai café and the Indian café. [1]

Markscheme
13/33 (0.394, 0.393939…, 39.4%) A1

[1 mark]

Let T be the event: a student has eaten at the Thai café.


Let I be the event: a student has eaten at the Indian café.

(d) Find P (T ∪ I) . [1]

Markscheme

(P(T ∪ I) =) 31/33 (0.939, 0.939393…, 93.9%) A1

Note: For A1(ft) to be awarded, the numerator must be 31 and the denominator must be their answer to part (a).

[1 mark]

(e) State whether the events T and I are mutually exclusive. Justify your answer. [2]

Markscheme

P(T ∩ I ) ≠ 0 OR n(T ∩ I ) ≠ 0 R1

Note: Accept P(T ) + P(I ) ≠ P(T ∪ I ) provided probabilities are shown.

Accept an equivalent statement in words such as “some (13) students went to both cafes” or “students could go to both cafes”.
Condone P(T and I ) ≠ 0 OR n(T and I ) ≠ 0

no, they are not mutually exclusive A1

Note: Do not award R0A1.

[2 marks]

4. [Maximum mark: 8]
Zac raises funds for a library by running a game where players spin a needle. The final position of the needle results in an outcome
where a player wins or loses money. The outcomes, with associated probabilities, are shown in the following diagram.
Let X represent the amount that a player of this game wins.
(a.i) Find the expected value of X. [2]

Markscheme

use of expected value formula. (M1)

E(X) = 5 × 0.40 + (−8) × 0.1 + (−5) × 0.2 + (−10) × 0.3

($) −2.8 A1

[2 marks]

(a.ii) Interpret your answer to part (a)(i). [1]

Markscheme

Any one of the following A1

on average, players will lose $2. 80 (per game)


players are expected to/are more likely to lose $2. 80 (per game)
this is the long-term expected average when playing the game many times
the expected value/it does not equal 0, so the game is not fair

Do not accept:

players will lose $2. 80 (per game)


players will/are expected to win −$2. 80
on average, players will lose money
players are expected to lose money (per game)
there is more chance of losing money than winning
the game is not fair

[1 mark]

To encourage a person to keep playing this game, Zac increases the winning prize for the second game they play from $5 to $6.
For each successive game they play, the winning prize continues to increase by $1.

Emily plays k games. The k th game is fair.


(b.i) Find the value of k. [4]

Markscheme

E(X) = 0 OR 2.80/0.40 (M1)

EITHER

evidence of increase in winning prize (M1)

5 + (k − 1) × 1 OR (number of prize increases =) 7 OR

E(X) for game 1 = −2.80, E(X) for game 2 = −2.40, etc.

(5 + (k − 1) × 1) × 0.40 + (−8) × 0.1 + (−5) × 0.2 + (−10) × 0.3 = 0 (A1)

OR (4 + k) × 0.40 + (−8) × 0.1 + (−5) × 0.2 + (−10) × 0.3 = 0

OR (k =) 2.80/0.40 + 1

k = 8 (games) A1

OR

(calculation of winnings to make the game fair)

(w × 0.40 + (−8) × 0.1 + (−5) × 0.2 + (−10) × 0.3 = 0)

(w =) ($) 12 (A1)

evidence of increase in winnings per game up to $12 (M1)

$5, $6, $7, … $12

k = 8 (games) A1

[4 marks]
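
The search for the fair game can be sketched in Python as follows. This is an illustrative check rather than the expected exam method; the names `LOSSES`, `expected_value` and `first_fair_game` are hypothetical, and exact fractions are used to avoid floating-point rounding when testing E(X) = 0.

```python
from fractions import Fraction

# Losing outcomes and their probabilities, as used in the markscheme.
LOSSES = [(-8, Fraction(1, 10)), (-5, Fraction(1, 5)), (-10, Fraction(3, 10))]
WIN_PROBABILITY = Fraction(2, 5)

def expected_value(prize):
    """E(X) for one game with the given winning prize."""
    return prize * WIN_PROBABILITY + sum(amount * p for amount, p in LOSSES)

def first_fair_game(starting_prize=5, step=1):
    """Return (k, prize) for the first game whose expected value is not negative."""
    k, prize = 1, starting_prize
    while expected_value(prize) < 0:
        k += 1
        prize += step
    return k, prize

k, fair_prize = first_fair_game()
```

Under these assumptions the loop stops at the game whose prize makes E(X) exactly zero, matching the $5, $6, …, $12 sequence in the second method.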

(b.ii) Explain why Zac expects to raise money from the games Emily plays. [1]

Markscheme

E(X) < 0 for each (any) of the first 7 games (or equivalent) R1

[1 mark]

5. [Maximum mark: 6]
Jerry makes handcrafted chocolates. On average, 1 in 25 of the chocolates that Jerry makes is flawed. Whether or not a chocolate is
flawed is independent of all other chocolates.

(a) In a batch of 20 chocolates, chosen at random, find the probability that

(a.i) two are flawed. [2]

Markscheme

recognition of binomial distribution (condone incorrect parameter) (M1)


e.g. M ~ B(20, 0.04) OR P(M = 2) = binpdf(20, 0.04, 2)

= 0.146 (0.145799…) A1

[2 marks]

(a.ii) more than two are flawed. [2]

Markscheme

recognition that the cumulative probability is required (M1)

e.g. P(M ≥ 3) = 1 − bincdf(20, 0.04, 2) OR bincdf(20, 0.04, 3, 20)

= 0.0439 (0.0438627…) A1

[2 marks]

Jerry sells the perfect chocolates for 50 pesos each and the flawed ones for 15 pesos each.

(b) Calculate the expected number of pesos Jerry makes from selling a batch of 20 randomly selected chocolates. [2]

Markscheme

either one of two terms in expected value formula correct (M1)

50(20(0.96)) + 15(20(0.04))

= 972 (pesos) A1

[2 marks]
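
The calculator commands binpdf and bincdf quoted above can be mirrored with the Python standard library. The sketch below is only an illustration; `binomial_pmf` is a hypothetical helper name, not a GDC or library function.

```python
from math import comb

def binomial_pmf(n, p, k):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p_flawed = 20, 1 / 25

# (a.i) exactly two flawed chocolates
p_two = binomial_pmf(n, p_flawed, 2)

# (a.ii) more than two flawed: complement of P(X <= 2)
p_more_than_two = 1 - sum(binomial_pmf(n, p_flawed, k) for k in range(3))

# (b) expected income: 50 pesos per perfect, 15 pesos per flawed chocolate
expected_income = 50 * n * (1 - p_flawed) + 15 * n * p_flawed
```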

6. [Maximum mark: 7]
The prices, in dollars, of 10 different garden chairs are:

79 139 255 99 50 209 229 193 69 49

(a) Find the range of the prices of the 10 chairs. [2]

Markscheme

identifying the largest and smallest values: ($) 255, ($) 49 (M1)

($) 206 A1

[2 marks]

(b) Use your graphic display calculator to find

(b.i) the mean price of the chairs. [2]

Markscheme

($) 137 (137.1) (M1)A1

[2 marks]

(b.ii) the standard deviation of the price of the chairs. [1]

Markscheme

($) 74.5 (74.4693…) A1

Note: The (M1) mark is for correct GDC use and hence can be awarded if either of the values is correct. An answer of
78.4976… in (b)(ii) is awarded A0 but is sufficient to credit the (M1).

[1 mark]

In a sale, the price of each of the 10 garden chairs is reduced by $ 20.

(c) Write down

(c.i) the new mean. [1]

Markscheme

(mean =) ($) 117 (117.1) A1

[1 mark]

(c.ii) the new standard deviation. [1]

Markscheme

(standard deviation =) ($) 74.5 (74.4693…) A1

Note: If their answer to part (c)(ii) is incorrect, it should match their answer to part (b)(ii) to be awarded A1(FT).

[1 mark]
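
As a cross-check of the GDC values, the summary statistics can be reproduced with Python's `statistics` module. This is a sketch for verification only; note that `pstdev` gives the population standard deviation (the accepted 74.4693…), while `stdev` gives the sample value (the 78.4976… that earns A0).

```python
from statistics import mean, pstdev, stdev

prices = [79, 139, 255, 99, 50, 209, 229, 193, 69, 49]

price_range = max(prices) - min(prices)
mean_price = mean(prices)
population_sd = pstdev(prices)  # value expected in (b)(ii)
sample_sd = stdev(prices)       # value that is awarded A0

# Part (c): every price is reduced by $20.
sale_prices = [price - 20 for price in prices]
sale_mean = mean(sale_prices)
sale_sd = pstdev(sale_prices)
```

Subtracting a constant shifts the mean by that constant and leaves the spread unchanged, which is why (c)(ii) repeats the answer to (b)(ii).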

7. [Maximum mark: 5]
Sunita sorts 300 peppers into sizes of small, medium or large. Some peppers are red, some are green, and some are yellow.

The following table shows her results.

Sunita wants to test, at the 5 % significance level, whether the size of the peppers is independent of the colour.
(a) State the null and alternative hypotheses for this test. [1]

Markscheme

H0 : the size of peppers is independent of colour

H1 : the size of peppers is not independent of colour A1

Note: Award A1 for both hypotheses correct. Accept “not associated” in place of independent. Do not accept “correlated” or
“related” or “affected”.

[1 mark]

The critical value for this test is 9.49.

(b.i) Calculate χ²_calc. [2]

Markscheme

χ²_calc = 22.5 (22.5483…) A2

[2 marks]

(b.ii) State a conclusion to the test. Give a reason for your answer. [2]

Markscheme

22.5483… > 9.49 OR 0.000155837… < 0.05 R1

(there is sufficient evidence to) reject the null hypothesis A1

Note: Do not award R0A1.

Accept “accept the alternative hypothesis”.

Their conclusion must be consistent with their χ²_calc (or p-value) and their hypothesis.

Accept χ²_calc > χ²_crit or p < sig level provided their χ²_calc value or p-value is seen.

[2 marks]

8. [Maximum mark: 7]
Gustav plays a game in which he first tosses an unbiased coin and then rolls an unbiased six-sided die.

If the coin shows tails, the score on the die is Gustav’s final number of points.

If the coin shows heads, one is added to the score on the die for Gustav’s final number of points.

(a) Find the probability that Gustav’s final number of points is 7. [2]
Markscheme

recognizing that the only way to score 7 is to achieve a head and a 6 on the die (M1)

e.g. 1/6 and 1/2 seen in an attempt to combine probabilities

(1/6 × 1/2 =) 1/12 (0.0833333…) A1

Note: Condone 0.0835 from the use of 0.167.

[2 marks]

(b) Complete the following table.

[3]

Markscheme

there are two ways to score (e.g.) 5

achieve a head and a 4 on die, or a tail and a 5 on die (M1)

(2(1/6 × 1/2) =) 2/12 (1/6, 0.167, 0.16666…) A1

Note: Award these marks for equivalent working for the 2, 3, 4 or 6 point scenarios.

A1

Note: Award A1 for a completely correct table. Award at most (M1)A1A0 if their follow-through answer from part (a) leads to a
total probability not equal to 1.

[3 marks]

(c) Calculate the expected value of Gustav’s final number of points. [2]

Markscheme

EITHER

multiplying at least two columns from their table (M1)

1 × 1/12 + 2 × 1/6 + … + 6 × 1/6 + 7 × 1/12

OR

recognizing the probabilities in the table are symmetric (M1)


OR

Considering the sum of two random variables (M1)

E(X + Y) = E(X) + E(Y) (= 3.5 + 0.5)

THEN

(expected value =) 4 A1

Note: Accept 4.01 (4.00640…) from use of their 3 sf values from (b).

Award at most M1A0 if their final answer is not in the range 1 to 7.

[2 marks]
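
The whole distribution in parts (a) to (c) can be obtained by enumerating the twelve equally likely coin and die outcomes. The Python sketch below is illustrative only; the variable names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

# Probability of each final score, built from all coin/die combinations.
distribution = defaultdict(Fraction)
for bonus in (0, 1):          # 0 = tails (no bonus), 1 = heads (add one)
    for face in range(1, 7):  # faces of the unbiased die
        distribution[face + bonus] += Fraction(1, 2) * Fraction(1, 6)

expected_points = sum(points * p for points, p in distribution.items())
```

The resulting probabilities are symmetric about 4, which is the observation behind the second method in part (c).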

9. [Maximum mark: 7]
Billy is a keen walker who keeps a record of his performance. The following table shows the time, in minutes, it takes him to walk
one kilometre up hills with different gradients. The gradient of each hill is constant.

(a.i) Find the equation of the regression line of T on G. [2]

Markscheme

T = 0.552G + 6.36 (= 0.552139…G + 6.35703…) A1A1

Note: Award A1 for correct values of a and b, A1 for an equation using these correct values.

[2 marks]

(a.ii) Describe the correlation between T and G with reference to the value of r, the Pearson’s product-moment
correlation coefficient. [2]

Markscheme

(r =) 0.994 (= 0.993910…) A1

there is a (very) strong positive linear correlation R1

Note: If r is missing award A0R0.

[2 marks]

On Sunday, Billy intends to walk up a hill with a gradient of 13%.


(b) Estimate the time it will take Billy to walk one kilometre up the hill. [2]

Markscheme

attempt to substitute 13 into their regression equation (M1)

T = 0.552139… (13) + 6.35703…

13.5 (mins) (= 13.5348…) A1

[2 marks]
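
The substitution in part (b) can be sketched in Python. The data table is not reproduced here, so the coefficients below are the truncated values quoted in the markscheme, not a fresh regression; `predict_time` is a hypothetical helper name.

```python
# Truncated coefficients copied from the markscheme (an assumption: the
# unrounded GDC values would carry further digits).
SLOPE = 0.552139
INTERCEPT = 6.35703

def predict_time(gradient_percent):
    """Estimate minutes per kilometre from the T-on-G regression line."""
    return SLOPE * gradient_percent + INTERCEPT

estimate = predict_time(13)
```

The function only maps gradient to time; it is deliberately not inverted, in line with the reasoning asked for in part (c).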

This morning, Billy walked one kilometre up a hill, and it took 22 minutes.

(c) Explain why it would be inappropriate to use the equation found in part (a) to estimate the gradient of this hill. [1]

Markscheme

EITHER

using the T on G regression line cannot (always) reliably make a prediction for G R1

OR

equation is for Time on Gradient not Gradient on Time. R1

OR

this estimate is an extrapolation R1

OR

there is no reason to assume this new hill has constant gradient R1

[1 mark]

10. [Maximum mark: 8]


Mrs Whitehouse is a chemistry teacher. After grading her final exams, she creates the following box and whisker diagram to
compare the grades of her two classes.

(a) Identify which two of the following statements must be true according to the box and whisker diagram. Indicate
your choices by placing tick marks in the second column of the following table.
[2]

Markscheme

A1A1

Note: Award A0A0 if three or four statements are selected.

[2 marks]

At the end of the year, Mrs Whitehouse surveyed a random sample of students from each of her two large classes to determine how
satisfied they were with her teaching.

Each student independently selected a value from 1 to 10, with 1 meaning that they were not satisfied at all and 10 meaning that
they were very satisfied.

Her collected data from the student surveys is shown.

Mrs Whitehouse believes that there was no difference in the general satisfaction between the two classes. She assumes that the
data is drawn from a population that can be modelled by a normal distribution and proposes to conduct a t-test at the 5 %
significance level.

(b) Write down the null and alternative hypotheses for her test. [2]

Markscheme

EITHER

H0 : μ1 = μ2 A1
H1 : μ1 ≠ μ2 A1

OR

H0 : μA = μB A1

H1 : μA ≠ μB A1

Note: Accept an equivalent statement in words, but must include reference to “population mean” / “mean for class A and class
B” for the A1 to be awarded.

Do not accept an imprecise “the means are equal”.

[2 marks]

(c) Find the p-value for her test. [2]

Markscheme

p-value = 0.0952 (0.0952085…) A2

[2 marks]

(d) Write down the conclusion to the test. Give a reason for your answer. [2]

Markscheme

0.0952 > 0.05 R1

there is insufficient evidence to reject H0 A1

Note: Do not award R0A1. The answer to part (d) MUST follow through from their hypotheses seen in part (b) and their p-value
seen in part (c); if hypotheses are incorrect/reversed, etc., the answer to part (d) must reflect this in order for the A1 to be
credited.

[2 marks]

11. [Maximum mark: 7]


Rita is playing a game. In the game, she must roll a fair six-sided die. If she gets a five or six then she wins a prize. If not, then she has
another chance but this time she must flip a fair coin which will result in the coin landing on heads or tails. If the coin lands on
heads, then Rita wins a prize.

(a) Complete the tree diagram by writing in the three missing probabilities.
[2]

Markscheme

A1A1

Note: Award A1 for completing first set of branches, A1 for completing second set of branches.

[2 marks]

(b) Find the probability that Rita does not win a prize. [2]

Markscheme

attempt to multiply along the branches (M1)

2/3 × 1/2

= 1/3 (= 0.333…) A1

[2 marks]

(c) Given that Rita won a prize, find the probability that she got a five or six when she rolled the die. [3]

Markscheme

EITHER
(1/3) / (1/3 + (2/3 × 1/2)) M1A1

Note: Award M1 for recognizing conditional probability, A1 for correct substitution.

OR
(1/3) / (1 − 1/3) M1A1

Note: Award M1 for recognizing conditional probability, A1 for correct substitution.

THEN

= 1/2 A1

[3 marks]
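
The tree-diagram calculations in parts (b) and (c) can be checked with exact fractions. This is a minimal Python sketch; the variable names are illustrative and not taken from the question.

```python
from fractions import Fraction

p_die_win = Fraction(2, 6)   # rolling a five or a six
p_coin_win = Fraction(1, 2)  # coin lands on heads

# (b) no prize: fail on the die, then fail on the coin
p_no_prize = (1 - p_die_win) * (1 - p_coin_win)

# (c) conditional probability P(five or six | prize)
p_prize = p_die_win + (1 - p_die_win) * p_coin_win
p_die_given_prize = p_die_win / p_prize
```

Both expressions for the denominator in the markscheme, 1/3 + (2/3 × 1/2) and 1 − 1/3, correspond to `p_prize` here.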

12. [Maximum mark: 6]


Nicole works at a local school 5 days each week. She drives an old car to work that has a 72% probability of starting on any given
morning. The probability of the car starting on a given morning is independent of it starting on any other morning.

(a) Find the probability that Nicole’s car starts on exactly three mornings in a particular 5 day workweek. [2]

Markscheme

evidence of using binomial distribution (M1)

Note: Evidence is X ~ B(5, 0.72) or binomial with n = 5, p = 0.72.

0.293 (0.292626…) A1

[2 marks]

Nicole walks to work on mornings when her car does not start and it is not raining. Nicole takes the bus to work on mornings when
her car does not start and it is raining.

Where Nicole lives, there is a 42 % probability of rain on any given morning, independent of any other morning. The probability of
Nicole’s car starting is independent of the weather.

(b) Find the probability that Nicole will not have to take the bus in a particular workweek. [4]

Markscheme

attempt to find the probability of taking a bus (or not taking a bus) (M1)

P(take bus) = 0.28 × 0.42, P(not take bus) = 0.72 + 0.28 × 0.58

0.1176 or 0.8824 seen (A1)

EITHER

correct use of binomial distribution with their probability

X ~ B(5, 0.1176), X = 0 OR X ~ B(5, 0.8824), X = 5 (A1)

OR

(1 − 0.1176)^5 OR (0.8824)^5 seen (A1)

THEN

0.535 (0.534967…) A1

[4 marks]
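
Part (b) combines independence within a morning with independence across the five mornings. A short Python sketch of both parts, with hypothetical variable names, might look like this:

```python
from math import comb

P_START = 0.72  # car starts on a given morning
P_RAIN = 0.42   # rain on a given morning
DAYS = 5

# (a) exactly three starts in five mornings, X ~ B(5, 0.72)
p_three_starts = comb(DAYS, 3) * P_START**3 * (1 - P_START) ** 2

# (b) the bus is needed only when the car fails AND it rains
p_bus = (1 - P_START) * P_RAIN
p_no_bus_one_day = 1 - p_bus
p_no_bus_all_week = p_no_bus_one_day**DAYS
```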
13. [Maximum mark: 6]
Thurston believes that more popular musical artists sell more albums.

He begins to investigate this belief by randomly selecting eight musical artists and collecting data on the number of followers each
of the artists has on a particular social media platform. He then collects data on the number of albums each artist sold in the first
week after releasing an album. His data is shown in Table 1.

Thurston decides to calculate the Spearman’s rank correlation coefficient.

(a) Complete the table of ranks shown in Table 2.

Table 2

[1]

Markscheme

A1

[1 mark]

(b) Calculate the value of r_s, Spearman’s rank correlation coefficient. [2]

Markscheme

(r_s =) 0.595 (0.595238…) A2


[2 marks]

Thurston believes that artists with a higher number of social media followers sell more albums in the first week. He carries out a
hypothesis test using a 10% significance level with the following null hypothesis:

H0 : In the population, there is no monotonic relationship between the number of social media followers and the number of
albums sold in the first week.

(c) Write down Thurston’s alternative hypothesis. [1]

Markscheme

(H1:) In the population, there is a positive monotonic relationship between the number of social media followers and the
number of albums sold in the first week. A1

[1 mark]

The critical value of r_s for this test is 0.643.

(d) State the conclusion of the hypothesis test, giving a reason. [2]

Markscheme

0.595 < 0.643 R1

there is insufficient evidence to reject H0 A1

Note: Do not award R0A1.

[2 marks]

14. [Maximum mark: 7]


Joel is a keen cyclist who keeps a record of his performance. The following table shows the time, in minutes, it takes him to ride one
kilometre on hills with different gradients. The gradient of each hill is constant.

(a.i) Find the equation of the regression line of T on G. [2]

Markscheme

T = 0.799G + 2.14 (= 0.798803…G + 2.13972…) A1A1

Note: Award A1 for correct values of a and b, A1 for an equation using these correct values.

[2 marks]

(a.ii) Describe the correlation between T and G with reference to the value of r, the Pearson’s product-moment
correlation coefficient. [2]
Markscheme

(r =) 0.996 (= 0.996247…) A1

(there is a very) strong positive linear correlation R1

Note: If r is missing award A0R0.

[2 marks]

On Saturday, Joel intends to ride a hill with a gradient of 17%.

(b) Estimate the time it will take Joel to ride one kilometre up the hill. [2]

Markscheme

attempt to substitute 17 into their regression equation (M1)

0.798803… (17) + 2.13972…

15.7 (mins) (= 15.7193…) A1

[2 marks]

This morning, Joel rode one kilometre up a hill, and it took 22 minutes.

(c) Explain why it would be inappropriate to use the equation found in part (a) to estimate the gradient of this hill. [1]

Markscheme

EITHER

using the T on G regression line cannot (always) reliably make a prediction for G R1

OR

equation is for Time on Gradient, not Gradient on Time R1

OR

this estimate is an extrapolation R1

OR

there is no reason to assume this new hill has constant gradient R1

[1 mark]

15. [Maximum mark: 6]


The decathlon is a competition where athletes compete in ten events. Two of those events are long jump and high jump. In both
events, a greater distance means a better ranking.

The table shows results for these two events at the World Championships.
Athlete’s Country   Long Jump (m)   High Jump (m)   Long Jump Rank   High Jump Rank
Germany             7.64            2.11            1
France              7.52            2.08            2
Estonia             7.49            1.84            3
Canada              7.44            2.02            4
Netherlands         7.33            2.05            5
Ukraine             7.28            2.02            6
Algeria             7.22            1.90            7
Austria             7.11            1.87            8
Grenada             6.98            1.99            9
Japan               6.64            1.96            10

The Spearman’s rank correlation coefficient is used to determine if there is a linear correlation between an athlete’s ranking in long
jump and their ranking in high jump.
(a) Complete the table to show the athletes’ rankings in high jump. [2]

Markscheme

Athlete’s Country   Long Jump (m)   High Jump (m)   Long Jump Rank   High Jump Rank
Germany             7.64            2.11            1                1
France              7.52            2.08            2                2
Estonia             7.49            1.84            3                10
Canada              7.44            2.02            4                4.5
Netherlands         7.33            2.05            5                3
Ukraine             7.28            2.02            6                4.5
Algeria             7.22            1.90            7                8
Austria             7.11            1.87            8                9
Grenada             6.98            1.99            9                6
Japan               6.64            1.96            10               7

A1A1

Note: Award A1 for ranking of tied heights, A1 for correct ranking of non-tied heights.

[2 marks]

(b) Find the value of the Spearman’s rank correlation coefficient r_s. [2]

Markscheme

(r_s =) 0.541 (0.541035…) A2

Note: Award A2 for an answer of 0.539 (0.539393…) from use of the formula for Spearman’s rank correlation coefficient
when data has tied ranks.

[2 marks]
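
The ranking in part (a) and the value 0.541035… in part (b) can be reproduced by ranking the data with averaged ties and then computing Pearson's coefficient on the ranks, which is how a GDC is typically used for this question. The Python sketch below is an illustration under that assumption; `average_ranks` and `pearson` are hypothetical helper names.

```python
from math import sqrt

long_jump = [7.64, 7.52, 7.49, 7.44, 7.33, 7.28, 7.22, 7.11, 6.98, 6.64]
high_jump = [2.11, 2.08, 1.84, 2.02, 2.05, 2.02, 1.90, 1.87, 1.99, 1.96]

def average_ranks(values):
    """Rank 1 is the largest value; tied values share the mean of their ranks."""
    ranks = []
    for v in values:
        greater = sum(1 for other in values if other > v)
        equal = sum(1 for other in values if other == v)
        ranks.append(greater + (equal + 1) / 2)
    return ranks

def pearson(xs, ys):
    """Pearson's product-moment correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

long_ranks = average_ranks(long_jump)
high_ranks = average_ranks(high_jump)
r_s = pearson(long_ranks, high_ranks)
```

Canada and Ukraine share the height 2.02 m, so each receives the average of ranks 4 and 5.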

The following guide is used by the coach to determine the strength of the correlation between the ranks for long jump and high
jump.

|r_s|             Strength
0.000 to 0.199    Very weak
0.200 to 0.399    Weak
0.400 to 0.599    Moderate
0.600 to 0.799    Strong
0.800 to 1.000    Very strong

(c) State the strength of the correlation between the rankings as indicated by the table and interpret this in the
context of the question. [2]

Markscheme

moderate (correlation) A1

as long jump ranking increases, high jump ranking will (likely) increase A1

[2 marks]

16. [Maximum mark: 6]


Carys believes that, on a memory retention test, the mean score of bilingual people (μb) will be higher than the mean score of
monolingual people (μm). Carys gave a memory retention test to a random sample of students in her class. The results are shown in
the two tables.

Carys performs a one-tailed t-test at a 5% level of significance. It is assumed that the scores are normally distributed and the
samples have equal variances.

(a) State the null and alternative hypotheses. [2]


Markscheme

H0 : μb = μm A1

H1 : μb > μm A1

Note: Accept equivalent statements in words such as “the mean score of bilingual people equals the mean score of
monolingual people”.

[2 marks]

(b) Calculate the p-value for this test. [2]

Markscheme

0.119 (0.119395…) A2

[2 marks]

(c) State the conclusion of the test in the context of the question. Justify your answer. [2]

Markscheme

0.119395… > 0.05 (11.9395…% > 5%) R1

(fail to reject H0) there is insufficient evidence to suggest that bilingual people have better memory retention than
monolingual people A1

Note: Do not award R0A1.

The answer to part (c) MUST be consistent with their hypotheses and their p-value.

[2 marks]

17. [Maximum mark: 7]


On a specific day, the speed of cars as they pass a speed camera can be modelled by a normal distribution with a mean of
67.3 km h⁻¹.

A speed of 75.7 km h⁻¹ is two standard deviations from the mean.

(a) Find the standard deviation for the speed of the cars. [2]

Markscheme

attempt to find the difference between 75.7 and 67.3 (M1)

(75.7 − 67.3)/2

4.2 (km h⁻¹) A1

[2 marks]

Speeding tickets are issued to all drivers travelling at a speed greater than 72 km h⁻¹.

(b) Find the probability that a randomly selected driver who passes the speed camera receives a speeding ticket. [2]

Markscheme

recognition of normal distribution that includes 72 (M1)

e.g., sketch of normal distribution curve with 72 labelled to the right of the mean OR Normal CDF calculation using 72

0.132 (0.131559…, 13.2%, 13.1559…%) A1

[2 marks]

It is found that 82% of cars on this road travel at speeds between p km h⁻¹ and q km h⁻¹, where p < q. This interval includes
cars travelling at a speed of 74 km h⁻¹.

(c) Show that the region of the normal distribution between p and q is not symmetrical about the mean. [3]

Markscheme

METHOD 1 (Comparing areas above and below the mean)

P(67.3 < speed < 74) OR Normal CDF(67.3, 74, 67.3, 4.2) OR sketch of normal distribution with 67.3 and 74 labelled
and shaded between (M1)

area of region between mean and q is at least 0.445 (0.444670…) A1

Hence no more than 0.375 (0.375329…) between mean and p R1

The region between p and q is not symmetrical AG

METHOD 2 (Comparing areas in the tails)

attempt to calculate probability that speed < p and speed > q with q = 74 (M1)

P(speed < 74) = 0.944670…

P(speed < p) = (0.944670… − 0.82 =) 0.124670…

P(speed > q) = (1 − 0.944670… =) 0.0553295… A1

if q ≥ 74, then P(speed > q) ≤ 0.0553295 and P(speed < p) ≥ 0.124670 so

P(speed > q) will never equal P(speed < p) R1


the region between p and q is not symmetrical AG

METHOD 3 (Assumption of symmetry comparing speeds)

attempt to calculate area below q assuming distribution is symmetrical (M1)

e.g. P(speed < q) = 0.82 + 1/2 × 0.18 (= 0.91)

EITHER

(q =) 72.9 (72.9311…) A1

72.9 < 74 so 74 would not be in the region R1

the region between p and q is not symmetrical AG

OR

P(speed < 74) = 0.945 (0.944670…) A1

0.945 > 0.91 so 74 would not be in the region R1

the region between p and q is not symmetrical AG

METHOD 4 (Assumption of symmetry comparing areas)

attempt to calculate symmetrical area with 74 as a boundary (M1)

P(60.6 < speed < 74) OR Normal CDF(60.6, 74, 67.3, 4.2) OR

P(67.3 < speed < 74) OR Normal CDF(67.3, 74, 67.3, 4.2)

EITHER

0.889 (0.889340…) A1

0.889 > 0.82 so 74 would not be in the region R1

the region between p and q is not symmetrical AG

OR

0.445 (0.444670…) A1

0.445 > 0.82 ÷ 2 so 74 would not be in the region R1

the region between p and q is not symmetrical AG

[3 marks]
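
The Normal CDF values quoted in parts (a) to (c) can be reproduced with `statistics.NormalDist` from the Python standard library. The sketch below follows Methods 2 and 3; the variable names are illustrative only.

```python
from statistics import NormalDist

mean_speed = 67.3
sd_speed = (75.7 - mean_speed) / 2  # (a) 75.7 is two SDs above the mean
speeds = NormalDist(mean_speed, sd_speed)

# (b) probability of a speeding ticket
p_ticket = 1 - speeds.cdf(72)

# (c) Method 2: tail areas when q is as small as allowed (q = 74)
p_below_74 = speeds.cdf(74)
upper_tail = 1 - p_below_74
lower_tail = p_below_74 - 0.82

# (c) Method 3: the q that a symmetric 82% region would need
q_if_symmetric = speeds.inv_cdf(0.82 + 0.18 / 2)
```

With q = 74 the upper tail is already smaller than the lower tail, and a symmetric region would end below 74, so either comparison supports the given answer.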
18. [Maximum mark: 7]
In a school, 200 students solved a problem in a mathematics competition. Their times to solve the problem were recorded and the
following cumulative frequency graph was produced.

(a) Use the graph to find

(a.i) the median time; [1]

Markscheme

38 (s) A1

Note: Accept a tolerance of ±0.5 for parts (a)(i)-(iii).

[1 mark]

(a.ii) the lower quartile; [1]

Markscheme

32 (s) A1

Note: Accept a tolerance of ±0.5 for parts (a)(i)-(iii).

[1 mark]
(a.iii) the upper quartile; [1]

Markscheme

42 (s) A1

Note: Accept a tolerance of ±0.5 for parts (a)(i)-(iii).

[1 mark]

(a.iv) the interquartile range. [1]

Markscheme

10 (s) A1

Note: Accept a tolerance of ±0.5 for parts (a)(i)-(iii).

[1 mark]

Cedric took 14 seconds to solve the problem.

(b) Determine whether Cedric’s time is an outlier. [3]

Markscheme

1.5 × IQR (M1)

(32 − 1.5 × 10 =) 17 (s) A1

14 < 17, therefore it is an outlier R1

Note: Do not award the R1 unless an explicit comparison of 14 and their 17 is seen.

e.g. 14 < 17 OR

14 is outside the interval [17, 57].

[3 marks]
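
The outlier test in part (b) uses the 1.5 × IQR rule. A minimal Python sketch, with the hypothetical helper `outlier_fences`, is shown below using the quartiles read from the graph.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) bounds beyond which a value is an outlier."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

lower, upper = outlier_fences(32, 42)
cedric_time = 14
cedric_is_outlier = not (lower <= cedric_time <= upper)
```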

19. [Maximum mark: 6]


At a running club, Sung-Jin conducts a test to determine if there is any association between an athlete’s age and their best time
taken to run 100 m. Eight athletes are chosen at random, and their details are shown below.
Athlete          A     B     C     D     E     F     G     H
Age (years)      13    17    22    18    19    25    11    36
Time (seconds)   13.4  14.6  13.4  12.9  12.0  11.8  17.0  13.1

Sung-Jin decides to calculate the Spearman’s rank correlation coefficient for his set of data.
(a) Complete the table of ranks.

Athlete          A     B     C     D     E     F     G     H
Age rank                     3
Time rank                                            1
[2]

Markscheme

Athlete          A     B     C     D     E     F     G     H
Age rank         7     6     3     5     4     2     8     1
Time rank        3.5   2     3.5   6     7     8     1     5

A1A1

Note: Award A1 for each correct row.

[2 marks]

(b) Calculate the Spearman’s rank correlation coefficient, r_s. [2]

Markscheme

r_s = −0.671 (−0.670670…) A2

Note: Only follow through from an incorrect table provided the ranks are all between 1 and 8.

Award A1 for −0.67 OR for the omission of the negative sign, e.g. 0.671 (0.670670…) or 0.67

[2 marks]

(c) Interpret this value of r_s in the context of the question. [1]

Markscheme

(A value of r_s = −0.671) indicates a negative correlation between a person’s age and the best time they take to run 100 m. R1
Note: Condone any comment that includes “weak” or “strong” etc. Accept an interpretation in words, but only if there is a
general link described and not a rule: “The older a person gets, the faster they tend to run”. Answer must be in context.

[1 mark]

(d) Suggest a mathematical reason why Sung-Jin may have decided not to use Pearson’s product-moment
correlation coefficient with his data from the original table. [1]

Markscheme

Award R1 for any sensible reason: R1

The correlation, such as it is, is unlikely to be linear for this type of data.

Spearman’s CC is less sensitive to outliers

Sung-Jin is not sure the data is drawn from a bivariate normal distribution

There are outliers/extreme data

Same time for two athletes with significantly different ages

[1 mark]

20. [Maximum mark: 4]


The following frequency distribution table shows the test grades for a group of students.

Grade 1 2 3 4 5 6 7

Frequency 1 4 7 9 p 9 4

For this distribution, the mean grade is 4.5.

(a) Write down the total number of students in terms of p. [1]

Markscheme

34 + p A1

[1 mark]

(b) Calculate the value of p. [3]

Markscheme

attempt to substitute into the mean formula, equating to 4.5 (M1)

(1×1 + 2×4 + … + 5×p + 6×9 + 7×4)/(34 + p) = 4.5 A1

(p =) 10 A1

Note: Do not award the final A1 if final answer is not an integer.

Award (M1)A0A1 for an unsupported answer of (p =) 10.

[3 marks]
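
The equation for p is linear once both sides are multiplied by (34 + p). The Python sketch below rearranges it with exact fractions; the dictionary `known_frequencies` simply restates the table with the unknown grade 5 entry left out.

```python
from fractions import Fraction

# Frequencies from the table, excluding the unknown frequency p for grade 5.
known_frequencies = {1: 1, 2: 4, 3: 7, 4: 9, 6: 9, 7: 4}
UNKNOWN_GRADE = 5
target_mean = Fraction(9, 2)  # 4.5

known_total = sum(known_frequencies.values())
known_weighted = sum(grade * f for grade, f in known_frequencies.items())

# (known_weighted + 5p) / (known_total + p) = 4.5, solved for p.
p = (target_mean * known_total - known_weighted) / (UNKNOWN_GRADE - target_mean)

# Check by recomputing the mean with p included.
check_mean = (known_weighted + UNKNOWN_GRADE * p) / (known_total + p)
```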

21. [Maximum mark: 6]


A company that owns many restaurants wants to determine if there are differences in the quality of the food cooked for three
different meals: breakfast, lunch and dinner.

Their quality assurance team randomly selects 500 items of food to inspect. The quality of this food is classified as perfect,
satisfactory, or poor. The data is summarized in the following table.

An item of food is chosen at random from these 500.

(a) Find the probability that its quality is not perfect, given that it is from breakfast. [2]

Markscheme

0.565 (0.564655…, 131/232, 56.4655…%) A1A1

Note: Award A1 for correct numerator, A1 for correct denominator.

[2 marks]

A χ² test at the 5% significance level is carried out to determine if there is significant evidence of a difference in the quality of the food cooked for the three meals.

The critical value for this test is 9.488.

The hypotheses for this test are:


H0 : The quality of the food and the type of meal are independent.
H1 : The quality of the food and the type of meal are not independent.
(b) Find the χ² statistic. [2]

Markscheme

11. 0 (11. 0212 …) A2

Note: Award A1 for a final answer of 11 if no unrounded answer is seen.

[2 marks]

(c) State, with justification, the conclusion for this test. [2]

Markscheme

EITHER

11.0 > 9.488 (11.0212… > 9.488) R1

OR

0.0263 < 0.05 (0.0263264… < 0.05) R1

THEN

EITHER

(there is significant evidence to) reject H 0 A1

OR

(there is significant evidence that) the (food) quality and the type of meal are not independent A1

Note: Do not award R0A1.

Award R1 for χ²_calc > χ²_crit, provided the calculated value is explicitly seen in part (b).

Accept “p-value < significance level” provided their p-value is seen and their p-value is between 0 and 1.

[2 marks]
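The contingency table is not reproduced in this text, so the statistic itself cannot be recomputed here, but the decision step can be checked from the quoted value 11.0212. The sketch below assumes 4 degrees of freedom, (3 − 1)(3 − 1) for a 3 × 3 table, which is consistent with the critical value 9.488. For 4 degrees of freedom the upper-tail probability has the closed form exp(−x/2)(1 + x/2); the function name is hypothetical.

```python
import math

def chi2_sf_df4(x: float) -> float:
    """Upper-tail probability P(X > x) for a chi-squared variable with 4 degrees of freedom."""
    return math.exp(-x / 2) * (1 + x / 2)

chi2_calc = 11.0212   # statistic quoted in part (b)
chi2_crit = 9.488     # critical value given in the question
alpha = 0.05          # significance level

p_value = chi2_sf_df4(chi2_calc)
reject_h0 = chi2_calc > chi2_crit   # same decision as p_value < alpha
print(round(p_value, 4), reject_h0)
```

Both routes in the markscheme (statistic against critical value, p-value against significance level) lead to the same decision, which the last two lines make explicit.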

22. [Maximum mark: 6]


The lengths of the seeds from a particular mango tree are approximated by a normal distribution with a mean of 4 cm and a
standard deviation of 0. 25 cm.

A seed from this mango tree is chosen at random.

(a) Calculate the probability that the length of the seed is less than 3.7 cm. [2]

Markscheme

X ~ N(4, 0.25²)

EITHER

correct probability expression (M1)

P(X < 3. 7)

Note: Accept a weak or strict inequality, and any label instead of X, e.g. length or L.

OR

normal curve with vertical line, left of mean, labelled 3. 7, and shaded region (M1)

THEN

0. 115 (0. 115069 … , 11. 5%) A1

Note: Award M1A0 for 0. 12 if no previous working.

[2 marks]

It is known that 30% of the seeds have a length greater than k cm.

(b) Find the value of k. [2]

Markscheme

EITHER

Correct probability expression (M1)

P(X < k) = 0. 7 OR P(X > k) = 0. 3

Note: Accept a weak or strict inequality, and any label instead of X e.g., length or L.
OR

normal curve with vertical line to the right of the mean and shaded region, correctly labelled either 0. 3 or 0. 7 (M1)

THEN

(k =) 4. 13 (4. 13110 …) A1

Note: Award M1A0 for 4. 1 if no previous working.

[2 marks]

For a seed of length d cm, chosen at random, P(4 − m < d < 4 + m) = 0. 6 .

(c) Find the value of m. [2]

Markscheme

EITHER

correct probability equation (M1)

P(length < 4 + m) = 0. 8 OR P(length < 4 − m) = 0. 2

Note: Accept any letter instead of “length” e.g., X or L.

OR

normal curve with vertical lines symmetrical about the mean line with a correct indication of an area of 0. 6 or 0. 2 or 0. 8
(M1)

THEN

0. 210 (0. 210405 …) A1


Note: Award (M1)A0 for an answer of 3. 7895 or 4. 2105 seen without working. Condone 0. 21 seen and award (M1)A1.

[2 marks]
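The three answers to this question can be reproduced without a GDC. A minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

seed_length = NormalDist(mu=4, sigma=0.25)

# (a) P(X < 3.7)
p_short = seed_length.cdf(3.7)

# (b) 30% of seeds are longer than k, so P(X < k) = 0.7
k = seed_length.inv_cdf(0.7)

# (c) P(4 - m < X < 4 + m) = 0.6 leaves 0.2 in each tail, so P(X < 4 + m) = 0.8
m = seed_length.inv_cdf(0.8) - seed_length.mean

print(round(p_short, 3), round(k, 2), round(m, 3))
```

Part (c) relies on the symmetry of the normal curve about the mean, which is why the inverse is taken at 0.8 rather than 0.6.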

23. [Maximum mark: 5]


In a game, balls are thrown to hit a target. The random variable X is the number of times the target is hit in five attempts. The
probability distribution for X is shown in the following table.

x 0 1 2 3 4 5

P(X = x) 0. 15 0. 2 k 0. 16 2k 0. 25

(a) Find the value of k. [2]

Markscheme

0. 15 + 0. 2 + k + 0. 16 + 2k + 0. 25 = 1 (M1)

k = 0. 08 A1

[2 marks]

The player has a chance to win money based on how many times they hit the target.

The gain for the player, in $, is shown in the following table, where a negative gain means that the player loses money.

x 0 1 2 3 4 5

Player’s gain ($) −4 −3 −1 0 1 4

(b) Determine whether this game is fair. Justify your answer. [3]

Markscheme

(−4 × 0. 15) + (−3 × 0. 2) + (−1 × 0. 08) + (0 × 0. 16) + (1 × 0. 16) + (4 × 0. 25) (M1)

= −0. 12 A1

E(X) ≠ 0 therefore the game is not fair R1


Note: Do not award A0R1 without an explicit value for E(X) seen. The R1 can be awarded for comparing their E(X) to zero
provided working is shown.

[3 marks]
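Both parts can be verified with exact arithmetic. This is a sketch under the assumption that the two tables above are read column by column; the names are illustrative.

```python
from fractions import Fraction as F

# Known probabilities for x = 0, 1, 3, 5; the entries for x = 2 and x = 4 are k and 2k.
fixed = [F("0.15"), F("0.2"), F("0.16"), F("0.25")]
k = (1 - sum(fixed)) / 3          # k + 2k must bring the total to 1

probs = [F("0.15"), F("0.2"), k, F("0.16"), 2 * k, F("0.25")]
gains = [-4, -3, -1, 0, 1, 4]     # player's gain in $ for x = 0..5

expected_gain = sum(g * p for g, p in zip(gains, probs))
is_fair = expected_gain == 0
print(float(k), float(expected_gain), is_fair)
```

A game is fair exactly when the expected gain is zero, so the final comparison mirrors the R1 reasoning.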

24. [Maximum mark: 7]


The following Venn diagram shows two independent events, R and S . The values in the diagram represent probabilities.

(a) Find the value of x. [3]

Markscheme

attempting to use P(R ∩ S) = P(R)P(S) (M1)

0. 2 = 0. 8(0. 2 + x) (A1)

x = 0. 05 A1

[3 marks]

(b) Find the value of y. [2]

Markscheme

x + 0. 2 + 0. 6 + y = 1 (M1)

y = 0. 15 A1

[2 marks]

(c) Find P(R′|S′). [2]

Markscheme
METHOD 1

attempting to apply P(R′|S′) = P(R′∩S′) / P(S′) (M1)

= 0.15 / 0.2

= 0.75 A1

METHOD 2

P(R′|S′) = P(R′) (because R, S are independent) (M1)

= 1 − 0.25 = 0.75 A1

Note: FT from their values of x or y.

[2 marks]

25. [Maximum mark: 5]

Roy is a member of a motorsport club and regularly drives around the Port Campbell racetrack.

The times he takes to complete a lap are normally distributed with mean 59 seconds and standard deviation 3 seconds.

(a) Find the probability that Roy completes a lap in less than 55 seconds. [2]

Markscheme

P(T < 55) (M1)

0.0912 (0.0912112…) A1

Note: Award M1 for a correct calculator notation such as normal cdf (0, 55, 59, 3) or normal cdf (−1E99, 55, 59, 3).

[2 marks]

Roy will complete a 20 lap race. It is expected that 8.6 of the laps will take more than t seconds.

(b) Find the value of t. [3]

Markscheme

correct use of expected value

8.6 = 20 × p OR (p =) 0.43 seen (M1)

EITHER

correct probability statement

P(T > t) = 0. 43 OR P (T < t) = 0. 57 (M1)

OR

t indicated on sketch to communicate correct area (M1)

THEN

(t =) 59. 5 (seconds) (59. 5291 …) A1

[3 marks]
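The lap-time answers (mean 59 s, standard deviation 3 s, 8.6 of 20 laps expected to exceed t) can be checked as follows. A minimal sketch with the standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

lap_time = NormalDist(mu=59, sigma=3)

# Probability that a lap takes less than 55 seconds
p_fast = lap_time.cdf(55)

# 8.6 of 20 laps are expected to exceed t, so P(T > t) = 8.6 / 20
p_slow = 8.6 / 20
t = lap_time.inv_cdf(1 - p_slow)

print(round(p_fast, 4), round(p_slow, 2), round(t, 1))
```

The expected-value step only converts a count of laps into a probability; the rest is an ordinary inverse-normal calculation.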

26. [Maximum mark: 7]


Taizo plays a game where he throws one ball at two bottles that are sitting on a table. The probability of knocking over bottles, in
any given game, is shown in the following table.

(a) Taizo plays two games that are independent of each other. Find the probability that Taizo knocks over a total of
two bottles. [4]

Markscheme

0. 5 × 0. 1 + 0. 4 × 0. 4 + 0. 1 × 0. 5 (M1)(M1)(M1)

Note: Award M1 for 0. 5 × 0. 1 or 0. 1 × 0. 5, M1 for 0. 4 × 0. 4, M1 for adding three correct products.

0. 26 A1

[4 marks]

In any given game, Taizo will win k points if he knocks over two bottles, win 4 points if he knocks over one bottle and lose 8 points
if no bottles are knocked over.
(b) Find the value of k such that the game is fair. [3]

Markscheme

0 = −8 × 0. 5 + 4 × 0. 4 + 0. 1k (M1)(M1)

Note: Award M1 for correct substitution into the formula for expected value, award M1 for the expected value formula equated
to zero.

(k =) 24 (points) A1

[3 marks]
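The probability table for this question is not shown in this text. The sketch below assumes the values implied by the markscheme working (0.5 for no bottles, 0.4 for one, 0.1 for two) and checks both parts exactly; the names are illustrative.

```python
from fractions import Fraction as F

# Probabilities of knocking over 0, 1 or 2 bottles in one game,
# inferred from the markscheme working (assumption: the table is not reproduced here).
p = {0: F(1, 2), 1: F(2, 5), 2: F(1, 10)}

# (a) Two independent games totalling two bottles: (0, 2), (1, 1) or (2, 0)
p_total_two = sum(p[a] * p[2 - a] for a in p)

# (b) Fair game: -8 * P(0) + 4 * P(1) + k * P(2) = 0
k = (8 * p[0] - 4 * p[1]) / p[2]

print(float(p_total_two), k)
```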

27. [Maximum mark: 5]


Sergio is interested in whether an adult’s favourite breakfast berry depends on their income level. He obtains the following data for 341 adults and decides to carry out a χ² test for independence, at the 10% significance level.

(a) Write down the null hypothesis. [1]

Markscheme

The favourite breakfast/berry (of adults) is independent of (their) income (level). A1

[1 mark]

(b) Find the value of the χ² statistic. [2]

Markscheme

χ² = 2.27 (2.26821…) A2

[2 marks]

The critical value of this χ² test is 7.78.

(c) Write down Sergio’s conclusion to the test in context. Justify your answer. [2]
Markscheme

EITHER

2.27 < 7.78 OR 2.27 < critical value R1

OR

0.687 > 0.1 (using p-value) R1

THEN

(Do not reject H0)

Insufficient evidence (at the 10% significance level) that the favourite berry depends on income level. A1

Note: Do not award R0A1. Accept “χ²” in place of their “2.27”, provided an answer was seen in part (b). Their conclusion must be consistent with their χ² (or a correct p-value) and their hypothesis.

[2 marks]

28. [Maximum mark: 5]


Manny and Annabelle, mathematics teachers at Burnham High School, give their students the same examination. A random
sample of the examination scores were collected from each of their classes.

Annabelle uses these scores to conduct a two-tailed t-test to compare the means of the two classes, at the 5% level of significance.
It is assumed the examination scores for both classes have the same variance and are normally distributed.

The null hypothesis is μ1 = μ2, where μ1 is the mean examination score from Manny’s class and μ2 is the mean examination score from Annabelle’s class.

(a) Write down the alternative hypothesis. [1]

Markscheme

(H1 :) μ1 ≠ μ2 A1

Note: Accept an equivalent statement in words referring to μ1 and μ2 as defined in the question.

[1 mark]

(b) Find the p-value for this test. Give your answer correct to five decimal places. [2]
Markscheme

0. 97652 (0. 976516 …) A2

[2 marks]

Annabelle concludes there is insufficient evidence to reject the null hypothesis.

(c) State whether Annabelle’s conclusion is correct. Give a reason for your answer. [2]

Markscheme

0. 97652 > 0. 05 (0. 977 > 0. 05) R1

Annabelle’s conclusion is correct. A1

Note: Do not award R0A1. Answer must reference Annabelle’s conclusion; do not accept an answer, without context, of “fail to reject H0” for the A1 mark.

[2 marks]

29. [Maximum mark: 7]


In the first month of a reforestation program, the town of Neerim plants 85 trees. Each subsequent month the number of trees
planted will increase by an additional 30 trees.

The number of trees to be planted in each of the first three months are shown in the following table.

(a) Find the number of trees to be planted in the 15th month. [3]

Markscheme

use of the nth term of an arithmetic sequence formula (M1)

u_15 = 85 + (15 − 1) × 30 (A1)

505 A1

[3 marks]

(b) Find the total number of trees to be planted in the first 15 months. [2]
Markscheme

use of the sum of n terms of an arithmetic sequence formula (M1)

S_15 = (15/2)(85 + 505) OR (15/2)(2 × 85 + (15 − 1) × 30)

4430 (4425) A1

[2 marks]

(c) Find the mean number of trees planted per month during the first 15 months. [2]

Markscheme

4425/15 OR 85 + (8 − 1) × 30 (M1)

295 A1

Note: Accept 295. 333 … from use of 3sf value from part (b).

[2 marks]
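The three arithmetic-sequence results can be confirmed with a few lines of integer arithmetic. A minimal sketch; the variable names are illustrative.

```python
first_term, difference, months = 85, 30, 15

# (a) nth term of an arithmetic sequence: u_n = u_1 + (n - 1) * d
u_15 = first_term + (months - 1) * difference

# (b) sum of the first n terms: S_n = n/2 * (u_1 + u_n)
s_15 = months * (first_term + u_15) // 2

# (c) mean per month; for an arithmetic sequence with 15 terms this equals the 8th term
mean_per_month = s_15 / months
eighth_term = first_term + (8 - 1) * difference

print(u_15, s_15, mean_per_month, eighth_term)
```

The exact total is 4425; the markscheme's 4430 is the same value given to three significant figures, which is why 295.333… is also accepted in part (c).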

30. [Maximum mark: 6]


A factory produces bags of sugar with a labelled weight of 500 g. The weights of the bags are normally distributed with a mean of
500 g and a standard deviation of 3 g.

(a) Write down the percentage of bags that weigh more than 500 g. [1]

Markscheme

50% A1

Note: Do not accept 0.5 or 1/2.

[1 mark]

A bag that weighs less than 495 g is rejected by the factory for being underweight.

(b) Find the probability that a randomly chosen bag is rejected for being underweight. [2]

Markscheme

0. 0478 (0. 0477903 … , 4. 78%) A2


[2 marks]

(c) A bag that weighs more than k grams is rejected by the factory for being overweight. The factory rejects 2% of
bags for being overweight.

Find the value of k. [3]

Markscheme

P(X < k) = 0. 98 OR P(X > k) = 0. 02 (M1)

Note: Award (M1) for a sketch with correct region identified.

506 g (506. 161 …) A2

[3 marks]
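Parts (b) and (c) are a normal cdf and an inverse normal calculation respectively. A minimal sketch with the standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

bag_weight = NormalDist(mu=500, sigma=3)

# (b) P(X < 495): bag rejected for being underweight
p_underweight = bag_weight.cdf(495)

# (c) 2% of bags are heavier than k, so P(X < k) = 0.98
k = bag_weight.inv_cdf(0.98)

print(round(p_underweight, 4), round(k, 3))
```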

31. [Maximum mark: 7]


Leo is investigating whether a six-sided die is fair. He rolls the die 60 times and records the observed frequencies in the following
table:

Leo carries out a χ² goodness of fit test at a 5% significance level.

(a) Write down the null and alternative hypotheses. [1]

Markscheme

H0 : The die is fair OR P(any number) = 1/6 OR probabilities are equal

H1 : The die is not fair OR P(any number) ≠ 1/6 OR probabilities are not equal A1

[1 mark]

(b) Write down the degrees of freedom. [1]

Markscheme

5 A1

[1 mark]
(c) Write down the expected frequency of rolling a 1. [1]

Markscheme

10 A1

[1 mark]

(d) Find the p-value for the test. [2]

Markscheme

(p-value =) 0. 287 (0. 28724163 …) A2

[2 marks]

(e) State the conclusion of the test. Give a reason for your answer. [2]

Markscheme

0. 287 > 0. 05 R1

EITHER

Insufficient evidence to reject the null hypothesis A1

OR

Insufficient evidence to reject that the die is fair A1

Note: Do not award R0A1. Condone “accept the null hypothesis” or “the die is fair”. Their conclusion must be consistent with their
p-value and their hypothesis.

[2 marks]

32. [Maximum mark: 6]


Karl has three brown socks and four black socks in his drawer. He takes two socks at random from the drawer.

(a) Complete the tree diagram.


[1]

Markscheme

A1

Note: Award A1 for both missing probabilities correct.

[1 mark]

(b) Find the probability that Karl takes two socks of the same colour. [2]

Markscheme

multiplying along branches and then adding outcomes (M1)

3/7 × 2/6 + 4/7 × 3/6

= 18/42 (= 3/7 ≈ 0.429 (42.9%)) A1

[2 marks]

(c) Given that Karl has two socks of the same colour find the probability that he has two brown socks. [3]

Markscheme

use of conditional probability formula M1


(3/7 × 2/6) / (3/7) A1

= 6/18 (= 1/3) (252/756, 0.333, 33.3%) A1

[3 marks]
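The tree-diagram calculations can be checked with exact fractions. A minimal sketch, assuming the socks are drawn without replacement as the question describes; the names are illustrative.

```python
from fractions import Fraction as F

brown, black = 3, 4
total = brown + black

# Multiply along the branches (second draw is from the remaining six socks)
p_both_brown = F(brown, total) * F(brown - 1, total - 1)
p_both_black = F(black, total) * F(black - 1, total - 1)

# (b) same colour: add the two matching outcomes
p_same = p_both_brown + p_both_black

# (c) conditional probability: P(both brown | same colour)
p_brown_given_same = p_both_brown / p_same

print(p_same, p_brown_given_same)
```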

33. [Maximum mark: 6]


A study was conducted to investigate whether the mean reaction time of drivers who are talking on mobile phones is the same as
the mean reaction time of drivers who are talking to passengers in the vehicle. Two independent groups were randomly selected
for the study.

To gather data, each driver was put in a car simulator and asked to either talk on a mobile phone or talk to a passenger. Each driver
was instructed to apply the brakes as soon as they saw a red light appear in front of the car. The reaction times of the drivers, in
seconds, were recorded, as shown in the following table.

At the 10% level of significance, a t-test was used to compare the mean reaction times of the two groups. Each data set is assumed
to be normally distributed, and the population variances are assumed to be the same.

Let μ1 and μ2 be the population means for the two groups. The null hypothesis for this test is H0 : μ1 − μ2 = 0.

(a) State the alternative hypothesis. [1]

Markscheme

(H1 :) μ1 − μ2 ≠ 0 (μ1 ≠ μ2) A1

Note: Accept an equivalent statement in words, however reference to “population mean” must be explicit for A1 to be
awarded.

[1 mark]

(b) Calculate the p-value for this test. [2]

Markscheme

0. 0778 (0. 0778465 …) A2

Note: Award A1 for an answer of 0. 0815486 … from not using a pooled estimate of the variance.
[2 marks]

(c.i) State the conclusion of the test. Justify your answer. [2]

Markscheme

0. 0778 < 0. 1 R1

reject the null hypothesis A1

Note: Do not award R0A1.

[2 marks]

(c.ii) State what your conclusion means in context. [1]

Markscheme

there is (significant evidence of ) a difference between the (population) mean reaction times A1

Note: Their conclusion in (c)(ii) must match their conclusion in (c)(i) to earn A1. Award A0 if their conclusion refers to mean
reaction times in the sample.

[1 mark]

34. [Maximum mark: 5]


A college runs a mathematics course in the morning. Scores for a test from this class are shown below.

25 33 51 62 63 63 70 74 79 79 81 88 90 90 98

For these data, the lower quartile is 62 and the upper quartile is 88.

(a) Show that the test score of 25 would not be considered an outlier. [3]

Markscheme

(88 − 62) × 1. 5 OR 26 × 1. 5 seen anywhere OR 39 seen anywhere (M1)

62 − 39

23 A1

25 > 23 R1

so is not an outlier AG
[3 marks]

The box and whisker diagram showing these scores is given below.

Test scores

Another mathematics class is run by the college during the evening. A box and whisker diagram showing the scores from this class
for the same test is given below.

Test scores

A researcher reviews the box and whisker diagrams and believes that the evening class performed better than the morning class.

(b) With reference to the box and whisker diagrams, state one aspect that may support the researcher’s opinion and
one aspect that may counter it. [2]

Markscheme

The median score for the evening class is higher than the median score for the morning class. A1

THEN

but the scores are more spread out in the evening class than in the morning class A1

OR

the scores are more inconsistent in the evening class A1

OR

the lowest scores are in the evening class A1

OR

the interquartile range is lower in the morning class A1

OR

the lower quartile is lower in the evening class A1

Note: If an incorrect comparison is also made, award at most A1A0.


Award A0 for a comparison that references “the mean score” unless working is shown for the estimated means of the data sets,
calculated from the mid-points of the 4 intervals. The estimated mean for the morning class is 71. 375 and the estimated
mean for the evening class is 70. 5.

[2 marks]

35. [Maximum mark: 6]


A group of 130 applicants applied for admission into either the Arts programme or the Sciences programme at a university. The
outcomes of their applications are shown in the following table.

(a) Find the probability that a randomly chosen applicant from this group was accepted by the university. [1]

Markscheme

((17 + 25)/130 =) 42/130 (21/65, 0.323076…) A1

[1 mark]

An applicant is chosen at random from this group. It is found that they were accepted into the programme of their choice.

(b) Find the probability that the applicant applied for the Arts programme. [2]

Markscheme

(17/(17 + 25) =) 17/42 (0.404761…) A1A1

Note: Award A1 for correct numerator and A1 for correct denominator.

Award A1A0 for working of (17/130) / (their answer to (a)) if followed by an incorrect answer.

[2 marks]

(c) Two different applicants are chosen at random from the original group.

Find the probability that both applicants applied to the Arts programme. [3]

Markscheme

41/130 × 40/129 A1M1

Note: Award A1 for two correct fractions seen, M1 for multiplying their fractions.

= 1640/16770 ≈ 0.0978 (0.0977936…, 164/1677) A1

[3 marks]

36. [Maximum mark: 7]


A polygraph test is used to determine whether people are telling the truth or not, but it is not completely accurate. When a person
tells the truth, they have a 20% chance of failing the test. Each test outcome is independent of any previous test outcome.

10 people take a polygraph test and all 10 tell the truth.

(a) Calculate the expected number of people who will pass this polygraph test. [2]

Markscheme

(E(X) =) 10 × 0. 8 (M1)

8 (people) A1

[2 marks]

(b) Calculate the probability that exactly 4 people will fail this polygraph test. [2]

Markscheme

recognition of binomial probability (M1)

0. 0881 (0. 0880803 …) A1

[2 marks]

(c) Determine the probability that fewer than 7 people will pass this polygraph test. [3]

Markscheme

0. 8 and 6 seen OR 0. 2 and 3 seen (A1)

attempt to use binomial probability (M1)

0. 121 (0. 120873 …) A1

[3 marks]
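All three polygraph answers come from a binomial model with n = 10. The sketch below defines a small helper (`binom_pmf`, a hypothetical name) rather than relying on any external library.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p_pass = 10, 0.8   # a truthful person passes with probability 1 - 0.2

# (a) expected number of passes
expected_pass = n * p_pass

# (b) exactly 4 people fail (failure probability 0.2)
p_four_fail = binom_pmf(4, n, 1 - p_pass)

# (c) fewer than 7 pass, i.e. 0 to 6 passes
p_fewer_than_7_pass = sum(binom_pmf(k, n, p_pass) for k in range(7))

print(expected_pass, round(p_four_fail, 4), round(p_fewer_than_7_pass, 3))
```

Part (c) could equally be computed as "more than 3 fail" with p = 0.2, which is the second option in the (A1) line.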
37. [Maximum mark: 5]
The masses of Fuji apples are normally distributed with a mean of 163 g and a standard deviation of 6. 83 g.

When Fuji apples are picked, they are classified as small, medium, large or extra large depending on their mass. Large apples have a
mass of between 172 g and 183 g.

(a) Determine the probability that a Fuji apple selected at random will be a large apple. [2]

Markscheme

sketch of normal curve with shaded region to the right of the mean and correct values (M1)

0. 0921 (0. 0920950 …) A1

[2 marks]

Approximately 68% of Fuji apples have a mass within the medium-sized category, which is between k and 172 g.

(b) Find the value of k. [3]

Markscheme

EITHER

(P(x < 172))

0. 906200 … (A1)

(0. 906200 … − 0. 68)

0. 226200 … (A1)

OR

(P(163 < x < 172))

0. 406200 … (A1)

0. 5 − (0. 68 − 0. 406200 …) OR 0. 5 + (0. 68 − 0. 406200 …)

0. 226200 … OR 0. 773799 … (A1)

OR
(A1)(A1)

Note: Award A1 for a normal distribution curve with a vertical line on each side of the mean and a correct probability of either
0. 406 or 0. 274 or 0. 906 shown, A1 for a probability of 0. 226 seen.

THEN

(k =) 158 g (157. 867 … g) A1

[3 marks]
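The first EITHER route in part (b) can be followed step by step in code. A minimal sketch with the standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

apple_mass = NormalDist(mu=163, sigma=6.83)

# (a) P(172 < X < 183): a large apple
p_large = apple_mass.cdf(183) - apple_mass.cdf(172)

# (b) P(k < X < 172) = 0.68, so P(X < k) = P(X < 172) - 0.68
p_below_172 = apple_mass.cdf(172)
p_below_k = p_below_172 - 0.68
k = apple_mass.inv_cdf(p_below_k)

print(round(p_large, 4), round(p_below_k, 4), round(k))
```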

38. [Maximum mark: 5]


In a city, 32% of people have blue eyes. If someone has blue eyes, the probability that they also have fair hair is 58%. This
information is represented in the following tree diagram.

(a) Write down the value of a. [1]

Markscheme

a = 0. 42 A1

[1 mark]

(b) Find an expression, in terms of b, for the probability of a person not having blue eyes and having fair hair. [1]

Markscheme

(P(B′∩F ) =) b × 0. 68 A1
[1 mark]

It is known that 41% of people in this city have fair hair.

Calculate the value of

(c.i) b. [2]

Markscheme

0. 32 × 0. 58 + 0. 68b = 0. 41 (M1)

Note: Award (M1) for setting up equation for fair-haired or equivalent.

b = 0. 33 A1

[2 marks]

(c.ii) c . [1]

Markscheme

c = 0. 67 A1

[1 mark]

39. [Maximum mark: 6]


Eduardo believes that there is a linear relationship between the age of a male runner and the time it takes them to run 5000
metres.

To test this, he recorded the age, x years, and the time, t minutes, for eight males in a single 5000 m race. His results are presented
in the following table and scatter diagram.
(a) For this data, find the value of the Pearson’s product-moment correlation coefficient, r. [2]

Markscheme

r = 0. 933 (0. 933419 …) A2

[2 marks]

Eduardo looked in a sports science text book. He found that the following information about r was appropriate for athletic
performance.

(b) Comment on your answer to part (a), using the information that Eduardo found. [1]

Markscheme

strong A1

Note: Answer may include “positive”, however this is not necessary for the mark.

[1 mark]

(c) Write down the equation of the regression line of t on x, in the form t = ax + b. [1]

Markscheme

t = 0. 228x + 24. 3 (t = 0. 227703 … x + 24. 3153 …) A1


Note: Condone y in place of t. Answer must be an equation.

[1 mark]

(d) A 57-year-old male also ran in the 5000 m race.

Use the equation of the regression line to estimate the time he took to complete the 5000 m race. [2]

Markscheme

(t =) 0. 227703 … × 57 + 24. 3153 … (M1)

Note: Award (M1) for correct substitution into their regression line.

(t =) 37. 3 minutes (37. 2944) A1

Note: Accept 37. 1 and 37. 4 from use of 2sf and/or 3sf values.

[2 marks]

40. [Maximum mark: 8]


A group of 120 students sat a history exam. The cumulative frequency graph shows the scores obtained by the students.
(a) Find the median of the scores obtained. [1]

Markscheme

75 A1

[1 mark]

The students were awarded a grade from 1 to 5, depending on the score obtained in the exam. The number of students receiving
each grade is shown in the following table.

(b) Find an expression for a in terms of b. [2]

Markscheme

recognition that all entries add up to 120 (M1)

a = 120 − 6 − 13 − 26 − b OR a = 75 − b A1

[2 marks]

The mean grade for these students is 3. 65.


(c.i) Find the number of students who obtained a grade 5. [3]

Markscheme

(6×1 + 13×2 + 26×3 + (75 − b)×4 + b×5) / 120 = 3.65 (M1)(A1)

Note: Award (M1) for attempt to substitute into mean formula, LHS expression is sufficient for the M mark. Award (A1) for correct
substitutions in one variable OR in two variables, followed by evidence of solving simultaneously with a + b = 75.

(b =) 28 A1

[3 marks]

(c.ii) Find the minimum score needed to obtain a grade 5. [2]

Markscheme

120− their part (c)(i) seen (e.g. 92 indicated on graph) (M1)

84 A1

[2 marks]

41. [Maximum mark: 5]


Arriane has geese on her farm. She claims the mean weight of eggs from her black geese is less than the mean weight of eggs from
her white geese.

She recorded the weights of eggs, in grams, from a random selection of geese. The data is shown in the table.

In order to test her claim, Arriane performs a t-test at a 10% level of significance. It is assumed that the weights of eggs are normally
distributed and the samples have equal variances.

(a) State, in words, the null hypothesis. [1]

Markscheme

EITHER

H0 : The population mean weight of eggs from (her/the) black geese is equal to/the same as the population mean weight of eggs from (her/the) white geese.

OR

H0 : The population mean weight of eggs from (her/the) black geese is not less than the population mean weight of eggs from (her/the) white geese. A1
Note: Reference to the "population mean weight" must be explicit for the A1 to be awarded. The term “population” can be
implied by use of “all” or “on average” or “generally” when relating to the weight of eggs e.g. “the mean weight of eggs for all
(her/the) black geese”.
Award A0 if reference is made to the mean weights from the sample or the table.
Award A0 for a null hypothesis written in symbolic form.

[1 mark]

(b) Calculate the p-value for this test. [2]

Markscheme

p-value = 0. 177 (0. 176953 …) A2

Note: Award A1 for an answer of 0. 18221 …, from “unpooled” settings on GDC.

[2 marks]

(c) State whether the result of the test supports Arriane’s claim. Justify your reasoning. [2]

Markscheme

0. 177 > 0. 1 R1

(insufficient evidence to reject H0)

Arriane’s claim is not supported by the evidence A1

Note: Accept p > 0. 1 or p > significance level provided p is explicitly seen in part (b). Award A1 only if reference is specifically
made to Arriane's claim.
Do not award R0A1.

[2 marks]

42. [Maximum mark: 7]


A game is played where two unbiased dice are rolled and the score in the game is the greater of the two numbers shown. If the two
numbers are the same, then the score in the game is the number shown on one of the dice. A diagram showing the possible
outcomes is given below.
Let T be the random variable “the score in a game”.
(a) Complete the table to show the probability distribution of T .

[2]

Markscheme

A2

Note: Award A1 if three to five probabilities are correct.

[2 marks]

Find the probability that

(b.i) a player scores at least 3 in a game. [1]

Markscheme

32/36 (8/9, 0.888888…, 88.9%) (A1)

[1 mark]

(b.ii) a player scores 6, given that they scored at least 3. [2]

Markscheme

use of conditional probability (M1)

e.g. denominator of 32 OR denominator of 0. 888888 …, etc.


11/32 (0.34375, 34.4%) A1

[2 marks]

(c) Find the expected score of a game. [2]

Markscheme

(1×1 + 3×2 + 5×3 + … + 11×6) / 36 (M1)

= 161/36 (4 17/36, 4.47, 4.47222…) A1

[2 marks]
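Since the outcome diagram and the completed table are not reproduced in this text, the distribution of T can be rebuilt by enumerating all 36 equally likely rolls. A minimal sketch; the names are illustrative.

```python
from collections import Counter
from fractions import Fraction as F
from itertools import product

# Score T is the greater of the two numbers shown (or the common number on a double).
counts = Counter(max(a, b) for a, b in product(range(1, 7), repeat=2))
dist = {t: F(c, 36) for t, c in sorted(counts.items())}

# (b.i) P(T >= 3)
p_at_least_3 = sum(pr for t, pr in dist.items() if t >= 3)

# (b.ii) P(T = 6 | T >= 3)
p_6_given_at_least_3 = dist[6] / p_at_least_3

# (c) expected score
expected_score = sum(t * pr for t, pr in dist.items())

print(dict(counts), p_at_least_3, p_6_given_at_least_3, expected_score)
```

The counts 1, 3, 5, 7, 9, 11 are the numerators that appear in the expected-value working for part (c).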

43. [Maximum mark: 4]


Deb used a thermometer to record the maximum daily temperature over ten consecutive days. Her results, in degrees Celsius (°C),
are shown below.

14, 15, 14, 11, 10, 9, 14, 15, 16, 13

For this data set, find the value of

(a) the mode. [1]

Markscheme

14 A1

[1 mark]

(b) the mean. [2]

Markscheme

(14 + 15 + …) / 10 (M1)

= 13.1 A1

[2 marks]

(c) the standard deviation. [1]

Markscheme

2. 21 (2. 21133 …) A1
[1 mark]
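The three summary statistics can be reproduced with Python's `statistics` module. The sketch assumes the population standard deviation is intended (the GDC's σx), which is what matches the markscheme value of 2.21.

```python
from statistics import mean, mode, pstdev

temps = [14, 15, 14, 11, 10, 9, 14, 15, 16, 13]   # maximum daily temperatures in °C

temp_mode = mode(temps)
temp_mean = mean(temps)
temp_sd = pstdev(temps)   # population standard deviation, not the sample version

print(temp_mode, temp_mean, round(temp_sd, 2))
```

Using `stdev` instead of `pstdev` would divide by n − 1 and give a larger value, so the choice of function matters here.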

44. [Maximum mark: 6]


A newspaper vendor in Singapore is trying to predict how many copies of The Straits Times they will sell. The vendor forms a model to
predict the number of copies sold each weekday. According to this model, they expect the same number of copies will be sold each
day.

To test the model, they record the number of copies sold each weekday during a particular week. This data is shown in the table.

A goodness of fit test at the 5% significance level is used on this data to determine whether the vendor’s model is suitable.

The critical value for the test is 9. 49 and the hypotheses are

H0 : The data satisfies the model.


H1 : The data does not satisfy the model.

(a) Find an estimate for how many copies the vendor expects to sell each day. [1]

Markscheme

(74 + 97 + 91 + 86 + 112) / 5 = 92 A1

[1 mark]

(b.i) Write down the degrees of freedom for this test. [1]

Markscheme

4 A1

[1 mark]

(b.ii) Write down the conclusion to the test. Give a reason for your answer. [4]

Markscheme

χ²_calc = 8.54 (8.54347…) OR p-value = 0.0736 (0.0735802…) A2

8.54 < 9.49 OR 0.0736 > 0.05 R1

therefore there is insufficient evidence to reject H0 A1

(i.e. the data satisfies the model)

Note: Do not award R0A1. Accept “accept” or “do not reject” in place of “insufficient evidence to reject”.
Award the R1 for comparing their p-value with 0.05 or their χ² value with 9.49 and then FT their final conclusion.
[4 marks]
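The goodness of fit calculation can be carried out by hand from the five observed sales figures that appear in the working for part (a). The sketch below uses the closed form of the chi-squared upper tail for 4 degrees of freedom, exp(−x/2)(1 + x/2), instead of a statistics library; the variable names are illustrative.

```python
import math

observed = [74, 97, 91, 86, 112]          # copies sold on each weekday
expected = sum(observed) / len(observed)  # the model predicts equal daily sales

# Chi-squared statistic: sum of (O - E)^2 / E
chi2_calc = sum((o - expected) ** 2 / expected for o in observed)
df = len(observed) - 1

# Upper-tail probability; this closed form is valid only for 4 degrees of freedom
p_value = math.exp(-chi2_calc / 2) * (1 + chi2_calc / 2)

reject_h0 = chi2_calc > 9.49              # critical value given in the question
print(expected, df, round(chi2_calc, 2), round(p_value, 4), reject_h0)
```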

45. [Maximum mark: 6]


At Springfield University, the weights, in kg, of 10 chinchilla rabbits and 10 sable rabbits were recorded. The aim was to find out
whether chinchilla rabbits are generally heavier than sable rabbits. The results obtained are summarized in the following table.

A t-test is to be performed at the 5% significance level.

(a) Write down the null and alternative hypotheses. [2]

Markscheme

(let μc = population mean for chinchilla rabbits, μs = population mean for sable rabbits)

H0 : μc = μs A1

H1 : μc > μs A1

Note: Accept an equivalent statement in words, must include mean and reference to “population mean” / “mean for all
chinchilla rabbits” for the first A1 to be awarded.
Do not accept an imprecise “the means are equal”.

[2 marks]

(b) Find the p-value for this test. [2]

Markscheme

p-value = 0. 0408 (0. 0408065 …) A2

Note: Award A1 for an answer of 0. 041565 …, from “unpooled” settings on GDC.

[2 marks]

(c) Write down the conclusion to the test. Give a reason for your answer. [2]

Markscheme

0.0408 < 0.05 R1

(there is sufficient evidence to) reject (or not accept) H0 A1

(there is sufficient evidence to suggest that chinchilla rabbits are heavier than sable rabbits)

Note: Do not award R0A1. Accept ‘accept H1’.

[2 marks]

46. [Maximum mark: 5]


The number of sick days taken by each employee in a company during a year was recorded. The data was organized in a box and
whisker diagram as shown below:

For this data, write down

(a.i) the minimum number of sick days taken during the year. [1]

Markscheme

2 A1

[1 mark]

(a.ii) the lower quartile. [1]

Markscheme

6 A1

[1 mark]

(a.iii) the median. [1]

Markscheme

8 A1

[1 mark]

(b) Paul claims that this box and whisker diagram can be used to infer that the percentage of employees who took
fewer than six sick days is smaller than the percentage of employees who took more than eleven sick days.

State whether Paul is correct. Justify your answer. [2]


Markscheme

EITHER

Each of these percentages represents approximately 25% of the employees. R1

OR

The diagram is not explicit enough to show what is happening at the quartiles regarding 6 and 11 / we do not have the data
points R1

OR

Discrete data not clear how to interpret “fewer”. R1

THEN

Hence, Paul is not correct (OR no such inference can be made). A1

Note: Do not award R0A1.

[2 marks]

47. [Maximum mark: 19]


Xavie conducted a study to see if there is a relationship between the price of an apartment, y, and its distance, x, from the city
centre of Melbourne.

They took a random sample of six typical apartments along a train line in the city. Xavie obtained the data shown in the following
table.

A plot of these data is seen in the following graph.


(a) Write down the value of the Spearman’s rank correlation coefficient, r_s. [1]

Markscheme

r_s = −1 A1

[1 mark]

(b.i) Find the Pearson’s product-moment correlation coefficient, r. [2]

Markscheme

r = −0. 979 (−0. 979191 …) A2

[2 marks]

(b.ii) Use your value of r to state which two of the following would best describe the correlation between the variables.

[2]

Markscheme

strong AND negative A1A1

Note: Award at most A1A0 if additional answers are seen.

Due to the demand of the question, do not accept “negative (from the graph)” if their r value is positive.

[2 marks]

The relationship between the variables can be modelled by the regression equation y = ax + b .
(c.i) Write down the value of a. [1]

Markscheme

a = −0. 0992 (a = −0. 0992075 …) A1

[1 mark]

(c.ii) Write down the value of b. [1]

Markscheme

b = 3. 19 (b = 3. 19150 …) A1

[1 mark]

(c.iii) According to this model, state in context what the value of b represents. [1]

Markscheme

b represents the (typical) price of an apartment in the centre (of the city) A1

Note: To award the A1, some reference to “centre” or “zero distance from the city” needs to be seen.

[1 mark]

(d) Xavie uses the regression equation to estimate the price of a typical apartment located 19. 6 km from the city
centre.

(d.i) Find this estimated price. [3]

Markscheme

attempt to substitute 19. 6 for x (M1)

y = −0. 0992075 … × 19. 6 + 3. 19150 …

= 1. 25 (1. 24704 …) A1

price = 1. 25 million (AUD) (1. 24704 … million) A1

[3 marks]
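The substitution in part (d)(i) can be checked with a short Python sketch. It assumes the truncated coefficients quoted in parts (c)(i) and (c)(ii); the function name `estimated_price` is illustrative only and does not come from the markscheme.

```python
# Regression coefficients as quoted (truncated) in parts (c)(i) and (c)(ii).
a = -0.0992075  # gradient: change in price (million AUD) per km from the centre
b = 3.19150     # intercept: modelled price (million AUD) at the city centre


def estimated_price(distance_km: float) -> float:
    """Price in millions of AUD predicted by the model y = ax + b."""
    return a * distance_km + b


price = estimated_price(19.6)
print(round(price, 2))  # price rounded to 2 decimal places, in millions of AUD
```

Because 19.6 km is used here as a value inside the sampled range (the markscheme credits "interpolation" in part (d)(ii)), the same function should not be trusted for distances well outside the data.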

(d.ii) State two reasons that Xavie might use to justify the validity of this estimate. [2]

Markscheme

interpolation R1

strong correlation. R1

[2 marks]

To verify whether this relationship applies in a different direction from the city centre, Xavie considers two locations, A and B, both
an equal distance from the city centre. They take a random sample of seven apartments from each location and record the prices
(in millions of dollars) in the following tables.

Xavie conducts a t-test, at the 5 % level of significance, to see if the mean apartment price in location A is different to the mean
apartment price in location B. They assume the population variances are the same.

For this test, Xavie takes the null hypothesis to be μ A = μB .

(e) Write down the alternative hypothesis. [1]

Markscheme

μA ≠ μB A1

[1 mark]

(f ) Find the p-value for this test. [2]

Markscheme

p = 0. 0224 (0. 0223977 …) A2

Note: Award A1 for 0. 022 (2sf )

Award A1 for an answer of p = 0. 0265 (0. 0265017 …) , from use of unpooled GDC settings.

[2 marks]

(g) State the conclusion of the test. Justify your answer. [2]

Markscheme

0. 0223977 … < 0. 05 R1

(there is sufficient evidence to) reject the null hypothesis A1

Note: Do not award R0A1.

[2 marks]

(h) State one additional assumption Xavie has made about the distributions to conduct this test. [1]

Markscheme
(the two populations are) normally distributed A1

Note: Do not accept “independent” as that applies to the samples, not the populations.

[1 mark]

48. [Maximum mark: 12]


A type of generator will only function if a particular switch is working. The generator has a main switch, A, and a ‘back up’ switch, B.

The manufacturer claims the probability of switch A failing within one month of being fitted is 0. 1 and the probability of the
cheaper switch B failing within one month is 0. 3. Whether or not a switch fails is independent of the state of the other switch.

If both switches fail, the generator needs to shut down to replace the switches. Both switches are replaced after a month of use
(whether they have failed or not) or whenever the generator needs to be shut down.

The following tree diagram shows the probabilities of a switch failing within one month of them both being replaced, assuming
the manufacturer’s claim is correct.

(a) Write down the values of a, b and c. [2]

Markscheme

$a = 0.9$, $b = 0.3$ and $c = 0.7$ A2

Note: Award A1A0 if one of the values is incorrect, A0A0 otherwise.

[2 marks]

(b) Hence find the probability that the generator needs to shut down within one month of the switches being
replaced. [1]

Markscheme

(0. 1 × 0. 3 =) 0. 03 A1
[1 mark]

The owner of the generator is suspicious of the switch manufacturer’s claims, so they look back through the past 200 occasions
when the switches were replaced. The records show whether no switches, one switch or two switches had failed.

The data the owner collected are shown in the following table.

(c) Show that the expected value of no switches failing in the generator, during the last 200 occasions when the
switches were replaced, is 126. [2]

Markscheme

P (no fail)= 0. 63 A1

multiplying by 200 M1

= 126 AG

Note: Award A0M0 for a flawed approach to find P(no fail) = 0.63, e.g. $\frac{126}{200} = 0.63$, which is reverse engineering.

[2 marks]

(d) Perform a $\chi^2$ goodness of fit test at the 5 % significance level to test whether the manufacturer’s claims are correct using the following hypotheses.

H0 : The manufacturer’s claims are correct.


H1 : The manufacturer’s claims are not both correct. [7]

Markscheme

EITHER

attempt to find probability one switch failing (M1)

P (one failing)= 0. 34 (A1)

OR

expected value for two switches failing = 6 (A1)

expected value for one switch failing = 200 − 126 − 6 (M1)

THEN

(A1)

degrees of freedom = 2 (A1)

Note: Award A1 for df = 2 seen anywhere and may be awarded independent of the M1 mark.
The df cannot be implied from chi sq statistic = 3. 40989

p-value = 0.182 (0.181781…) A1

0. 182 > 0. 05 R1

hence insufficient evidence to reject $H_0$ (that the manufacturer’s claims are correct) A1

Note: The R1A1 can be awarded as follow through within part (d) from their (explicitly labelled) incorrect p-value.

An unrealistic p-value (p ≥ 1) should preclude awarding the final R1A1.

Accept either a conclusion to not reject the null hypothesis or a statement that the manufacturer’s claims are correct.

Do not award R0A1.

[7 marks]
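The expected frequencies and the p-value in part (d) can be reproduced with a minimal Python sketch. It takes the χ² statistic 3.40989 quoted in the note above as given (the observed frequencies are in a table that is not reproduced here), and it relies on the fact that for 2 degrees of freedom the upper-tail probability of a chi-squared variable is exp(−x/2). Variable names are illustrative.

```python
import math

# Manufacturer's claimed failure probabilities within one month.
p_a_fail, p_b_fail = 0.1, 0.3
occasions = 200

# Independent switches: multiply along the branches of the tree diagram.
p_none = (1 - p_a_fail) * (1 - p_b_fail)  # neither switch fails
p_both = p_a_fail * p_b_fail              # both fail (generator shuts down)
p_one = 1 - p_none - p_both               # exactly one fails

# Expected frequencies for 0, 1 and 2 failed switches.
expected = [occasions * p for p in (p_none, p_one, p_both)]

# Degrees of freedom = number of categories - 1.
df = len(expected) - 1

# Statistic as quoted in the markscheme note; for df = 2 the
# chi-squared survival function has the closed form exp(-x / 2).
chi2_stat = 3.40989
p_value = math.exp(-chi2_stat / 2)

print([round(e, 6) for e in expected], df, round(p_value, 3))
```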

49. [Maximum mark: 16]


In a given week, the number of students in a particular primary school that were absent due to headlice (H ), influenza (I ) and/or
chickenpox (C) were recorded as follows.

The primary school has 500 students.

35 students had headlice only


20 students had influenza only
5 students had chickenpox only
4 students had headlice and influenza but not chickenpox
2 students had headlice and chickenpox but not influenza
3 students had influenza and chickenpox but not headlice
1 student had headlice, influenza and chickenpox

(a) Draw a Venn diagram to represent this information. [3]

Markscheme

A3

Note: Award A1 for 1 in correct place, A1 for 3, 2 and 4 correct, A1 for 35, 20 and 5 correct. Award at most A0A1A1 if the rectangle
is omitted. Condone the omission of the 430, as explicitly asked for in part (b).

[3 marks]

(b) Calculate the number of students who did not have headlice or influenza or chickenpox. [2]

Markscheme

35 + 4 + 20 + 3 + 1 + 2 + 5 (M1)

430 A1

Note: The 430 may be seen in the Venn diagram.

[2 marks]

A student is chosen at random from all the students in the school.

(c) Find the probability that this student has

(c.i) headlice. [2]

Markscheme

$\frac{42}{500}$ ($\frac{21}{250}$, 0.084, 8.4 %) A2

Note: Award A1 for numerator, A1 for denominator.

[2 marks]

(c.ii) influenza given that the student has headlice. [2]

Markscheme

$\frac{5}{42}$ (0.119047…, 11.9 %) A2

Note: A1 for numerator, A1 for denominator.

The first A1 can be awarded for an attempt to use conditional probability with 0. 084 on the denominator.

[2 marks]
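The counts behind parts (b) and (c) can be checked by treating each Venn region as a set of letters. This is only a sketch; the dictionary layout and the helper name `count_with` are assumptions made for illustration.

```python
from fractions import Fraction

# Each key is the exact set of conditions for one Venn region
# (H = headlice, I = influenza, C = chickenpox); values are student counts.
regions = {
    frozenset("H"): 35,
    frozenset("I"): 20,
    frozenset("C"): 5,
    frozenset("HI"): 4,
    frozenset("HC"): 2,
    frozenset("IC"): 3,
    frozenset("HIC"): 1,
}
total_students = 500


def count_with(*conditions: str) -> int:
    """Number of students whose region includes all the given conditions."""
    return sum(n for region, n in regions.items() if set(conditions) <= region)


none_of_them = total_students - sum(regions.values())                      # part (b)
p_headlice = Fraction(count_with("H"), total_students)                     # part (c)(i)
p_flu_given_headlice = Fraction(count_with("H", "I"), count_with("H"))     # part (c)(ii)

print(none_of_them, p_headlice, p_flu_given_headlice)
```

Using `Fraction` keeps the answers exact, so the unsimplified and simplified forms accepted by the markscheme compare as equal.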

Diego is a teacher in the school. He believes that the number of students, n, who have had influenza during the first t days of the
school year, can be modelled by the function

$n(t) = 250 - 240(2)^{kt}$, $k \in \mathbb{R}$.

(d) Use Diego’s model to calculate the number of students who started the school year with influenza. [2]

Markscheme

substituting t = 0 into given expression (M1)

10 A1
[2 marks]

It is known that 130 students have had influenza during the first 10 days of the school year.

(e) Find the value of k. [2]

Markscheme

$130 = 250 - 240(2)^{10k}$ (M1)

k = −0. 1 A1

[2 marks]

(f ) Using this model, calculate how many days it will take for 200 students to have had influenza since the start of the
school year. [2]

Markscheme

$200 = 250 - 240(2)^{-0.1t}$ (M1)

t = 22. 6 (22. 6303 … , 23) (days) A1

[2 marks]

By the last day of the school year, it is known that 300 students have had influenza.

(g) Comment on the appropriateness of Diego’s model. [1]

Markscheme

EITHER
model does not predict n to go above 250 / reach 300 A1

OR
$250 - 240 \times 2^{-0.1 \times 365} = 250$ so does not reach 300 A1

OR
there is no solution to n(t) = 300 A1

OR
correct sketch graph, with 250 and/or 300 labelled, and a supporting comment A1

THEN
hence Diego’s model is not appropriate.

Note: Do not credit reasoning based on selecting arbitrary high values of t and finding the associated n value.

[1 mark]
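Parts (d) to (g) can be worked through numerically. The sketch below assumes the model is n(t) = 250 − 240·2^(kt), which is consistent with the answers given for t = 0 and t = 10; the rearrangements use base-2 logarithms, and the variable names are illustrative.

```python
import math


def n(t: float, k: float) -> float:
    """Diego's model: students who have had influenza after t days."""
    return 250 - 240 * 2 ** (k * t)


# (d) At t = 0 the power is 2**0 = 1 whatever k is.
initial = n(0, k=0.0)

# (e) 130 = 250 - 240 * 2**(10k)  =>  2**(10k) = 120 / 240
k = math.log2((250 - 130) / 240) / 10

# (f) 200 = 250 - 240 * 2**(k t)  =>  t = log2(50 / 240) / k
t_200 = math.log2((250 - 200) / 240) / k

# (g) 2**(k t) is always positive, so n(t) stays below 250 and
#     can never reach 300, even after a full year.
n_after_year = n(365, k)

print(initial, k, round(t_200, 4), n_after_year < 250)
```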

50. [Maximum mark: 19]


A recent study found that the heights of Dutch women can be modelled by a normal distribution with mean 170. 7 cm and
standard deviation 6. 3 cm.

A Dutch woman is chosen at random.

(a) Calculate the probability that her height is

(a.i) less than 160 cm. [2]

Markscheme

P (X < 160) OR labelled sketch of region OR calc syntax with correct bounds (M1)

Note: Accept either zero or a large negative value as the lower bound.

= 0. 0447 (0. 0447149 … , 4. 47 %) A1

[2 marks]

(a.ii) between 160 cm and 170 cm. [2]

Markscheme

P (160 < X < 170) OR labelled sketch of region OR calc syntax with correct bounds (M1)

= 0. 411 (0. 411049 … , 41. 1 %) A1

Note: Award A0A2 for answers of 0. 045 and 0. 41 both given to 2 sf.

[2 marks]

27 % of Dutch women have a height of more than h metres.

(b) Calculate the value of h.

[2]

Markscheme

P (X > h) = 0. 27 OR labelled sketch of region OR calc syntax with correct bounds (M1)

(h =) 1.75 (m) (1.74560…) A1

Note: Accept 175 (cm).

[2 marks]
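The three normal-distribution answers in parts (a) and (b) can be reproduced with `statistics.NormalDist` from the Python standard library in place of a GDC. This is a sketch for checking only; the variable names are illustrative, and heights are kept in centimetres until the final conversion.

```python
from statistics import NormalDist

# Heights of Dutch women in cm, as stated in the question.
heights = NormalDist(mu=170.7, sigma=6.3)

# (a)(i) P(X < 160)
p_below_160 = heights.cdf(160)

# (a)(ii) P(160 < X < 170)
p_160_to_170 = heights.cdf(170) - heights.cdf(160)

# (b) P(X > h) = 0.27 is the same as P(X < h) = 0.73.
h_cm = heights.inv_cdf(1 - 0.27)
h_m = h_cm / 100  # the question asks for h in metres

print(round(p_below_160, 4), round(p_160_to_170, 3), round(h_m, 2))
```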

Janneke selects a random sample of 200 Dutch women from Amsterdam and measures their heights. She wants to determine
whether this sample could have been chosen from a normally distributed population with mean of 170. 7 cm and standard
deviation of 6. 3 cm.

She performs a $\chi^2$ goodness of fit test at the 5 % significance level. She begins by creating the following frequency table.

(c) Calculate, correct to four significant figures, the value of

(c.i) a . [1]

Markscheme

82. 21 A1

[1 mark]

(c.ii) b. [1]

Markscheme

94. 86 A1

Note: Follow through from an incorrect part (c)(i) if fourth value is found by subtracting first three values from 200. Award at
most A0A1 if both answers are not given to four significant figures.

Award A0A1 for an answer of a = 82. 2 and b = 94. 8 .

[1 mark]

The hypotheses for Janneke’s test are

H0 : the heights are drawn from a normally distributed population with mean 170. 7 cm and standard deviation 6. 3 cm

H1 : the heights are not drawn from a normally distributed population with mean 170. 7 cm and standard deviation 6. 3 cm

(d) Write down the degrees of freedom for this test. [1]

Markscheme

3 A1

[1 mark]

The critical value for this test is 7. 815.

(e) Perform the $\chi^2$ goodness of fit test and state your conclusion, justifying your reasoning. [4]

Markscheme

p-value = 0.616 (0.615583…) OR $\chi^2 = 1.80$ (1.79702…) A2

Note: Award A1A0 if the p-value or $\chi^2$-value is given correct to 2 dp.

0. 615583 … > 0. 05 OR 1. 79702 … < 7. 815 R1

EITHER
fail to reject the null hypothesis A1

OR
the heights are normally distributed with mean 170. 7 cm and standard deviation 6. 3 cm A1

Note: Do not award R0A1. Condone “accept” in place of “fail to reject”.

The R1A1 can be awarded as follow through within part (e) from their (explicitly labelled) p-value or $\chi^2$-value. Accept comparison in words.

[4 marks]
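The link between the quoted statistic and the quoted p-value in part (e) can be sketched without a statistics package. For 3 degrees of freedom the chi-squared upper-tail probability has a closed form in terms of the complementary error function; the helper name `chi2_sf_df3` is hypothetical, and the statistic is taken as given because the observed frequency table is not reproduced here.

```python
import math


def chi2_sf_df3(x: float) -> float:
    """P(X > x) for a chi-squared variable with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


chi2_stat = 1.79702      # statistic quoted in the markscheme
critical_value = 7.815   # critical value given in the question
significance = 0.05

p_value = chi2_sf_df3(chi2_stat)

# The two decision rules are equivalent and should agree.
reject_by_p_value = p_value < significance
reject_by_critical_value = chi2_stat > critical_value

print(round(p_value, 3), reject_by_p_value, reject_by_critical_value)
```

Both comparisons lead to the same decision, which is why the markscheme accepts either for the R1.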

Gundega claims that, on average, Latvian women are taller than Dutch women.

Random samples of 10 Latvian women and 10 Dutch women are chosen, and their heights are measured.

Gundega performs a t-test at the 5 % significance level. It is assumed that the populations are normally distributed and have equal
variances.

(f ) Write down the null and alternative hypotheses for this test. [2]

Markscheme

EITHER

H0 : μL = μD A1

H1 : μL > μD A1

OR
H0 : The (population) mean height of Latvian women is equal to the (population) mean height of Dutch women A1
H1 : The (population) mean height of Latvian women is greater than the (population) mean height of Dutch women A1

Note: Award at most A0A1 if the hypotheses explicitly refer to the “sample” and not the population. For $H_0: m_1 = m_2$ and $H_1: m_1 > m_2$ award A0A1.

[2 marks]

(g) Perform the t-test and state the conclusion, justifying your reasoning. [4]

Markscheme

p-value = 0.673 (0.673205…) A2

Note: In this question the p-value is the same 3 sf value for unpooled GDC settings so will be awarded A2.

If using a two-tailed test, the answer is p-value= 0. 654 (0. 653589 …); award A1 if alternative hypothesis was correct or A2
if it follows through correctly from their alternative hypothesis (i.e. two-tailed test was penalized in part (f )).

0. 673205 > 0. 05 R1
fail to reject the null hypothesis (Gundega is not correct) A1

Note: Do not award R0A1. Condone “accept” in place of “fail to reject”.

The R1A1 can be awarded as follow through within part (g) from their (explicitly labelled) p-value. Accept comparison in
words.

[4 marks]

51. [Maximum mark: 18]


The running times, t (minutes), of 200 family movies are recorded in the following table.

(a.i) Write down the mid-interval value of 70 ≤ t < 80 . [1]

Markscheme

75 (minutes) A1

[1 mark]

(a.ii) Calculate an estimate of the mean running time of the 200 movies. [2]

Markscheme
attempt to substitute values in the mean formula with at least one mid-interval value multiplied by a corresponding
frequency (M1)

(mean =) 88. 2 (88. 15) (minutes) A1

[2 marks]

This table is used to create the following cumulative frequency graph.

(b) Use the cumulative frequency curve to estimate the interquartile range. [2]

Markscheme

91.5 OR 84 seen (A1)

Note: These values may be seen in the working for part (c).

(IQR = 91. 5 − 84 =) 7. 5 ( minutes) A1

[2 marks]

“Star Feud” is a movie in the data set and its running time is 100 minutes.

(c) Use your answer to part (b) to estimate whether “Star Feud’s” running time is an outlier for this data. Justify your
answer. [3]

Markscheme

(upper bound =) 91. 5 + 1. 5 × 7. 5 OR 102. 75 seen A1

102. 75 > 100 OR 100 − 91. 5 < 11. 25 OR 100 − 11. 25 < 91. 5 R1

Star Feud is not an outlier A1

Note: Do not award R0A1.


[3 marks]
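The outlier test in part (c) follows the usual rule that a value is an upper outlier if it lies more than 1.5 × IQR above the upper quartile. A minimal sketch, using the quartiles read from the cumulative frequency curve in part (b); the function name `upper_fence` is illustrative.

```python
def upper_fence(q1: float, q3: float) -> float:
    """Largest value that is not an upper outlier: Q3 + 1.5 * IQR."""
    return q3 + 1.5 * (q3 - q1)


q1, q3 = 84, 91.5          # quartiles estimated in part (b)
fence = upper_fence(q1, q3)

star_feud = 100            # running time in minutes
is_outlier = star_feud > fence

print(fence, is_outlier)
```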

It is believed that the running times of family movies follow a normal distribution with mean 88 minutes and standard deviation
6. 75 minutes.

It is decided to perform a $\chi^2$ goodness of fit test on the data to determine whether this sample of 200 movies could have plausibly been drawn from an underlying distribution $N(88, 6.75^2)$.

(d) Write down the null and the alternative hypotheses for the test. [2]

Markscheme

H0 : The running times of the movies can be modelled by $N(88, 6.75^2)$

H1 : The running times of the movies cannot be modelled by $N(88, 6.75^2)$ A1A1

Note: Award A1 for each correct hypothesis that includes a reference to normal distribution with a mean of 88 and a standard deviation of 6.75 (or variance of $6.75^2$). “Correlation”, “independence”, “association”, and “relationship” are incorrect.

Award at most A0A1 for correctly worded hypotheses that include a reference to a normal distribution but omit the
distribution’s parameters in one or both hypotheses. Award A0A1 for correct hypotheses that are reversed.

[2 marks]

As part of the test, the following table is created.

(e.i) Find the value of a and the value of b. [4]

Markscheme

$T \sim N(88, 6.75^2)$

attempt to find normal probability in either correct range (M1)

P(85 ≤ T < 90) OR P(T ≥ 95)

recognition of multiplying either of their probabilities by 200 (M1)

0. 288137 … × 200 OR 0. 149859 … × 200

a = 57. 6 (57. 6274 …) , b = 30. 0 (29. 9718 …) A1A1

[4 marks]
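The expected frequencies a and b in part (e)(i) are the normal probabilities for the two ranges, scaled by the sample size. A sketch using `statistics.NormalDist`, assuming (as the working above indicates) that a belongs to 85 ≤ T < 90 and b to T ≥ 95:

```python
from statistics import NormalDist

running_time = NormalDist(mu=88, sigma=6.75)
sample_size = 200

# a: expected number of movies with 85 <= T < 90
p_a = running_time.cdf(90) - running_time.cdf(85)
a = sample_size * p_a

# b: expected number of movies with T >= 95 (open-ended upper class)
p_b = 1 - running_time.cdf(95)
b = sample_size * p_b

print(round(a, 1), round(b, 1))
```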

(e.ii) Hence, perform the test to a 5 % significance level, clearly stating the conclusion in context. [4]

Markscheme

df = 4 (A1)

(p =) 0. 0166 (= 0. 0166282 …) A1

comparing their p-value to 0. 05 R1

0. 0166 < 0. 05

Note: Accept p value of 0. 0165 (= 0. 0164693 …) from using a and b to 3 sf.

(Reject $H_0$. There is sufficient evidence to say that) the data has not been drawn from the ($N(88, 6.75^2)$) distribution. A1

Note: Do not award R0A1.

The conclusion to part (e)(ii) MUST follow through from their hypotheses seen in part (d); if hypotheses are
incorrect/reversed etc., the answer to part (e)(ii) must reflect this in order for the A1 to be credited.

[4 marks]

52. [Maximum mark: 18]


The heights, h, of 200 university students are recorded in the following table.

(a.i) Write down the mid-interval value of 140 ≤ h < 160 . [1]

Markscheme

150 (cm) A1

[1 mark]

(a.ii) Calculate an estimate of the mean height of the 200 students. [2]

Markscheme

attempt to substitute values in the mean formula with at least one mid-interval value multiplied by a corresponding
frequency (M1)

(mean =) 176 (176. 3) (cm) A1

[2 marks]
This table is used to create the following cumulative frequency graph.

(b) Use the cumulative frequency curve to estimate the interquartile range. [2]

Markscheme

183 OR 168 seen (A1)

Note: These values may be seen in the working for part (c).

(IQR = 183 − 168 =) 15 (cm) A1

[2 marks]

Laszlo is a student in the data set and his height is 204 cm.

(c) Use your answer to part (b) to estimate whether Laszlo’s height is an outlier for this data. Justify your answer. [3]

Markscheme

(upper bound =) 183 + 1. 5 × 15 OR 205. 5 seen A1

205. 5 > 204 OR 204 − 183 < 22. 5 OR 204 − 22. 5 < 183 R1

Laszlo’s height is not an outlier A1

Note: Do not award R0A1.

[3 marks]

It is believed that the heights of university students follow a normal distribution with mean 176 cm and standard deviation
13. 5 cm.

It is decided to perform a $\chi^2$ goodness of fit test on the data to determine whether this sample of 200 students could have plausibly been drawn from an underlying distribution $N(176, 13.5^2)$.

(d) Write down the null and the alternative hypotheses for the test. [2]

Markscheme

H0 : The heights of the students can be modelled by $N(176, 13.5^2)$

H1 : The heights of the students cannot be modelled by $N(176, 13.5^2)$ A1A1

Note: Award A1 for each correct hypothesis that includes a reference to normal distribution with a mean of 176 and a standard deviation of 13.5 (or variance of $13.5^2$). “Correlation”, “independence”, “association”, and “relationship” are incorrect.

Award at most A0A1 for correctly worded hypotheses that include a reference to a normal distribution but omit the
distribution’s parameters in one or both hypotheses. Award A0A1 for correct hypotheses that are reversed.

[2 marks]

As part of the test, the following table is created.

(e.i) Find the value of a and the value of b. [4]

Markscheme

$h \sim N(176, 13.5^2)$

attempt to find normal probability in either correct range (M1)

P(170 ≤ h < 180) OR P(h ≥ 190)

recognition of multiplying either of their probabilities by 200 (M1)

0. 288137 … × 200 OR 0. 149859 … × 200

a = 57. 6 (57. 6274 …), b = 30. 0 (29. 9718 …) A1A1

[4 marks]

(e.ii) Hence, perform the test to a 5 % significance level, clearly stating the conclusion in context. [4]

Markscheme

df = 4 (A1)

(p =) 0. 0166 (= 0. 0166282 …) A1

comparing their p-value to 0. 05 R1

0. 0166 < 0. 05

Note: Accept p value of 0.0165 (= 0.0164693…) from using a and b to 3 sf.

(Reject $H_0$. There is sufficient evidence to say that) the data has not been drawn from the ($N(176, 13.5^2)$) distribution. A1

Note: Do not award R0A1.

The conclusion to part (e)(ii) MUST follow through from their hypotheses seen in part (d); if hypotheses are
incorrect/reversed etc., the answer to part (e)(ii) must reflect this in order for the A1 to be credited.

[4 marks]

53. [Maximum mark: 15]


The mean annual temperatures for Earth, recorded at fifty-year intervals, are shown in the table.

Year (x)              1708  1758  1808  1858  1908  1958  2008
Temperature °C (y)    8.73  9.22  9.10  9.12  9.13  9.45  9.76

Tami creates a linear model for this data by finding the equation of the straight line passing through the points with coordinates
(1708, 8. 73) and (1958, 9. 45).

(a) Calculate the gradient of the straight line that passes through these two points. [2]

Markscheme

$\frac{9.45 - 8.73}{1958 - 1708}$ (M1)

= 0.00288 ($\frac{9}{3125}$) A1

[2 marks]

(b.i) Interpret the meaning of the gradient in the context of the question. [1]

Markscheme

the (mean) yearly change in (mean annual) temperature A1

Note: Accept equivalent statements, e.g. “rate of change of temperature”.

[1 mark]

(b.ii) State appropriate units for the gradient. [1]

Markscheme
°C / year OR degrees C per year A1

Note: Do not follow through from part (b)(i) into (b)(ii).

[1 mark]

(c) Find the equation of this line giving your answer in the form y = mx + c . [2]

Markscheme

attempt to substitute point and gradient into appropriate formula (M1)

8.73 = 0.00288 × 1708 + c ⇒ c = 3.81096

OR

9.45 = 0.00288 × 1958 + c ⇒ c = 3.81096

equation is y = 0. 00288x + 3. 81 A1

[2 marks]

(d) Use Tami’s model to estimate the mean annual temperature in the year 2000. [2]

Markscheme

attempt to substitute 2000 into their part (c) (M1)

0.00288 × 2000 + 3.81096

= 9. 57 (°C) (9. 57096 …) A1

[2 marks]
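Tami's two-point model in parts (a) to (d) can be reproduced in a few lines. This is a sketch with illustrative variable names; it uses only the two points named in the question.

```python
# The two points Tami uses, taken from the table: (year, mean temperature in °C).
x1, y1 = 1708, 8.73
x2, y2 = 1958, 9.45

# (a) gradient = change in temperature / change in year, in °C per year
m = (y2 - y1) / (x2 - x1)

# (c) substitute one point into y = mx + c to find the intercept
c = y1 - m * x1

# (d) estimate for the year 2000
estimate_2000 = m * 2000 + c

print(round(m, 5), round(c, 2), round(estimate_2000, 2))
```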

Thandizo uses linear regression to obtain a model for the data.

(e.i) Find the equation of the regression line y on x. [2]

Markscheme

y = 0. 00256x + 4. 46 (0. 00255714 … x + 4. 46454 …) (M1)A1

Note: Award (M1)A0 for answers that show the correct method, but are presented incorrectly (e.g. no “y = ” or truncated values
etc.). Accept 4. 465 as the correct answer to 4 sf.

[2 marks]
(e.ii) Find the value of r, the Pearson’s product-moment correlation coefficient. [1]

Markscheme

0. 861 (0. 861333 …) A1

[1 mark]

(f ) Use Thandizo’s model to estimate the mean annual temperature in the year 2000. [2]

Markscheme

attempt to substitute 2000 into their part (e)(i) (M1)

0.00255714… × 2000 + 4.46454…

= 9. 58 (°C) (9. 57882 … (°C)) A1

Note: Award A1 for 9. 57 from 0. 00255714 × 2000 + 4. 46.

[2 marks]
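Thandizo's regression line, the value of r and the estimate for 2000 can be recomputed directly from the table using the standard least-squares formulas. The sketch below is an illustration with assumed variable names, not the GDC procedure the markscheme expects.

```python
import math

# Data from the table: year (x) and mean annual temperature in °C (y).
years = [1708, 1758, 1808, 1858, 1908, 1958, 2008]
temps = [8.73, 9.22, 9.10, 9.12, 9.13, 9.45, 9.76]

count = len(years)
mean_x = sum(years) / count
mean_y = sum(temps) / count

# Sums of squares and cross-products about the means.
s_xx = sum((x - mean_x) ** 2 for x in years)
s_yy = sum((y - mean_y) ** 2 for y in temps)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, temps))

slope = s_xy / s_xx                     # (e)(i) gradient of y on x
intercept = mean_y - slope * mean_x     # (e)(i) y-intercept
r = s_xy / math.sqrt(s_xx * s_yy)       # (e)(ii) Pearson's r
estimate_2000 = slope * 2000 + intercept  # (f)

print(round(slope, 5), round(intercept, 2), round(r, 3), round(estimate_2000, 2))
```

Keeping the unrounded `slope` and `intercept` for the prediction is what gives 9.58 rather than the 9.57 mentioned in the note for a rounded intercept.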

Thandizo uses his regression line to predict the year when the mean annual temperature will first exceed 15 °C.

(g) State two reasons why Thandizo’s prediction may not be valid. [2]

Markscheme

cannot (always reliably) make a prediction of x from a value of y, when using a y on x line / regression line is not x on y A1

extrapolation A1

[2 marks]

54. [Maximum mark: 17]


It is claimed that a new remedy cures 82% of the patients with a particular medical problem.

This remedy is to be used by 115 patients, and it is assumed that the 82% claim is true.

(a) Find the probability that exactly 90 of these patients will be cured. [3]

Markscheme

recognition of binomial distribution (M1)

e.g. X~B(115, 0. 82) OR binompdf(115, 0. 82, 90) etc.


(P(X = 90) =) 0.0535 (0.0535325…) A2

Note: Award (M1)A1A0 for an answer of 0. 054 with or without working shown.

[3 marks]

(b) Find the probability that at least 95 of these patients will be cured. [2]

Markscheme

selecting correct region of distribution (M1)

e.g. P(X ≥ 95) OR 1 − P(X ≤ 94) OR 1− binomcdf(115, 0. 82, 94)

0. 491 (0. 491036 …) A1

[2 marks]

(c) Find the variance in the possible number of patients that will be cured. [2]

Markscheme

substitution in the variance formula for binomial distribution (M1)

115 × 0. 82 × 0. 18

17. 0 (16. 974) A1

Note: Allow 17 for the final answer.

[2 marks]

The probability that at least n patients will be cured is less than 30%.

(d) Find the least value of n. [3]

Markscheme

METHOD 1

attempt to write an expression containing n inside the brackets of P ( ) AND

including 0. 3 or 0. 7 (M1)

P(X ≥ n) < 0. 3 OR P(X ≤ n − 1) > 0. 7 (A1)

n = 98 A1

METHOD 2

using binomcdf in GDC for at least two different values of n greater than 90 (M1)

EITHER

(P(X < 97) =) 0. 696683 … AND (P(X < 98) =) 0. 778249 … (seen) (A1)

OR

(P(X ≥ 97) =) 0.303316… AND (P(X ≥ 98) =) 0.221750… (seen) (A1)

THEN

n = 98 A1

[3 marks]
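Parts (a) to (d) can be checked with exact binomial sums built from `math.comb`. This is a sketch standing in for the GDC's binompdf and binomcdf; the helper names `pmf` and `at_least` are illustrative.

```python
from math import comb

N, P = 115, 0.82  # number of patients and claimed cure probability


def pmf(k: int) -> float:
    """P(X = k) for X ~ B(N, P)."""
    return comb(N, k) * P**k * (1 - P) ** (N - k)


def at_least(k: int) -> float:
    """P(X >= k) for X ~ B(N, P)."""
    return sum(pmf(i) for i in range(k, N + 1))


p_exactly_90 = pmf(90)         # part (a)
p_at_least_95 = at_least(95)   # part (b)
variance = N * P * (1 - P)     # part (c): np(1 - p)

# Part (d): P(X >= n) decreases as n grows, so the first n with
# P(X >= n) < 0.3 is the least such n.
least_n = next(n for n in range(N + 1) if at_least(n) < 0.3)

print(round(p_exactly_90, 4), round(p_at_least_95, 3), round(variance, 3), least_n)
```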

A clinic is interested to see if the mean recovery time of their patients who tried the new remedy is less than that of their patients
who continued with an older remedy. The clinic randomly selects some of their patients and records their recovery time in days.
The results are shown in the table below.

The data is assumed to follow a normal distribution and the population variance is the same for the two groups. A t-test is used to
compare the means of the two groups at the 10% significance level.

(e) State the appropriate null and alternative hypotheses for this t-test. [2]

Markscheme

($\mu_1$: population mean recovery time for new remedy)

($\mu_2$: population mean recovery time for old remedy)

$H_0: \mu_1 = \mu_2$ ($H_0: \mu_1 - \mu_2 = 0$) A1

$H_1: \mu_1 < \mu_2$ ($H_1: \mu_1 - \mu_2 < 0$) A1

Note: Accept an equivalent statement in words, must include mean and reference to “population mean”, e.g. “mean for all patients on old remedy”, for the first A1 to be awarded.

Do not accept an imprecise “the means are equal”.

Award A0A1 for reversed hypotheses ($H_0: \mu_1 < \mu_2$, $H_1: \mu_1 = \mu_2$).

[2 marks]

(f ) Find the p-value for this test. [2]

Markscheme

0. 0620 (0. 0620061 …) A2

Note: Allow 0. 062 as final answer. Award A1 for an answer of 0. 06. Award A1 for an answer of 0. 0527756 … from use of
unpooled setting.

Follow through from an incorrect alternative hypothesis as long as their p-value matches their alternative hypothesis.

[2 marks]

(g) State the conclusion for this test. Give a reason for your answer. [2]

Markscheme

0. 0620 < 0. 1 R1

(sufficient evidence to) reject $H_0$ A1

Note: Do not award R0A1. Accept “p-value is less than 0. 1” provided an answer was seen in part (f ).

[2 marks]

(h) Explain what the p-value represents. [1]

Markscheme

the probability of obtaining results (at least as extreme) as those observed given that the null hypothesis is true A1

[1 mark]

55. [Maximum mark: 17]


Elsie, a librarian, wants to investigate the length of time, T minutes, that people spent in her library on a particular day.
(a) State whether the variable T is discrete or continuous. [1]

Markscheme

continuous A1

[1 mark]

Elsie’s data for 160 people who visited the library on that particular day is shown in the following table.

(b) Find the value of k. [2]

Markscheme

160 − 50 − 62 − 14 − 8 (M1)

(k =) 26 A1

[2 marks]

(c.i) Write down the modal class. [1]

Markscheme

20 ≤ T < 40 A1

[1 mark]

(c.ii) Write down the mid-interval value for this class. [1]

Markscheme

30 A1

[1 mark]

(d) Use Elsie’s data to calculate an estimate of the mean time that people spent in the library. [2]

Markscheme

33. 5 minutes A2
Note: FT from their value of k and their mid-interval value. Follow through from part (c)(ii) but only if mid-interval value lies in
their interval.

[2 marks]
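The estimate of the mean in part (d) can be sketched as a weighted average of mid-interval values. The frequency table itself is not reproduced in this text, so the class boundaries below are an assumption: five classes of width 20 from 0 to 100, with the frequencies 50, 62, 26, 14 and 8 placed so that they agree with the modal class in part (c)(i) and the counts used in parts (b), (e) and (f).

```python
# (lower bound, upper bound, frequency). Only the class 20 <= T < 40 is
# stated explicitly in the markscheme; the other bounds are assumed.
classes = [
    (0, 20, 50),
    (20, 40, 62),
    (40, 60, 26),   # k = 26 from part (b)
    (60, 80, 14),
    (80, 100, 8),
]

total = sum(freq for _, _, freq in classes)

# Estimate of the mean: sum of (mid-interval value * frequency) / total.
mean_estimate = sum((lo + hi) / 2 * freq for lo, hi, freq in classes) / total

# Modal class: the class with the largest frequency.
modal_lo, modal_hi, _ = max(classes, key=lambda c: c[2])

print(total, mean_estimate, (modal_lo, modal_hi))
```

Under these assumed boundaries the sketch reproduces the 33.5 minutes in the markscheme, which supports the reconstruction but does not confirm it.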

(e) Using the table, write down the maximum possible number of people who spent 35 minutes or less in the library
on that day. [1]

Markscheme

112 A1

[1 mark]

Elsie assumes her data to be representative of future visitors to the library.

(f ) Find the probability a visitor spends at least 60 minutes in the library. [2]

Markscheme

$\frac{22}{160}$ [0.138, 0.1375, 13.75%, $\frac{11}{80}$] A1A1

Note: Award A1 for correct numerator, A1 for correct denominator.

[2 marks]

The following box and whisker diagram shows the times, in minutes, that the 160 visitors spent in the library.

(g) Write down the median time spent in the library. [1]

Markscheme

26 minutes A1

[1 mark]

(h) Find the interquartile range. [2]


Markscheme

50 − 16 (M1)

Note: Award M1 for both correct quartiles seen.

34 minutes A1

[2 marks]

(i) Hence show that the longest time that a person spent in the library is not an outlier. [3]

Markscheme

correct substitution into outlier formula (M1)

50 + 1. 5 × 34

= 101 A1

92 < 101 OR highest value on diagram < 101 R1

not an outlier AG

Note: Award R1 for their correct comparison. Follow through from their part (h). Award R0 if their conclusion is “it is an outlier”,
this contradicts Elsie’s belief.

[3 marks]

Elsie believes the box and whisker diagram indicates that the times spent in the library are not normally distributed.

(j) Identify one feature of the box and whisker diagram which might support Elsie’s belief. [1]

Markscheme

EITHER

the diagram is not symmetric or equivalent, e.g. the median is not in the centre of the box, or the lengths of the whiskers are (very) different, or (positive or right) skew

OR

the mean and median are (very) different A1

[1 mark]
56. [Maximum mark: 16]

At Mirabooka Primary School, a survey found that 68% of students have a dog and 36% of students have a cat. 14% of students
have both a dog and a cat.

This information can be represented in the following Venn diagram, where m, n, p and q represent the percentage of students
within each region.

Find the value of

(a.i) m . [1]

Markscheme

(m =) 54% A1

Note: Based on their n, follow through for parts (i) and (iii), but only if it does not contradict the given information. Follow
through for part (iv) but only if the total is 100%.

[1 mark]

(a.ii) n . [1]

Markscheme

(n =) 14% A1

Note: Based on their n, follow through for parts (i) and (iii), but only if it does not contradict the given information. Follow
through for part (iv) but only if the total is 100%.

[1 mark]

(a.iii) p. [1]

Markscheme
(p =) 22% A1

Note: Based on their n, follow through for parts (i) and (iii), but only if it does not contradict the given information. Follow
through for part (iv) but only if the total is 100%.

[1 mark]

(a.iv) q . [1]

Markscheme

(q =) 10% A1

Note: Based on their n, follow through for parts (i) and (iii), but only if it does not contradict the given information. Follow
through for part (iv) but only if the total is 100%.

[1 mark]

(b) Find the percentage of students who have a dog or a cat or both. [1]

Markscheme

90 (%) A1

Note: Award A0 for a decimal answer.

[1 mark]

Find the probability that a randomly chosen student

(c.i) has a dog but does not have a cat. [1]

Markscheme

0.54 ($\frac{54}{100}$, $\frac{27}{50}$, 54%) A1

[1 mark]

(c.ii) has a dog given that they do not have a cat. [2]

Markscheme

$\frac{54}{64}$ (0.844, $\frac{27}{32}$, 84.4%, 0.84375) A1A1

Note: Award A1 for a correct denominator (0.64 or 64 seen), A1 for the correct final answer.

[2 marks]
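The four regions in part (a) and the conditional probability in part (c)(ii) follow from the three survey percentages. A short sketch, assuming m is "dog only", n is "both", p is "cat only" and q is "neither", which matches the values in the markscheme:

```python
from fractions import Fraction

# Survey percentages given in the question.
dog, cat, both = 68, 36, 14

m = dog - both            # dog only
n = both                  # dog and cat
p = cat - both            # cat only
q = 100 - (m + n + p)     # neither

dog_or_cat = m + n + p    # part (b)

# (c)(ii) P(dog | no cat): restrict to the students without a cat (m + q).
p_dog_given_no_cat = Fraction(m, m + q)

print(m, n, p, q, dog_or_cat, p_dog_given_no_cat)
```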

Each year, one student is chosen randomly to be the school captain of Mirabooka Primary School.

Tim is using a binomial distribution to make predictions about how many of the next 10 school captains will own a dog. He
assumes that the percentages found in the survey will remain constant for future years and that the events “being a school captain”
and “having a dog” are independent.

Use Tim’s model to find the probability that in the next 10 years

(d.i) 5 school captains have a dog. [2]

Markscheme

recognizing Binomial distribution with correct parameters (M1)

X~B(10, 0. 68)

(P(X = 5) =) 0. 123 (0. 122940 … , 12. 3%) A1

[2 marks]

(d.ii) more than 3 school captains have a dog. [2]

Markscheme

1 − P(X ≤ 3) OR P(X ≥ 4) OR P(4 ≤ X ≤ 10) (M1)

0. 984 (0. 984497 … , 98. 4%) A1

[2 marks]

(d.iii) exactly 9 school captains in succession have a dog. [3]

Markscheme

$(0.68)^9 \times 0.32$ (M1)

recognition of two possible cases (M1)

$2 \times ((0.68)^9 \times 0.32)$

0. 0199 (0. 0198957 … , 1. 99%) A1


[3 marks]
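The "two possible cases" in part (d)(iii) can be confirmed by brute force: enumerate all 2^10 sequences of dog owner / non-owner and keep those whose longest run of dog owners is exactly 9. This sketch interprets the question that way, which matches the markscheme's working; the helper name `longest_dog_run` is illustrative.

```python
from itertools import groupby, product

P_DOG = 0.68   # probability a school captain has a dog (Tim's model)
YEARS = 10


def longest_dog_run(seq) -> int:
    """Length of the longest block of consecutive True values in seq."""
    return max((len(list(group)) for owns, group in groupby(seq) if owns), default=0)


matching = []
p_run_of_nine = 0.0
for seq in product([True, False], repeat=YEARS):
    if longest_dog_run(seq) == 9:
        dogs = sum(seq)
        # Independent years: multiply the probability for each year.
        p_run_of_nine += P_DOG**dogs * (1 - P_DOG) ** (YEARS - dogs)
        matching.append(seq)

print(len(matching), round(p_run_of_nine, 4))
```

Only the sequences with the single non-owner in the first or the last year qualify, which is where the factor of 2 comes from.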

John randomly chooses 10 students from the survey.

(e) State why John should not use the binomial distribution to find the probability that 5 of these students have a
dog. [1]

Markscheme

EITHER

the probability is not constant A1

OR

the events are not independent A1

OR

the events should be modelled by the hypergeometric distribution instead A1

[1 mark]

57. [Maximum mark: 16]


The scores of the eight highest scoring countries in the 2019 Eurovision song contest are shown in the following table.

For this data, find

(a.i) the upper quartile. [2]

Markscheme

$\frac{370 + 472}{2}$ (M1)

Note: This (M1) can also be awarded for either a correct $Q_3$ or a correct $Q_1$ in part (a)(ii).

$Q_3 = 421$ A1

[2 marks]

(a.ii) the interquartile range. [2]

Markscheme

their part (a)(i) − their $Q_1$ (clearly stated) (M1)

IQR = (421 − 318 =) 103 A1

[2 marks]

(b) Determine if the Netherlands’ score is an outlier for this data. Justify your answer. [3]

Markscheme

($Q_3$ + 1.5 (IQR) =) 421 + (1.5 × 103) (M1)

= 575. 5

since 498 < 575. 5 R1

Netherlands is not an outlier A1

Note: The R1 is dependent on the (M1). Do not award R0A1.

[3 marks]

Chester is investigating the relationship between the highest-scoring countries’ Eurovision score and their population size to
determine whether population size can reasonably be used to predict a country’s score.

The populations of the countries, to the nearest million, are shown in the table.

Chester finds that, for this data, the Pearson’s product moment correlation coefficient is r = 0.249.

(c) State whether it would be appropriate for Chester to use the equation of a regression line for y on x to predict a
country’s Eurovision score. Justify your answer. [2]

Markscheme

not appropriate (“no” is sufficient) A1

as r is too close to zero / too weak a correlation R1

[2 marks]

Chester then decides to find the Spearman’s rank correlation coefficient for this data, and creates a table of ranks.

Write down the value of:

(d.i) a . [1]

Markscheme

6 A1

[1 mark]

(d.ii) b. [1]

Markscheme

4. 5 A1

[1 mark]

(d.iii) c . [1]
Markscheme

4. 5 A1

[1 mark]

(e.i) Find the value of the Spearman’s rank correlation coefficient r_s. [2]

Markscheme

r_s = 0.683 (0.682646…) A2

[2 marks]

(e.ii) Interpret the value obtained for r_s. [1]

Markscheme

EITHER

there is a (positive) association between the population size and the score A1

OR

there is a (positive) linear correlation between the ranks of the population size and the ranks of the scores (when compared
with the PMCC of 0. 249). A1

[1 mark]

(f) When calculating the ranks, Chester incorrectly read the Netherlands’ score as 478. Explain why the value of the
Spearman’s rank correlation r_s does not change despite this error. [1]

Markscheme

lowering the top score by 20 does not change its rank so r_s is unchanged R1

Note: Accept “this would not alter the rank” or “Netherlands still top rank” or similar. Condone any statement that clearly
implies the ranks have not changed, for example: “The Netherlands still has the highest score.”

[1 mark]

58. [Maximum mark: 15]


The aircraft for a particular flight has 72 seats. The airline’s records show that historically for this flight only 90% of the people who
purchase a ticket arrive to board the flight. They assume this trend will continue and decide to sell extra tickets and hope that no
more than 72 passengers will arrive.

The number of passengers that arrive to board this flight is assumed to follow a binomial distribution with a probability of 0. 9.
(a) The airline sells 74 tickets for this flight. Find the probability that more than 72 passengers arrive to board the
flight. [3]

Markscheme

(let T be the number of passengers who arrive)

(P(T > 72) =) P(T ≥ 73) OR 1 − P(T ≤ 72) (A1)

T ~B(74, 0. 9) OR n = 74 (M1)

= 0. 00379 (0. 00379124 …) A1

Note: Using the distribution B(74, 0. 1), to work with the 10% that do not arrive for the flight, here and throughout this
question, is a valid approach.

[3 marks]
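
As a check on part (a), the tail probability can be computed directly from the binomial probability function. The sketch below is illustrative rather than part of the markscheme; `binom_pmf` is a helper written for this example, and the second calculation uses the B(74, 0.1) approach mentioned in the note above.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n_tickets, p_arrive = 74, 0.9

# P(T >= 73): more than 72 passengers arrive.
p_over = sum(binom_pmf(k, n_tickets, p_arrive) for k in range(73, n_tickets + 1))

# Equivalent view: at most one ticket holder fails to arrive, with B(74, 0.1).
p_over_alt = sum(binom_pmf(k, n_tickets, 0.1) for k in range(0, 2))

print(round(p_over, 5))
```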

(b.i) Write down the expected number of passengers who will arrive to board the flight if 72 tickets are sold. [2]

Markscheme

72 × 0. 9 (M1)

64. 8 A1

[2 marks]

(b.ii) Find the maximum number of tickets that could be sold if the expected number of passengers who arrive to board
the flight must be less than or equal to 72. [2]

Markscheme

n × 0. 9 = 72 (M1)

80 A1

[2 marks]

Each passenger pays $150 for a ticket. If too many passengers arrive, then the airline will give $300 in compensation to each
passenger that cannot board.

(c) Find, to the nearest integer, the expected increase or decrease in the money made by the airline if they decide to
sell 74 tickets rather than 72. [8]

Markscheme
METHOD 1

EITHER

when selling 74 tickets

top row A1A1

bottom row A1A1

Note: Award A1A1 for each row correct. Award A1 for one correct entry and A1 for the remaining entries correct.

E(I ) = 11100 × 0. 9962 … + 10800 × 0. 00338 … + 10500 × 0. 000411 ≈ 11099 (M1)A1

OR

income is 74 × 150 = 11100 (A1)

expected compensation is

0.003380… × 300 + 0.0004110… × 600 (= 1.26070…) (M1)A1A1

expected income when selling 74 tickets is 11100 − 1. 26070 … (M1)

= 11098. 73 … (= $11099) A1

THEN

income for 72 tickets = 72 × 150 = 10800 (A1)

so expected gain ≈ 11099 − 10800 = $299 A1

METHOD 2

for 74 tickets sold, let C be the compensation paid out

P(T = 73) = 0. 00338014 … , P(T = 74) = 0. 000411098 … A1A1

E(C) = 0.003380… × 300 + 0.0004110… × 600 (= 1.26070…) (M1)A1A1

extra expected revenue = 300 − 1. 01404 … − 0. 246658 … (300 − 1. 26070 …) (A1)(M1)

Note: Award A1 for the 300 and M1 for the subtraction.

= $299 (to the nearest dollar) A1

METHOD 3
let D be the change in income when selling 74 tickets.

(A1)(A1)

Note: Award A1 for one error, however award A1A1 if there is no explicit mention that T = 73 would result in D = 0 and the
other two are correct.

P(T ≤ 72) = 0.9962…, P(T = 74) = 0.000411098… A1A1

E(D) = 300 × 0.9962… + 0 × 0.003380… − 300 × 0.0004110… (M1)A1A1

= $299 A1

[8 marks]
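
All three methods in part (c) compare the expected income from 74 tickets with the income from 72 tickets. A minimal Python sketch of that comparison follows. It assumes, as in the question, 72 seats, $150 per ticket, $300 compensation per passenger who cannot board, and arrivals following B(n, 0.9); the function name `expected_income` is illustrative.

```python
from math import comb

SEATS, PRICE, COMPENSATION, P_ARRIVE = 72, 150, 300, 0.9

def expected_income(tickets: int) -> float:
    """Ticket revenue minus expected compensation for a given number of tickets sold."""
    revenue = tickets * PRICE
    expected_compensation = 0.0
    # Compensation is only paid when more passengers arrive than there are seats.
    for arrivals in range(SEATS + 1, tickets + 1):
        prob = (comb(tickets, arrivals)
                * P_ARRIVE**arrivals
                * (1 - P_ARRIVE) ** (tickets - arrivals))
        expected_compensation += prob * COMPENSATION * (arrivals - SEATS)
    return revenue - expected_compensation

gain = expected_income(74) - expected_income(72)
print(round(gain))
```

With 72 tickets no compensation can arise, so the expected income is simply 72 × 150.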

59. [Maximum mark: 17]


Mackenzie conducted an experiment on the reaction times of teenagers. The results of the experiment are displayed in the
following cumulative frequency graph.

Use the graph to estimate the

(a.i) median reaction time. [1]

Markscheme

0. 58 (s) A1
[1 mark]

(a.ii) interquartile range of the reaction times. [3]

Markscheme

0. 7 − 0. 42 (A1)(M1)

Note: Award A1 for correct quartiles seen, M1 for subtraction of their quartiles.

0. 28 (s) A1

[3 marks]

(b) Find the estimated number of teenagers who have a reaction time greater than 0. 4 seconds. [2]

Markscheme

9 (people have reaction time ≤ 0. 4 ) (A1)

31 (people have reaction time > 0. 4 ) A1

[2 marks]

(c) Determine the 90th percentile of the reaction times from the cumulative frequency graph. [2]

Markscheme

(90% × 40 =) 36 OR 4 (A1)

0. 8 s A1

[2 marks]

Mackenzie created the cumulative frequency graph using the following grouped frequency table.

(d.i) Write down the value of a. [1]


Markscheme

(a =) 6 A1

[1 mark]

(d.ii) Write down the value of b. [1]

Markscheme

(b =) 4 A1

[1 mark]

(e) Write down the modal class from the table. [1]

Markscheme

0. 6 < t ≤ 0. 8 A1

[1 mark]

(f ) Use your graphic display calculator to find an estimate of the mean reaction time. [2]

Markscheme

0. 55 s A2

[2 marks]

Upon completion of the experiment, Mackenzie realized that some values were grouped incorrectly in the frequency table. Some
reaction times recorded in the interval 0 < t ≤ 0. 2 should have been recorded in the interval 0. 2 < t ≤ 0. 4.

(g) Suggest how, if at all, the estimated mean and estimated median reaction times will change if the errors are
corrected. Justify your response. [4]

Markscheme

the mean will increase A1

because the incorrect reaction times are moving from a lower interval to a higher interval which will increase the numerator
of the mean calculation R1

the median will stay the same A1


because the median or middle of the data is greater than both intervals being changed R1

Note: Do not award A1R0.

[4 marks]

60. [Maximum mark: 16]


A group of 1280 students were asked which electronic device they preferred. The results per age group are given in the following
table.

A student from the group is chosen at random. Calculate the probability that the student

(a.i) prefers a tablet. [2]

Markscheme

560/1280 (7/16, 0.4375) A1A1

Note: Award A1 for correct numerator, A1 for correct denominator.

[2 marks]

(a.ii) is 11–13 years old and prefers a mobile phone. [2]

Markscheme

72/1280 (9/160, 0.05625) A1A1

Note: Award A1 for correct numerator, A1 for correct denominator.

[2 marks]

(a.iii) prefers a laptop given that they are 17–18 years old. [2]
Markscheme

153/348 (51/116, 0.439655…) A1A1

Note: Award A1 for correct numerator, A1 for correct denominator.

[2 marks]

(a.iv) prefers a tablet or is 14–16 years old. [3]

Markscheme

160 + 224 + 128 + 205 + 131 OR 560 + 512 − 224 (M1)

848/1280 (53/80, 0.6625) A1A1

Note: Award A1 for correct denominator (1280) seen, (M1) for correct calculation of the numerator, A1 for the correct answer.

[3 marks]
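
The second form of the numerator in part (a)(iv), 560 + 512 − 224, is the addition rule for a union. A short sketch with exact fractions is given below. The reading of 512 as the number of students aged 14–16 and 224 as the overlap is inferred from the markscheme working, since the table itself is not reproduced here.

```python
from fractions import Fraction

total = 1280
tablet = 560        # students who prefer a tablet
aged_14_16 = 512    # students aged 14-16 (inferred from the working)
both = 224          # aged 14-16 and prefer a tablet (inferred from the working)

# Addition rule: n(A or B) = n(A) + n(B) - n(A and B).
either = tablet + aged_14_16 - both
p_either = Fraction(either, total)

print(either, p_either, float(p_either))
```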

A χ² test for independence was performed on the collected data at the 1% significance level. The critical value for the test is
13.277.

(b) State the null and alternative hypotheses. [1]

Markscheme

H0 : the variables are independent

H1 : the variables are dependent A1

Note: Award A1 for both hypotheses correct. Do not accept “not correlated” or “not related” in place of “independent”.

[1 mark]

(c) Write down the number of degrees of freedom. [1]

Markscheme

4 A1

[1 mark]

(d.i) Write down the χ² test statistic. [2]
Markscheme


(χ² =) 23.3 (23.3258…) A2

[2 marks]

(d.ii) Write down the p-value. [1]

Markscheme

0.000109 (0.000108991…) OR 1.09 × 10⁻⁴ A1

[1 mark]

(d.iii) State the conclusion for the test in context. Give a reason for your answer. [2]

Markscheme

EITHER

23. 3 > 13. 277 R1

OR

0. 000109 < 0. 01 R1

THEN

(there is sufficient evidence to accept H1 that) preferred device and age group are not independent A1

Note: For the final A1 the answer must be in context. Do not award A1R0.

[2 marks]
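
The p-value in part (d)(ii) and the critical value can be cross-checked without a statistics library. With 4 degrees of freedom the upper tail of the χ² distribution has the closed form e^(−x/2)(1 + x/2). The sketch below uses the truncated statistic 23.3258 quoted above, so the last digits of the p-value may differ slightly from the calculator value; the function name is illustrative.

```python
from math import exp

def chi2_upper_tail_df4(x: float) -> float:
    """P(X > x) for a chi-squared variable with 4 degrees of freedom."""
    return exp(-x / 2) * (1 + x / 2)

statistic = 23.3258       # truncated value from part (d)(i)
critical_value = 13.277   # given in the question for the 1% level

p_value = chi2_upper_tail_df4(statistic)
tail_at_critical = chi2_upper_tail_df4(critical_value)

# Either comparison leads to the same decision about H0.
reject_h0 = statistic > critical_value and p_value < 0.01
print(p_value, tail_at_critical, reject_h0)
```

The closed form only applies to 4 degrees of freedom; for other tables a calculator or library routine would be needed.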

61. [Maximum mark: 14]


Arianne plays a game of darts.
The distance that her darts land from the centre, O, of the board can be modelled by a normal distribution with mean 10 cm and
standard deviation 3 cm.
Find the probability that

(a.i) a dart lands less than 13 cm from O. [2]

Markscheme

Let X be the random variable “distance from O”.

X ~ N(10, 3²)

P(X < 13) = 0. 841 (0. 841344 …) (M1)(A1)

[2 marks]

(a.ii) a dart lands more than 15 cm from O. [1]

Markscheme

(P(X > 15) =) 0. 0478 (0. 0477903) A1

[1 mark]

Each of Arianne’s throws is independent of her previous throws.

(b) Find the probability that Arianne throws two consecutive darts that land more than 15 cm from O. [2]

Markscheme

P(X > 15) × P(X > 15) (M1)

= 0. 00228 (0. 00228391 …) A1

[2 marks]
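
The values in parts (a) and (b) can be checked with Python's standard library. This is a sketch for verification only, using the model X ~ N(10, 3²) stated in the question; the variable names are illustrative.

```python
from statistics import NormalDist

# Distance from O in cm, modelled as N(10, 3^2).
distance = NormalDist(mu=10, sigma=3)

p_less_13 = distance.cdf(13)         # part (a)(i)
p_more_15 = 1 - distance.cdf(15)     # part (a)(ii)

# Part (b): throws are independent, so the probabilities multiply.
p_two_in_a_row = p_more_15 ** 2

print(round(p_less_13, 3), round(p_more_15, 4), round(p_two_in_a_row, 5))
```
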
In a competition a player has three darts to throw on each turn. A point is scored if a player throws all three darts to land within a
central area around O. When Arianne throws a dart the probability that it lands within this area is 0. 8143.

(c) Find the probability that Arianne does not score a point on a turn of three darts. [2]

Markscheme

1 − (0.8143)³ (M1)

0. 460 (0. 460050 …) A1

[2 marks]

In the competition Arianne has ten turns, each with three darts.

(d.i) Find the probability that Arianne scores at least 5 points in the competition. [3]

Markscheme

METHOD 1

let Y be the random variable “number of points scored”

evidence of use of binomial distribution (M1)

Y ~B(10, 0. 539949 …) (A1)

(P(Y ≥ 5) =) 0.717 (0.716650…) A1

METHOD 2

let Q be the random variable “number of times a point is not scored”

evidence of use of binomial distribution (M1)

Q~B(10, 0. 460050 …) (A1)

(P(Q ≤ 5) =) 0. 717 (0. 716650 …) A1

[3 marks]

(d.ii) Find the probability that Arianne scores at least 5 points and less than 8 points. [2]

Markscheme

P(5 ≤ Y < 8) (M1)

0. 628 (0. 627788 …) A1

Note: Award M1 for a correct probability statement or indication of correct lower and upper bounds, 5 and 7.
[2 marks]

(d.iii) Given that Arianne scores at least 5 points, find the probability that Arianne scores less than 8 points. [2]

Markscheme

P(5 ≤ Y < 8) / P(Y ≥ 5) (= 0.627788… / 0.716650…) (M1)

0. 876 (0. 876003 …) A1

[2 marks]
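
Parts (c) and (d) all rest on the same binomial model, Y ~ B(10, 0.8143³). A short sketch that reproduces the three probabilities in part (d) is given below; the helper `pmf` and the variable names are illustrative and not part of the markscheme.

```python
from math import comb

p_point = 0.8143 ** 3     # all three darts land in the central area
turns = 10

def pmf(k: int) -> float:
    """P(Y = k) for Y ~ B(10, p_point)."""
    return comb(turns, k) * p_point**k * (1 - p_point) ** (turns - k)

p_at_least_5 = sum(pmf(k) for k in range(5, turns + 1))   # part (d)(i)
p_5_to_7 = sum(pmf(k) for k in range(5, 8))               # part (d)(ii): 5 <= Y < 8
p_conditional = p_5_to_7 / p_at_least_5                   # part (d)(iii)

print(round(p_at_least_5, 3), round(p_5_to_7, 3), round(p_conditional, 3))
```

The conditional probability needs no extra work on the numerator because the event 5 ≤ Y < 8 is already contained in Y ≥ 5.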

62. [Maximum mark: 13]


The stopping distances for bicycles travelling at 20 km h⁻¹ are assumed to follow a normal distribution with mean 6.76 m and
standard deviation 0.12 m.

Under this assumption, find, correct to four decimal places, the probability that a bicycle chosen at random travelling at 20 km h⁻¹
manages to stop

(a.i) in less than 6. 5 m. [2]

Markscheme

evidence of correct probability (M1)

e.g. sketch OR correct probability statement, P(X < 6.5)

0. 0151 A1

[2 marks]

(a.ii) in more than 7 m. [1]

Markscheme

0. 0228 A1

Note: Answers should be given to 4 decimal places.

[1 mark]

1000 randomly selected bicycles are tested and their stopping distances when travelling at 20 km h⁻¹ are measured.

Find, correct to four significant figures, the expected number of bicycles tested that stop between
(b.i) 6. 5 m and 6. 75 m. [2]

Markscheme

multiplying their probability by 1000 (M1)

451. 7 A1

[2 marks]

(b.ii) 6. 75 m and 7 m. [1]

Markscheme

510. 5 A1

Note: Answers should be given to 4 sf.

[1 mark]
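
The probabilities in part (a) and the expected frequencies in part (b) follow from the same normal model, N(6.76, 0.12²). The sketch below is for checking the figures only; the variable names are illustrative.

```python
from statistics import NormalDist

# Stopping distance in metres, modelled as N(6.76, 0.12^2).
stopping = NormalDist(mu=6.76, sigma=0.12)
n_bicycles = 1000

p_below_6_5 = stopping.cdf(6.5)       # part (a)(i)
p_above_7 = 1 - stopping.cdf(7)       # part (a)(ii)

# Expected frequency = number tested x probability of the interval.
expected_6_5_to_6_75 = n_bicycles * (stopping.cdf(6.75) - stopping.cdf(6.5))
expected_6_75_to_7 = n_bicycles * (stopping.cdf(7) - stopping.cdf(6.75))

print(round(p_below_6_5, 4), round(p_above_7, 4))
print(round(expected_6_5_to_6_75, 1), round(expected_6_75_to_7, 1))
```

Multiplying the two tail probabilities by 1000 gives the 15.1 and 22.8 referred to in part (d).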

The measured stopping distances of the 1000 bicycles are given in the table.

It is decided to perform a χ² goodness of fit test at the 5% level of significance to decide whether the stopping distances of
bicycles travelling at 20 km h⁻¹ can be modelled by a normal distribution with mean 6.76 m and standard deviation 0.12 m.

(c) State the null and alternative hypotheses. [2]

Markscheme

H0 : stopping distances can be modelled by N(6.76, 0.12²)

H1 : stopping distances cannot be modelled by N(6.76, 0.12²) A1A1

Note: Award A1 for correct H0, including reference to the mean and standard deviation. Award A1 for the negation of their H0.

[2 marks]

(d) Find the p-value for the test. [3]

Markscheme

15. 1 or 22. 8 seen (M1)

0. 0727 (0. 0726542 … , 7. 27%) A2


[3 marks]

(e) State the conclusion of the test. Give a reason for your answer. [2]

Markscheme

0. 05 < 0. 0727 R1

there is insufficient evidence to reject H0 (or “accept H0”) A1

Note: Do not award R0A1.

[2 marks]

63. [Maximum mark: 18]


As part of his mathematics exploration about classic books, Jason investigated the time taken by students in his school to read the
book The Old Man and the Sea. He collected his data by stopping and asking students in the school corridor, until he reached his target of
10 students from each of the literature classes in his school.

(a) State which of the two sampling methods, systematic or quota, Jason has used. [1]

Markscheme

Quota sampling A1

[1 mark]

Jason constructed the following box and whisker diagram to show the number of hours students in the sample took to read this
book.

(b) Write down the median time to read the book. [1]

Markscheme

10 (hours) A1
[1 mark]

(c) Calculate the interquartile range. [2]

Markscheme

15 − 7 (M1)

Note: Award M1 for 15 and 7 seen.

8 A1

[2 marks]

Mackenzie, a member of the sample, took 25 hours to read the novel. Jason believes Mackenzie’s time is not an outlier.

(d) Determine whether Jason is correct. Support your reasoning. [4]

Markscheme

indication of a valid attempt to find the upper fence (M1)

15 + 1. 5 × 8

27 A1

25 < 27 (accept equivalent answer in words) R1

Jason is correct A1

Note: Do not award R0A1. Follow through within this part from their 27, but only if their value is supported by a valid attempt
or clearly and correctly explains what their value represents.

[4 marks]

For each student interviewed, Jason recorded the time taken to read The Old Man and the Sea (x), measured in hours, and paired this with
their percentage score on the final exam (y). These data are represented on the scatter diagram.

(e) Describe the correlation. [1]


Markscheme

“negative” seen A1

Note: Strength cannot be inferred visually; ignore “strong” or “weak”.

[1 mark]

Jason correctly calculates the equation of the regression line y on x for these students to be

y = −1. 54x + 98. 8 .

He uses the equation to estimate the percentage score on the final exam for a student who read the book in 1. 5 hours.

(f ) Find the percentage score calculated by Jason. [2]

Markscheme

correct substitution (M1)

y = −1. 54 × 1. 5 + 98. 8

96. 5 (%) (96. 49) A1

[2 marks]

(g) State whether it is valid to use the regression line y on x for Jason’s estimate. Give a reason for your answer. [2]

Markscheme

not reliable A1

extrapolation OR outside the given range of the data R1

Note: Do not award A1R0. Only accept reasoning that includes reference to the range of the data. Do not accept a contextual
reason such as 1. 5 hours is too short to read the book.

[2 marks]

Jason found a website that rated the ‘top 50’ classic books. He randomly chose eight of these classic books and recorded the
number of pages. For example, Book H is rated 44th and has 281 pages. These data are shown in the table.

Jason intends to analyse the data using Spearman’s rank correlation coefficient, r_s.

(h) Copy and complete the information in the following table. [2]

Markscheme

A1A1

Note: Do not award A1 for correct ranks for ‘number of pages’. Award A1 for correct ranks for ‘top 50 rating’.

[2 marks]

(i.i) Calculate the value of r_s. [2]

Markscheme

0. 714 (0. 714285 …) A2

Note: FT from their table.

[2 marks]

(i.ii) Interpret your result. [1]

Markscheme

EITHER

there is a (strong/moderate) positive association between the number of pages and the top 50 rating. A1

OR

there is a (strong/moderate) agreement between the rank order of number of pages and the rank order top 50 rating. A1

OR

there is a (strong/moderate) positive (linear) correlation between the rank order of number of pages and the rank order top
50 rating. A1

Note: Follow through from their value of r_s.


[1 mark]

64. [Maximum mark: 14]


It is known that the weights of male Persian cats are normally distributed with mean 6.1 kg and variance 0.5² kg².

(a) Sketch a diagram showing the above information. [2]

Markscheme

A1A1

Note: Award A1 for a normal curve with mean labelled 6. 1 or μ, A1 for indication of SD (0. 5) : marks on horizontal axis at 5. 6
and/or 6. 6 OR μ − 0. 5 and/or μ + 0. 5 on the correct side and approximately correct position.

[2 marks]

(b) Find the proportion of male Persian cats weighing between 5. 5 kg and 6. 5 kg. [2]

Markscheme

X ~ N(6.1, 0.5²)

P(5. 5 < X < 6. 5) OR labelled sketch of region (M1)

= 0. 673 (0. 673074 …) A1

[2 marks]

A group of 80 male Persian cats are drawn from this population.

(c) Determine the expected number of cats in this group that have a weight of less than 5. 3 kg. [3]

Markscheme

(P(X < 5. 3) =) 0. 0547992 … (A1)

0. 0547992 … × 80 (M1)

= 4. 38 (4. 38393 …) A1

[3 marks]

(d) It is found that 12 of the cats weigh more than x kg. Estimate the value of x. [3]
Markscheme

0. 15 OR 0. 85 (A1)

P(X > x) = 0. 15 OR P(X < x) = 0. 85 OR labelled sketch of region (M1)

6. 62 (6. 61821 …) A1

[3 marks]

(e) Ten of the cats are chosen at random. Find the probability that exactly one of them weighs over 6. 25 kg. [4]

Markscheme

(P(X > 6. 25) =) 0. 382088 … (A1)

recognition of binomial (M1)

e.g. B(10, 0. 382088 …)

0. 0502 (0. 0501768 …) A2

[4 marks]
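
Parts (b) to (e) can be verified together, since they all use X ~ N(6.1, 0.5²). The sketch below is for checking only. It treats the 12 cats out of 80 in part (d) as the proportion 0.15, as the markscheme does, and uses illustrative variable names.

```python
from math import comb
from statistics import NormalDist

# Weight in kg, modelled as N(6.1, 0.5^2).
weight = NormalDist(mu=6.1, sigma=0.5)

p_between = weight.cdf(6.5) - weight.cdf(5.5)     # part (b)
expected_light = 80 * weight.cdf(5.3)             # part (c)

# Part (d): 12 of 80 cats are heavier than x, so P(X < x) = 1 - 12/80 = 0.85.
x_cutoff = weight.inv_cdf(1 - 12 / 80)

# Part (e): number of cats over 6.25 kg among ten is B(10, p_heavy).
p_heavy = 1 - weight.cdf(6.25)
p_exactly_one = comb(10, 1) * p_heavy * (1 - p_heavy) ** 9

print(round(p_between, 3), round(expected_light, 2))
print(round(x_cutoff, 2), round(p_exactly_one, 4))
```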

65. [Maximum mark: 19]


A medical centre is testing patients for a certain disease. This disease occurs in 5% of the population.

They test every patient who comes to the centre on a particular day.

(a) State the sampling method being used. [1]

Markscheme

convenience sampling (A1)

[1 mark]

It is intended that if a patient has the disease, they test “positive”, and if a patient does not have the disease, they test “negative”.

However, the tests are not perfect, and only 99% of people who have the disease test positive. Also, 2% of people who do not
have the disease test positive.

The tree diagram shows some of this information.


Write down the value of

(b.i) a . [1]

Markscheme

95% A1

[1 mark]

(b.ii) b. [1]

Markscheme

1% A1

[1 mark]

(b.iii) c . [1]

Markscheme

2% A1

[1 mark]

(b.iv) d . [1]

Markscheme

98% A1

[1 mark]

Use the tree diagram to find the probability that a patient selected at random

(c.i) will not have the disease and will test positive. [2]

Markscheme

0. 95 × 0. 02 (M1)
0. 019 A1

[2 marks]

(c.ii) will test negative. [3]

Markscheme

0. 05 × 0. 01 + 0. 95 × 0. 98 (M1)(M1)

Note: Award M1 for summing two products and M1 for correct products seen.

0. 932 (0. 9315) A1

[3 marks]

(c.iii) has the disease given that they tested negative. [3]

Markscheme

recognition of conditional probability (M1)

(0.05 × 0.01) / (0.05 × 0.01 + 0.95 × 0.98) A1

0. 000537 (0. 000536768 …) A1

Note: Accept 0. 000536 if 0. 932 used.

[3 marks]
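
The tree-diagram calculations in part (c) can be written out step by step. The sketch below uses only the percentages given in the question (5% prevalence, 99% of people with the disease test positive, 2% of people without it test positive); the variable names are illustrative.

```python
p_disease = 0.05
p_pos_given_disease = 0.99
p_pos_given_healthy = 0.02

# Complementary branches of the tree (values a, b and d in part (b)).
p_healthy = 1 - p_disease
p_neg_given_disease = 1 - p_pos_given_disease
p_neg_given_healthy = 1 - p_pos_given_healthy

# Part (c)(i): no disease and a positive test.
p_healthy_and_pos = p_healthy * p_pos_given_healthy

# Part (c)(ii): sum over both routes to a negative test.
p_neg = p_disease * p_neg_given_disease + p_healthy * p_neg_given_healthy

# Part (c)(iii): conditional probability P(disease | negative).
p_disease_given_neg = (p_disease * p_neg_given_disease) / p_neg

print(round(p_healthy_and_pos, 3), round(p_neg, 4), round(p_disease_given_neg, 6))
```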

(d) The medical centre finds the actual number of positive results in their sample is different than predicted by the
tree diagram. Explain why this might be the case. [1]

Markscheme

EITHER
sample may not be representative of population A1

OR
sample is not randomly selected A1

OR
unrealistic to think expected and observed values will be exactly equal A1

[1 mark]
The staff at the medical centre looked at the care received by all visiting patients on a randomly chosen day. All the patients
received at least one of these services: they had medical tests (M ), were seen by a nurse (N ), or were seen by a doctor (D). It was
found that:

78 had medical tests;

45 were seen by a nurse;

30 were seen by a doctor;

9 had medical tests and were seen by a doctor and a nurse;

18 had medical tests and were seen by a doctor but were not seen by a nurse;

11 patients were seen by a nurse and had medical tests but were not seen by a doctor;

2 patients were seen by a doctor without being seen by a nurse and without having medical tests.

(e) Draw a Venn diagram to illustrate this information, placing all relevant information on the diagram. [3]

Markscheme

A1A1A1

Note: Award A1 for rectangle and 3 labelled circles and 9 in centre region; A1 for 2, 40, 24 ; A1 for 18, 1, and 11.

[3 marks]

(f ) Find the total number of patients who visited the centre during this day. [2]

Markscheme

18 + 9 + 1 + 11 + 2 + 40 + 24 (M1)

105 A1

Note: Follow through from the entries on their Venn diagram in part (e). Working required for FT.

[2 marks]
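
The region values 1, 40 and 24 in the Venn diagram are not given directly in the question; they follow by subtraction from the totals. A short sketch of that bookkeeping is shown below, with illustrative variable names for the seven regions.

```python
# Totals and overlaps stated in the question.
m_total, n_total, d_total = 78, 45, 30
all_three = 9
m_and_d_only = 18     # tests and doctor, no nurse
m_and_n_only = 11     # tests and nurse, no doctor
d_only = 2            # doctor only

# Remaining regions, each found by subtracting the known parts of a circle.
n_and_d_only = d_total - all_three - m_and_d_only - d_only
m_only = m_total - all_three - m_and_d_only - m_and_n_only
n_only = n_total - all_three - m_and_n_only - n_and_d_only

# Every patient received at least one service, so nothing lies outside the circles.
total_patients = (m_only + n_only + d_only + m_and_d_only
                  + m_and_n_only + n_and_d_only + all_three)

print(n_and_d_only, m_only, n_only, total_patients)
```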

© International Baccalaureate Organization, 2025
