Topic4 Revision Session Worksheet Markscheme
1. [Maximum mark: 6]
The formula F = 1.8C + 32 is used to convert a temperature in degrees Celsius, C, to degrees Fahrenheit, F.
(a.i) Find a formula for converting a temperature in degrees Fahrenheit to degrees Celsius. [2]
Markscheme
C = 5/9 (F − 32)  (C = (F − 32)/1.8, C = 0.556F − 17.8) A1
[2 marks]
(a.ii) Find the temperature in degrees Celsius that is recorded as 77 degrees Fahrenheit. [1]
Markscheme
C = ((77 − 32)/1.8 =) 25 (°C) A1
[1 mark]
Over one year, the mean daily temperature in Mexico City was calculated to be 17 degrees Celsius with a standard deviation of 9
degrees Celsius.
Markscheme
(1.8 × 17 + 32 =) 62.6 (°F) A1
[1 mark]
(b.ii) the standard deviation of the daily temperature in Mexico City. [2]
Markscheme
(1.8 × 9 =) 16.2 (°F) A1
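The two answers in part (b) differ because a linear conversion acts differently on the mean and on the standard deviation: the mean goes through the full formula, while the standard deviation is only scaled by 1.8, since adding 32 shifts every value without changing the spread. A minimal Python sketch of this, with illustrative function names that are not part of the markscheme:

```python
def celsius_to_fahrenheit(c):
    """Apply F = 1.8C + 32."""
    return 1.8 * c + 32

def fahrenheit_to_celsius(f):
    """Inverse formula from part (a.i): C = (F - 32) / 1.8."""
    return (f - 32) / 1.8

celsius_for_77 = fahrenheit_to_celsius(77)   # part (a.ii)
mean_f = celsius_to_fahrenheit(17)           # mean: use the full formula
sd_f = 1.8 * 9                               # standard deviation: scale only, no +32

print(round(celsius_for_77, 1), round(mean_f, 1), round(sd_f, 1))
```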
2. [Maximum mark: 7]
The following data show the heights, in metres, of six players in a basketball team.
Markscheme
1.96 (m) A2
Note: Award A1 for substitution into the formula for the mean,
e.g. (1.67 + 1.60 + 1.68 + …)/6.
[2 marks]
Markscheme
[1 mark]
Markscheme
2.31 (m) A1
[1 mark]
Markscheme
2.31 − 1.60 (M1)
0.71 (m) A1
[2 marks]
A new player, Gheorghe, joins the team. Their height is measured as 1.98 metres to the nearest centimetre.
Markscheme
[1 mark]
3. [Maximum mark: 6]
A teacher surveys their students to find out if they have eaten at the local Thai and Indian cafés. The results of the survey are shown
in the following Venn diagram.
Markscheme
33 A1
[1 mark]
(b) Write down the number of students who have not eaten at the Indian café. [1]
Markscheme
12 A1
[1 mark]
(c) Find the probability this student has eaten at both the Thai café and the Indian café. [1]
Markscheme
13/33 (0.394, 0.393939…, 39.4%) A1
[1 mark]
Markscheme
(P(T ∪ I) =) 31/33 (0.939, 0.939393…, 93.9%) A1
Note: For A1(ft) to be awarded, the numerator must be 31 and the denominator must be their answer to part (a).
[1 mark]
(e) State whether the events T and I are mutually exclusive. Justify your answer. [2]
Markscheme
P(T ∩ I ) ≠ 0 OR n(T ∩ I ) ≠ 0 R1
Accept an equivalent statement in words such as “some (13) students went to both cafes” or “students could go to both cafes”.
Condone P(T and I ) ≠ 0 OR n(T and I ) ≠ 0
[2 marks]
4. [Maximum mark: 8]
Zac raises funds for a library by running a game where players spin a needle. The final position of the needle results in an outcome
where a player wins or loses money. The outcomes, with associated probabilities, are shown in the following diagram.
Let X represent the amount that a player of this game wins.
(a.i) Find the expected value of X. [2]
Markscheme
($) −2.8 A1
[2 marks]
Markscheme
Do not accept:
[1 mark]
To encourage a person to keep playing this game, Zac increases the winning prize for the second game they play from $5 to $6.
For each successive game they play, the winning prize continues to increase by $1.
Markscheme
E(X) = 0 OR 2.80/0.40 (M1)
EITHER
E(X) for game 1 = −2.80, E(X) for game 2 = −2.40, etc.
OR
(k =) 2.80/0.40 + 1
k = 8 (games) A1
OR
(w =) ($) 12 (A1)
k = 8 (games) A1
[4 marks]
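The working above can be checked numerically. The sketch below assumes that each $1 increase in the prize raises E(X) by $0.40, which is inferred from the 0.40 step in the markscheme because the probability diagram is not reproduced here. It works in cents to avoid floating-point rounding and finds the first game whose expected value is no longer negative.

```python
# Work in cents to avoid floating-point rounding.
EV_GAME_1 = -280   # E(X) for game 1 from part (a.i): -$2.80
STEP = 40          # assumed rise in E(X) per game: 0.40 x $1 (inferred)

def expected_value(game):
    """Expected winnings in cents for the given game number (1-based)."""
    return EV_GAME_1 + STEP * (game - 1)

k = 1
while expected_value(k) < 0:
    k += 1

prize_in_game_k = 5 + (k - 1)   # prize starts at $5 and rises by $1 per game
print(k, prize_in_game_k)       # 8 12
```

Games 1 to 7 all have a negative expected value, which is the reason given in part (b.ii).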
(b.ii) Explain why Zac expects to raise money from the games Emily plays. [1]
Markscheme
E(X) < 0 for each (any) of the first 7 games (or equivalent) R1
[1 mark]
5. [Maximum mark: 6]
Jerry makes handcrafted chocolates. On average, 1 in 25 of the chocolates that Jerry makes is flawed. Whether or not a chocolate is
flawed is independent of all other chocolates.
Markscheme
[2 marks]
Markscheme
[2 marks]
Jerry sells the perfect chocolates for 50 pesos each and the flawed ones for 15 pesos each.
(b) Calculate the expected number of pesos Jerry makes from selling a batch of 20 randomly selected chocolates. [2]
Markscheme
= 972 (pesos) A1
[2 marks]
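A short check of the 972 pesos in part (b), using only the figures stated in the question (1 in 25 flawed, 50 and 15 pesos, batch of 20). The variable names are illustrative, and exact fractions are used so the result is not affected by rounding.

```python
from fractions import Fraction

p_flawed = Fraction(1, 25)
price_perfect, price_flawed = 50, 15
batch_size = 20

# Expected income from one chocolate, then scaled to the whole batch.
expected_per_chocolate = (1 - p_flawed) * price_perfect + p_flawed * price_flawed
expected_income = batch_size * expected_per_chocolate

print(expected_income)  # 972
```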
6. [Maximum mark: 7]
The prices, in dollars, of 10 different garden chairs are:
Markscheme
identifying the largest and smallest values: ($) 255, ($) 49 (M1)
($) 206 A1
[2 marks]
Markscheme
($) 137 (137.1) (M1)A1
[2 marks]
Markscheme
Note: The (M1) mark is for correct GDC use and hence can be awarded if either of the values is correct. An answer of
78.4976… in (b)(ii) is awarded A0 but is sufficient to credit the (M1).
[1 mark]
Markscheme
[1 mark]
Markscheme
Note: If their answer to part (c)(ii) is incorrect, it should match their answer to part (b)(ii) to be awarded A1(FT).
[1 mark]
7. [Maximum mark: 5]
Sunita sorts 300 peppers into sizes of small, medium or large. Some peppers are red, some are green, and some are yellow.
Sunita wants to test, at the 5% significance level, whether the size of the peppers is independent of the colour.
(a) State the null and alternative hypotheses for this test. [1]
Markscheme
Note: Award A1 for both hypotheses correct. Accept “not associated” in place of independent. Do not accept “correlated” or
“related” or “affected”.
[1 mark]
(b.i) Calculate χ²_calc. [2]
Markscheme
χ²_calc = 22.5 (22.5483…) A2
[2 marks]
(b.ii) State a conclusion to the test. Give a reason for your answer. [2]
Markscheme
Accept χ²_calc > χ²_crit or p < sig level provided their χ²_calc value or p-value is seen.
[2 marks]
8. [Maximum mark: 7]
Gustav plays a game in which he first tosses an unbiased coin and then rolls an unbiased six-sided die.
If the coin shows tails, the score on the die is Gustav’s final number of points.
If the coin shows heads, one is added to the score on the die for Gustav’s final number of points.
(a) Find the probability that Gustav’s final number of points is 7. [2]
Markscheme
recognizing that the only way to score 7 is to achieve a head and a 6 on the die (M1)
e.g. 1/6 and 1/2 seen in an attempt to combine probabilities
(1/6 × 1/2 =) 1/12 (0.0833333…) A1
[2 marks]
[3]
Markscheme
(2(1/6 × 1/2) =) 2/12 (1/6, 0.167, 0.16666…) A1
Note: Award these marks for equivalent working for the 2, 3, 4 or 6 point scenarios.
A1
Note: Award A1 for a completely correct table. Award at most (M1)A1A0 if their follow-through answer from part (a) leads to a
total probability not equal to 1.
[3 marks]
(c) Calculate the expected value of Gustav’s final number of points. [2]
Markscheme
EITHER
1 × 1/12 + 2 × 1/6 + … + 6 × 1/6 + 7 × 1/12
OR
THEN
(expected value =) 4 A1
Note: Accept 4.01 (4.00640…) from use of their 3 sf values from (b).
[2 marks]
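The probability table and the expected value can be rebuilt by enumerating the 12 equally likely coin and die outcomes described in the question. This is a sketch for checking the markscheme values; the variable names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction

# Tails adds 0 to the die score, heads adds 1; each (coin, die) pair has probability 1/12.
distribution = defaultdict(Fraction)
for coin_bonus in (0, 1):
    for die in range(1, 7):
        distribution[die + coin_bonus] += Fraction(1, 2) * Fraction(1, 6)

expected_points = sum(points * p for points, p in distribution.items())

for points in sorted(distribution):
    print(points, distribution[points])
print(expected_points)  # 4
```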
9. [Maximum mark: 7]
Billy is a keen walker who keeps a record of his performance. The following table shows the time, in minutes, it takes him to walk
one kilometre up hills with different gradients. The gradient of each hill is constant.
[2]
Markscheme
Note: Award A1 for correct values of a and b, A1 for an equation using these correct values.
[2 marks]
(a.ii) Describe the correlation between T and G with reference to the value of r, the Pearson’s product-moment
correlation coefficient. [2]
Markscheme
(r =) 0.994 (= 0.993910…) A1
[2 marks]
Markscheme
[2 marks]
This morning, Billy walked one kilometre up a hill, and it took 22 minutes.
(c) Explain why it would be inappropriate to use the equation found in part (a) to estimate the gradient of this hill. [1]
Markscheme
EITHER
using the T on G regression line cannot (always) reliably make a prediction for G R1
OR
OR
OR
[1 mark]
(a) Identify which two of the following statements must be true according to the box and whisker diagram. Indicate
your choices by placing tick marks in the second column of the following table.
[2]
Markscheme
A1A1
[2 marks]
At the end of the year, Mrs Whitehouse surveyed a random sample of students from each of her two large classes to determine how
satisfied they were with her teaching.
Each student independently selected a value from 1 to 10, with 1 meaning that they were not satisfied at all and 10 meaning that
they were very satisfied.
Mrs Whitehouse believes that there was no difference in the general satisfaction between the two classes. She assumes that the
data is drawn from a population that can be modelled by a normal distribution and proposes to conduct a t-test at the 5%
significance level.
(b) Write down the null and alternative hypotheses for her test. [2]
Markscheme
EITHER
H0 : μ1 = μ2 A1
H1 : μ1 ≠ μ2 A1
OR
H0 : μA = μB A1
H1 : μA ≠ μB A1
Note: Accept an equivalent statement in words, but must include reference to “population mean” / “mean for class A and class
B” for the A1 to be awarded.
[2 marks]
Markscheme
[2 marks]
(d) Write down the conclusion to the test. Give a reason for your answer. [2]
Markscheme
0.0952 > 0.05 R1
Note: Do not award R0A1. The answer to part (d) MUST follow through from their hypotheses seen in part (b) and their p-value
seen in part (c); if hypotheses are incorrect/reversed, etc., the answer to part (d) must reflect this in order for the A1 to be
credited.
[2 marks]
(a) Complete the tree diagram by writing in the three missing probabilities.
[2]
Markscheme
A1A1
Note: Award A1 for completing first set of branches, A1 for completing second set of branches.
[2 marks]
(b) Find the probability that Rita does not win a prize. [2]
Markscheme
2/3 × 1/2 = 1/3 (= 0.333…) A1
[2 marks]
(c) Given that Rita won a prize, find the probability that she got a five or six when she rolled the die. [3]
Markscheme
EITHER
(1/3) / (1/3 + (2/3 × 1/2)) M1A1
OR
(1/3) / (1 − 1/3) M1A1
Note: Award M1 for recognizing conditional probability, A1 for correct substitution.
THEN
= 1/2 A1
[3 marks]
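The conditional probability in part (c) can be traced through the tree step by step. The sketch below assumes the tree implied by the working above: a five or six (probability 1/3) wins a prize outright, and otherwise a second stage wins with probability 1/2. The tree diagram itself is not reproduced here, so these branch values are inferred.

```python
from fractions import Fraction

p_five_or_six = Fraction(1, 3)        # first branch (inferred): wins a prize directly
p_win_second_stage = Fraction(1, 2)   # second branch (inferred), after not rolling 5 or 6

p_no_prize = (1 - p_five_or_six) * (1 - p_win_second_stage)   # part (b)
p_prize = 1 - p_no_prize
p_five_or_six_given_prize = p_five_or_six / p_prize           # part (c)

print(p_no_prize, p_five_or_six_given_prize)  # 1/3 1/2
```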
(a) Find the probability that Nicole’s car starts on exactly three mornings in a particular 5 day workweek. [2]
Markscheme
[2 marks]
Nicole walks to work on mornings when her car does not start and it is not raining. Nicole takes the bus to work on mornings when
her car does not start and it is raining.
Where Nicole lives, there is a 42% probability of rain on any given morning, independent of any other morning. The probability of
Nicole’s car starting is independent of the weather.
(b) Find the probability that Nicole will not have to take the bus in a particular workweek. [4]
Markscheme
attempt to find the probability of taking a bus, (or not taking a bus); (M1)
EITHER
OR
(1 − 0.1176)^5 OR (0.8824)^5 seen (A1)
THEN
[4 marks]
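A sketch of the calculation behind part (b). The value 0.1176 is taken from the markscheme; dividing it by the 42% chance of rain suggests the car fails to start with probability 0.28, which is an inference because the stem giving that figure is missing from this extract.

```python
p_rain = 0.42
p_bus_one_morning = 0.1176                  # value seen in the markscheme
p_car_fails = p_bus_one_morning / p_rain    # inferred, about 0.28

# The five mornings are independent, so multiply the daily "no bus" probabilities.
p_no_bus_all_week = (1 - p_bus_one_morning) ** 5

print(round(p_car_fails, 2), round(p_no_bus_all_week, 3))
```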
13. [Maximum mark: 6]
Thurston believes that more popular musical artists sell more albums.
He begins to investigate this belief by randomly selecting eight musical artists and collecting data on the number of followers each
of the artists has on a particular social media platform. He then collects data on the number of albums each artist sold in the first
week after releasing an album. His data is shown in Table 1.
Table 2
[1]
Markscheme
A1
[1 mark]
Markscheme
Thurston believes that artists with a higher number of social media followers sell more albums in the first week. He carries out a
hypothesis test using a 10% significance level with the following null hypothesis:
H0 : In the population, there is no monotonic relationship between the number of social media followers and the number of
albums sold in the first week.
Markscheme
(H1 :) In the population, there is a positive monotonic relationship between the number of social media followers and the
[1 mark]
(d) State the conclusion of the hypothesis test, giving a reason. [2]
Markscheme
[2 marks]
Markscheme
Note: Award A1 for correct values of a and b, A1 for an equation using these correct values.
[2 marks]
(a.ii) Describe the correlation between T and G with reference to the value of r, the Pearson’s product-moment
correlation coefficient. [2]
Markscheme
(r =) 0.996 (= 0.996247…) A1
[2 marks]
(b) Estimate the time it will take Joel to ride one kilometre up the hill. [2]
Markscheme
[2 marks]
This morning, Joel rode one kilometre up a hill, and it took 22 minutes.
(c) Explain why it would be inappropriate to use the equation found in part (a) to estimate the gradient of this hill. [1]
Markscheme
EITHER
using the T on G regression line cannot (always) reliably make a prediction for G R1
OR
OR
OR
[1 mark]
The table shows results for these two events at the World Championships.
                        Event                     Rank
Athlete's       Long Jump   High Jump   Long Jump   High Jump
Country         (m)         (m)         Rank        Rank
Germany         7.64        2.11        1
France          7.52        2.08        2
Estonia         7.49        1.84        3
Canada          7.44        2.02        4
Netherlands     7.33        2.05        5
Ukraine         7.28        2.02        6
Algeria         7.22        1.90        7
Austria         7.11        1.87        8
Grenada         6.98        1.99        9
Japan           6.64        1.96        10
The Spearman’s rank correlation coefficient is used to determine if there is a linear correlation between an athlete’s ranking in long
jump and their ranking in high jump.
(a) Complete the table to show the athletes’ rankings in high jump. [2]
Markscheme
                        Event                     Rank
Athlete's       Long Jump   High Jump   Long Jump   High Jump
Country         (m)         (m)         Rank        Rank
Germany         7.64        2.11        1           1
France          7.52        2.08        2           2
Estonia         7.49        1.84        3           10
Canada          7.44        2.02        4           4.5
Netherlands     7.33        2.05        5           3
Ukraine         7.28        2.02        6           4.5
Algeria         7.22        1.90        7           8
Austria         7.11        1.87        8           9
Grenada         6.98        1.99        9           6
Japan           6.64        1.96        10          7
A1A1
Note: Award A1 for ranking of tied heights, A1 for correct ranking of non-tied heights.
[2 marks]
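The tied heights (2.02 m for Canada and Ukraine) share the mean of the positions they occupy, which is why both receive rank 4.5. A small sketch of that ranking rule, applied to the high jump column above; `average_ranks` is an illustrative helper, not a library function.

```python
def average_ranks(values, largest_first=True):
    """Rank values (1 = best), giving tied values the mean of their positions."""
    ordered = sorted(values, reverse=largest_first)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1             # first position held by this value
        count = ordered.count(v)                 # how many entries share this value
        ranks.append(first + (count - 1) / 2)    # mean of positions first..first+count-1
    return ranks

high_jump = [2.11, 2.08, 1.84, 2.02, 2.05, 2.02, 1.90, 1.87, 1.99, 1.96]
high_jump_ranks = average_ranks(high_jump)
print(high_jump_ranks)
```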
Markscheme
[2 marks]
The following guide is used by the coach to determine the strength of the correlation between the ranks for long jump and high
jump.
|r_s|             Strength
0.000 to 0.199    Very weak
0.200 to 0.399    Weak
0.400 to 0.599    Moderate
0.600 to 0.799    Strong
0.800 to 1.000    Very strong
(c) State the strength of the correlation between the rankings as indicated by the table and interpret this in the
context of the question. [2]
Markscheme
moderate (correlation) A1
as long jump ranking increases, high jump ranking will (likely) increase A1
[2 marks]
monolingual people (μ_m). Carys gave a memory retention test to a random sample of students in her class. The results are shown in
Carys performs a one-tailed t-test at a 5% level of significance. It is assumed that the scores are normally distributed and the
samples have equal variances.
H0 : μ_b = μ_m A1
H1 : μ_b > μ_m A1
Note: Accept equivalent statements in words such as “the mean score of bilingual people equals the mean score of
monolingual people”.
[2 marks]
Markscheme
[2 marks]
(c) State the conclusion of the test in the context of the question. Justify your answer. [2]
Markscheme
(fail to reject H0) there is insufficient evidence to suggest that bilingual people have better memory retention than
monolingual people A1
The answer to part (c) MUST be consistent with their hypotheses and their p-value.
[2 marks]
A speed of 75.7 km h⁻¹ is two standard deviations from the mean.
(a) Find the standard deviation for the speed of the cars. [2]
Markscheme
4.2 (km h⁻¹) A1
[2 marks]
Speeding tickets are issued to all drivers travelling at a speed greater than 72 km h⁻¹.
(b) Find the probability that a randomly selected driver who passes the speed camera receives a speeding ticket. [2]
Markscheme
e.g., sketch of normal distribution curve with 72 labelled to the right of the mean OR Normal CDF calculation using 72
[2 marks]
(c) Show that the region of the normal distribution between p and q is not symmetrical about the mean. [3]
Markscheme
P(67.3 < speed < 74) OR Normal CDF(67.3, 74, 67.3, 4.2) OR sketch of normal distribution with 67.3 and 74 labelled
and shaded between (M1)
attempt to calculate probability that speed < p and speed > q with q = 74 (M1)
P(speed < 74) = 0.944670…
EITHER
OR
P(speed < 74) = 0.945 (0.944670…) A1
P(60.6 < speed < 74) OR Normal CDF(60.6, 74, 67.3, 4.2) OR
P(67.3 < speed < 74) OR Normal CDF(67.3, 74, 67.3, 4.2)
EITHER
OR
[3 marks]
18. [Maximum mark: 7]
In a school, 200 students solved a problem in a mathematics competition. Their times to solve the problem were recorded and the
following cumulative frequency graph was produced.
Markscheme
38 (s) A1
[1 mark]
Markscheme
32 (s) A1
[1 mark]
(a.iii) the upper quartile; [1]
Markscheme
42 (s) A1
[1 mark]
Markscheme
10 (s) A1
[1 mark]
Markscheme
1.5 × IQR (M1)
(32 − 1.5 × 10 =) 17 (s) A1
Note: Do not award the R1 unless an explicit comparison of 14 and their 17 is seen.
e.g. 14 < 17
[3 marks]
Age (years)       13    17    22    18    19    25    11    36
Time (seconds)    13.4  14.6  13.4  12.9  12.0  11.8  17.0  13.1
Sung-Jin decides to calculate the Spearman’s rank correlation coefficient for his set of data.
(a) Complete the table of ranks.
Athlete      A    B    C    D    E    F    G    H
Age rank               3
Time rank                                  1
[2]
Markscheme
Athlete      A    B    C    D    E    F    G    H
Age rank     7    6    3    5    4    2    8    1
Time rank    3.5  2    3.5  6    7    8    1    5
A1A1
[2 marks]
Markscheme
Note: Only follow through from an incorrect table provided the ranks are all between 1 and 8.
Award A1 for −0.67 OR for the omission of the negative sign, e.g. 0.671 (0.670670…) or 0.67
[2 marks]
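Spearman's rank correlation coefficient is Pearson's coefficient applied to the ranks, which also handles the tied ranks of 3.5 correctly. The sketch below applies that to the completed table of ranks from part (a); `pearson` is an illustrative helper written out in full rather than a library call.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

age_rank = [7, 6, 3, 5, 4, 2, 8, 1]
time_rank = [3.5, 2, 3.5, 6, 7, 8, 1, 5]

r_s = pearson(age_rank, time_rank)
print(round(r_s, 3))  # -0.671
```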
Markscheme
(A value of r_s = −0.671) indicates a negative correlation between a person’s age and the best time they take to run 100 m.
R1
Note: Condone any comment that includes “weak” or “strong” etc. Accept an interpretation in words, but only if there is a
general link described and not a rule: “The older a person gets, the faster they tend to run”. Answer must be in context.
[1 mark]
(d) Suggest a mathematical reason why Sung-Jin may have decided not to use Pearson’s product-moment
correlation coefficient with his data from the original table. [1]
Markscheme
The correlation, such that it is, is unlikely to be linear for this type of data.
Sung-Jin is not sure the data is drawn from a bivariate normal distribution
[1 mark]
Grade 1 2 3 4 5 6 7
Frequency 1 4 7 9 p 9 4
Markscheme
34 + p A1
[1 mark]
Markscheme
(p =) 10 A1
[3 marks]
Their quality assurance team randomly selects 500 items of food to inspect. The quality of this food is classified as perfect,
satisfactory, or poor. The data is summarized in the following table.
(a) Find the probability that its quality is not perfect, given that it is from breakfast. [2]
Markscheme
131/232 (56.4655…%) A1A1
[2 marks]
A χ² test at the 5% significance level is carried out to determine if there is significant evidence of a difference in the quality of the
Markscheme
[2 marks]
(c) State, with justification, the conclusion for this test. [2]
Markscheme
EITHER
OR
THEN
EITHER
OR
(there is significant evidence that) the (food) quality and the type of meal are not independent A1
Award R1 for χ²_calc > χ²_crit, provided the calculated value is explicitly seen in part (b).
Accept “p-value < significance level” provided their p-value is seen and their p-value is between 0 and 1.
[2 marks]
(a) Calculate the probability that the length of the seed is less than 3. 7 cm. [2]
Markscheme
X ~ N(4, 0.25²)
EITHER
P(X < 3.7)
Note: Accept a weak or strict inequality, and any label instead of X, e.g. length or L.
OR
normal curve with vertical line, left of mean, labelled 3.7, and shaded region (M1)
THEN
[2 marks]
It is known that 30% of the seeds have a length greater than k cm.
Markscheme
EITHER
Note: Accept a weak or strict inequality, and any label instead of X e.g., length or L.
OR
normal curve with vertical line to the right of the mean and shaded region, correctly labelled either 0.3 or 0.7 (M1)
THEN
(k =) 4.13 (4.13110…) A1
[2 marks]
Markscheme
EITHER
OR
normal curve with vertical lines symmetrical about the mean line with a correct indication of an area of 0.6 or 0.2 or 0.8
(M1)
THEN
[2 marks]
x           0     1     2     3     4     5
P(X = x)    0.15  0.2   k     0.16  2k    0.25
Markscheme
0.15 + 0.2 + k + 0.16 + 2k + 0.25 = 1 (M1)
k = 0.08 A1
[2 marks]
The player has a chance to win money based on how many times they hit the target.
The gain for the player, in $, is shown in the following table, where a negative gain means that the player loses money.
x 0 1 2 3 4 5
(b) Determine whether this game is fair. Justify your answer. [3]
Markscheme
= −0.12 A1
[3 marks]
Markscheme
0.2 = 0.8(0.2 + x) (A1)
x = 0.05 A1
[3 marks]
Markscheme
x + 0.2 + 0.6 + y = 1 (M1)
y = 0.15 A1
[2 marks]
Markscheme
METHOD 1
P(R′|S′) = P(R′∩S′)/P(S′) (M1)
= 0.15/0.2 = 0.75 A1
METHOD 2
P(R′|S′) = P(R′) (M1)
= 1 − 0.25 = 0.75 A1
[2 marks]
25. [Maximum mark: 5]
Roy is a member of a motorsport club and regularly drives around the Port Campbell racetrack.
The times he takes to complete a lap are normally distributed with mean 59 seconds and standard deviation 3 seconds.
(a) Find the probability that Roy completes a lap in less than 55 seconds. [2]
Markscheme
A1
Note: Award M1 for a correct calculator notation such as normal cdf (0, 55, 59, 3) or normal cdf (−1E99, 55, 59, 3).
[2 marks]
Roy will complete a 20 lap race. It is expected that 8.6 of the laps will take more than t seconds.
(b) Find the value of t. [3]
Markscheme
8.6 = 20 × p OR (p =) 0.43 seen (M1)
EITHER
OR
THEN
[3 marks]
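The two parts of the lap-time question can be reproduced with the standard library's `statistics.NormalDist`, which plays the role of the calculator's normal cdf and inverse normal functions. This is a sketch using the stated mean of 59 s and standard deviation of 3 s; the numerical answers are missing from this extract, so none are quoted here.

```python
from statistics import NormalDist

lap_time = NormalDist(mu=59, sigma=3)

# Part (a): P(lap time < 55).
p_under_55 = lap_time.cdf(55)

# Part (b): 8.6 of 20 laps are expected to exceed t, so P(lap time > t) = 8.6 / 20.
p_slower_than_t = 8.6 / 20
t = lap_time.inv_cdf(1 - p_slower_than_t)

print(round(p_under_55, 4), round(t, 1))
```

Note that `inv_cdf` takes the area to the left of t, so the upper-tail probability 0.43 has to be converted to 0.57 first.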
(a) Taizo plays two games that are independent of each other. Find the probability that Taizo knocks over a total of
two bottles. [4]
Markscheme
0.5 × 0.1 + 0.4 × 0.4 + 0.1 × 0.5 (M1)(M1)(M1)
0.26 A1
[4 marks]
In any given game, Taizo will win k points if he knocks over two bottles, win 4 points if he knocks over one bottle and lose 8 points
if no bottles are knocked over.
(b) Find the value of k such that the game is fair. [3]
Markscheme
0 = −8 × 0.5 + 4 × 0.4 + 0.1k (M1)(M1)
Note: Award M1 for correct substitution into the formula for expected value, award M1 for the expected value formula equated
to zero.
(k =) 24 (points) A1
[3 marks]
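Both parts can be checked from the single-game distribution implied by the working above (0, 1 or 2 bottles with probabilities 0.5, 0.4 and 0.1). The sketch below enumerates the pairs of games that total two bottles and then solves the fair-game equation for k; the dictionary name is illustrative.

```python
from fractions import Fraction

# Probability of knocking over 0, 1 or 2 bottles in one game (from the markscheme working).
bottles = {0: Fraction(5, 10), 1: Fraction(4, 10), 2: Fraction(1, 10)}

# Part (a): two independent games, total of two bottles.
p_total_two = sum(
    bottles[first] * bottles[second]
    for first in bottles
    for second in bottles
    if first + second == 2
)

# Part (b): fair game means -8*P(0) + 4*P(1) + k*P(2) = 0, solved for k.
k = (8 * bottles[0] - 4 * bottles[1]) / bottles[2]

print(float(p_total_two), k)  # 0.26 24
```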
Markscheme
[1 mark]
Markscheme
χ² = 2.27 (2.26821…) A2
[2 marks]
(c) Write down Sergio’s conclusion to the test in context. Justify your answer. [2]
Markscheme
EITHER
OR
THEN
Insufficient evidence (at the 10% significance level) that the favourite berry depends on income level. A1
Note: Do not award R0A1. Accept “χ²” in place of their “2.27”, provided an answer was seen in part (b). Their conclusion must
[2 marks]
Annabelle uses these scores to conduct a two-tailed t-test to compare the means of the two classes, at the 5% level of significance.
It is assumed the examination scores for both classes have the same variance and are normally distributed.
The null hypothesis is μ1 = μ2, where μ1 is the mean examination score from Manny’s class and μ2 is the mean examination score
Markscheme
(H1 :) μ1 ≠ μ2 A1
Note: Accept an equivalent statement in words referring to μ1 and μ2 as defined in the question.
[1 mark]
(b) Find the p-value for this test. Give your answer correct to five decimal places. [2]
Markscheme
[2 marks]
(c) State whether Annabelle’s conclusion is correct. Give a reason for your answer. [2]
Markscheme
Note: Do not award R0A1. Answer must reference Annabelle’s conclusion; do not accept an answer, without context, of “fail to
reject H0” for the A1 mark.
[2 marks]
The number of trees to be planted in each of the first three months are shown in the following table.
(a) Find the number of trees to be planted in the 15th month. [3]
Markscheme
use of the nth term of an arithmetic sequence formula (M1)
u15 = 85 + (15 − 1) × 30 (A1)
505 A1
[3 marks]
(b) Find the total number of trees to be planted in the first 15 months. [2]
Markscheme
S15 = 15/2 (85 + 505) OR 15/2 (2 × 85 + (15 − 1) × 30)
4430 (4425) A1
[2 marks]
(c) Find the mean number of trees planted per month during the first 15 months. [2]
Markscheme
4425/15 OR 85 + (8 − 1) × 30 (M1)
295 A1
Note: Accept 295.333… from use of 3sf value from part (b).
[2 marks]
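The three answers follow from the arithmetic sequence with first term 85 and common difference 30 used in the working above. The sketch below evaluates the term and sum formulas and cross-checks the sum by adding the terms directly; variable names are illustrative.

```python
first_term, common_difference, n = 85, 30, 15

# Part (a): nth term, u_n = u_1 + (n - 1)d.
u_n = first_term + (n - 1) * common_difference

# Part (b): sum of the first n terms, S_n = n/2 (u_1 + u_n).
s_n = n * (first_term + u_n) // 2

# Part (c): mean per month; for an arithmetic sequence this equals the middle (8th) term.
mean_per_month = s_n / n

# Cross-check the formula against a direct sum of the 15 terms.
direct_sum = sum(first_term + i * common_difference for i in range(n))

print(u_n, s_n, mean_per_month)  # 505 4425 295.0
```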
(a) Write down the percentage of bags that weigh more than 500 g. [1]
Markscheme
50% A1
2
.
[1 mark]
A bag that weighs less than 495 g is rejected by the factory for being underweight.
(b) Find the probability that a randomly chosen bag is rejected for being underweight. [2]
Markscheme
(c) A bag that weighs more than k grams is rejected by the factory for being overweight. The factory rejects 2% of
bags for being overweight.
Markscheme
[3 marks]
Markscheme
6
OR probabilities are equal
6
OR probabilities are not equal A1
[1 mark]
Markscheme
5 A1
[1 mark]
(c) Write down the expected frequency of rolling a 1. [1]
Markscheme
10 A1
[1 mark]
Markscheme
[2 marks]
(e) State the conclusion of the test. Give a reason for your answer. [2]
Markscheme
0.287 > 0.05 R1
EITHER
OR
Note: Do not award R0A1. Condone “accept the null hypothesis” or “the die is fair”. Their conclusion must be consistent with their
p-value and their hypothesis.
[2 marks]
Markscheme
A1
[1 mark]
(b) Find the probability that Karl takes two socks of the same colour. [2]
Markscheme
3/7 × 2/6 + 4/7 × 3/6
= 18/42 (= 3/7 ≈ 0.429 (42.9%)) A1
[2 marks]
(c) Given that Karl has two socks of the same colour find the probability that he has two brown socks. [3]
Markscheme
(
3
6
)
A1
7
= 6/18 (= 1/3) (252/756, 0.333, 33.3%) A1
[3 marks]
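Parts (b) and (c) can be verified by listing every ordered pair of socks. The sketch assumes 3 brown socks and 4 socks of one other colour, which is inferred from the fractions 3/7 × 2/6 and 4/7 × 3/6 in the working; "other" is a placeholder because the second colour is not named in this extract.

```python
from fractions import Fraction
from itertools import permutations

socks = ["brown"] * 3 + ["other"] * 4   # "other" is a placeholder colour name

# All ordered ways to take two different socks without replacement.
draws = list(permutations(range(len(socks)), 2))
same_colour = [d for d in draws if socks[d[0]] == socks[d[1]]]
both_brown = [d for d in same_colour if socks[d[0]] == "brown"]

p_same = Fraction(len(same_colour), len(draws))                   # part (b)
p_brown_given_same = Fraction(len(both_brown), len(same_colour))  # part (c)

print(p_same, p_brown_given_same)  # 3/7 1/3
```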
To gather data, each driver was put in a car simulator and asked to either talk on a mobile phone or talk to a passenger. Each driver
was instructed to apply the brakes as soon as they saw a red light appear in front of the car. The reaction times of the drivers, in
seconds, were recorded, as shown in the following table.
At the 10% level of significance, a t-test was used to compare the mean reaction times of the two groups. Each data set is assumed
to be normally distributed, and the population variances are assumed to be the same.
Let μ1 and μ2 be the population means for the two groups. The null hypothesis for this test is H0 : μ1 − μ2 = 0.
Markscheme
(H1 :) μ1 − μ2 ≠ 0 (μ1 ≠ μ2) A1
Note: Accept an equivalent statement in words, however reference to “population mean” must be explicit for A1 to be
awarded.
[1 mark]
Markscheme
Note: Award A1 for an answer of 0.0815486… from not using a pooled estimate of the variance.
[2 marks]
(c.i) State the conclusion of the test. Justify your answer. [2]
Markscheme
0.0778 < 0.1 R1
[2 marks]
Markscheme
there is (significant evidence of ) a difference between the (population) mean reaction times A1
Note: Their conclusion in (c)(ii) must match their conclusion in (c)(i) to earn A1. Award A0 if their conclusion refers to mean
reaction times in the sample.
[1 mark]
25 33 51 62 63 63 70 74 79 79 81 88 90 90 98
For these data, the lower quartile is 62 and the upper quartile is 88.
(a) Show that the test score of 25 would not be considered an outlier. [3]
Markscheme
62 − 39
23 A1
25 > 23 R1
so is not an outlier AG
[3 marks]
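The outlier check in part (a) uses the lower fence Q1 − 1.5 × IQR. The sketch below recomputes the quartiles from the listed scores with `statistics.quantiles`, whose default "exclusive" method happens to reproduce the stated quartiles of 62 and 88 for this data set, and then applies the fence.

```python
from statistics import quantiles

scores = [25, 33, 51, 62, 63, 63, 70, 74, 79, 79, 81, 88, 90, 90, 98]

q1, median, q3 = quantiles(scores, n=4)   # default method is "exclusive"
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr              # values below this count as outliers

is_outlier = min(scores) < lower_fence
print(q1, q3, lower_fence, is_outlier)    # 62.0 88.0 23.0 False
```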
The box and whisker diagram showing these scores is given below.
Test scores
Another mathematics class is run by the college during the evening. A box and whisker diagram showing the scores from this class
for the same test is given below.
Test scores
A researcher reviews the box and whisker diagrams and believes that the evening class performed better than the morning class.
(b) With reference to the box and whisker diagrams, state one aspect that may support the researcher’s opinion and
one aspect that may counter it. [2]
Markscheme
The median score for the evening class is higher than the median score for the morning class. A1
THEN
but the scores are more spread out in the evening class than in the morning class A1
OR
OR
OR
OR
[2 marks]
(a) Find the probability that a randomly chosen applicant from this group was accepted by the university. [1]
Markscheme
((17 + 25)/130 =) 42/130 (21/65, 0.323076…) A1
[1 mark]
An applicant is chosen at random from this group. It is found that they were accepted into the programme of their choice.
(b) Find the probability that the applicant applied for the Arts programme. [2]
Markscheme
(17/(17 + 25) =) 17/42 (0.404761…) A1A1
[2 marks]
(c) Two different applicants are chosen at random from the original group.
Find the probability that both applicants applied to the Arts programme. [3]
Markscheme
41/130 × 40/129 A1M1
Note: Award A1 for two correct fractions seen, M1 for multiplying their fractions.
= 1640/16770 ≈ 0.0978 (0.0977936…, 164/1677) A1
[3 marks]
(a) Calculate the expected number of people who will pass this polygraph test. [2]
Markscheme
(E(X) =) 10 × 0.8 (M1)
8 (people) A1
[2 marks]
(b) Calculate the probability that exactly 4 people will fail this polygraph test. [2]
Markscheme
[2 marks]
(c) Determine the probability that fewer than 7 people will pass this polygraph test. [3]
Markscheme
[3 marks]
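The three parts are binomial calculations with n = 10 and a pass probability of 0.8, as used in the working for part (a). The sketch below writes the probability mass function out with `math.comb`; the markscheme answers for (b) and (c) are missing from this extract, so the code is offered as a way to reproduce them rather than as the official values.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p_pass = 10, 0.8

expected_passes = n * p_pass                       # part (a)
p_exactly_4_fail = binomial_pmf(6, n, p_pass)      # part (b): 4 fail means 6 pass
p_fewer_than_7_pass = sum(binomial_pmf(k, n, p_pass) for k in range(7))  # part (c): P(X <= 6)

print(expected_passes, round(p_exactly_4_fail, 4), round(p_fewer_than_7_pass, 3))
```

"Fewer than 7" excludes 7 itself, so the sum runs over 0 to 6 passes.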
37. [Maximum mark: 5]
The masses of Fuji apples are normally distributed with a mean of 163 g and a standard deviation of 6.83 g.
When Fuji apples are picked, they are classified as small, medium, large or extra large depending on their mass. Large apples have a
mass of between 172 g and 183 g.
(a) Determine the probability that a Fuji apple selected at random will be a large apple. [2]
Markscheme
sketch of normal curve with shaded region to the right of the mean and correct values (M1)
[2 marks]
Approximately 68% of Fuji apples have a mass within the medium-sized category, which is between k and 172 g.
Markscheme
EITHER
0.906200… (A1)
0.226200… (A1)
OR
0.406200… (A1)
OR
(A1)(A1)
Note: Award A1 for a normal distribution curve with a vertical line on each side of the mean and a correct probability of either
0.406 or 0.274 or 0.906 shown, A1 for a probability of 0.226 seen.
THEN
[3 marks]
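The intermediate values 0.906200… and 0.226200… in the working above come from P(X < 172) and from subtracting the 68% medium band from it. The sketch below follows that route with `statistics.NormalDist` and then inverts the distribution to find k; the final answers are not present in this extract, so none are quoted.

```python
from statistics import NormalDist

mass = NormalDist(mu=163, sigma=6.83)

# Part (a): large apples lie between 172 g and 183 g.
p_large = mass.cdf(183) - mass.cdf(172)

# Part (b): P(k < X < 172) = 0.68, so P(X < k) = P(X < 172) - 0.68.
p_below_172 = mass.cdf(172)          # 0.9062...
p_below_k = p_below_172 - 0.68       # 0.2262...
k = mass.inv_cdf(p_below_k)

print(round(p_below_172, 4), round(p_below_k, 4), round(k, 1))
```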
Markscheme
a = 0.42 A1
[1 mark]
(b) Find an expression, in terms of b, for the probability of a person not having blue eyes and having fair hair. [1]
Markscheme
(P(B′∩F) =) b × 0.68 A1
[1 mark]
(c.i) b. [2]
Markscheme
0.32 × 0.58 + 0.68b = 0.41 (M1)
b = 0.33 A1
[2 marks]
(c.ii) c. [1]
Markscheme
c = 0.67 A1
[1 mark]
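The values of a, b and c can be recovered from the tree by using complements and the total probability equation in part (c.i). The sketch reads the numbers from the working above: 0.32 as the probability of blue eyes, 0.58 as the probability of fair hair given blue eyes, and 0.41 as the overall probability of fair hair. That interpretation is inferred, because the tree diagram is not reproduced here.

```python
from fractions import Fraction

p_blue = Fraction(32, 100)              # inferred: P(B)
p_fair_given_blue = Fraction(58, 100)   # inferred: P(F | B)
p_fair = Fraction(41, 100)              # inferred: P(F)

a = 1 - p_fair_given_blue                                  # complement on the blue-eyes branch
b = (p_fair - p_blue * p_fair_given_blue) / (1 - p_blue)   # from 0.32*0.58 + 0.68b = 0.41
c = 1 - b                                                  # complement on the not-blue branch

print(float(a), float(b), float(c))  # 0.42 0.33 0.67
```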
To test this, he recorded the age, x years, and the time, t minutes, for eight males in a single 5000 m race. His results are presented
in the following table and scatter diagram.
(a) For this data, find the value of the Pearson’s product-moment correlation coefficient, r. [2]
Markscheme
[2 marks]
Eduardo looked in a sports science text book. He found that the following information about r was appropriate for athletic
performance.
(b) Comment on your answer to part (a), using the information that Eduardo found. [1]
Markscheme
strong A1
Note: Answer may include “positive”, however this is not necessary for the mark.
[1 mark]
(c) Write down the equation of the regression line of t on x, in the form t = ax + b. [1]
Markscheme
[1 mark]
Use the equation of the regression line to estimate the time he took to complete the 5000 m race. [2]
Markscheme
Note: Award (M1) for correct substitution into their regression line.
Note: Accept 37.1 and 37.4 from use of 2sf and/or 3sf values.
[2 marks]
Markscheme
75 A1
[1 mark]
The students were awarded a grade from 1 to 5, depending on the score obtained in the exam. The number of students receiving
each grade is shown in the following table.
Markscheme
a = 120 − 6 − 13 − 26 − b OR a = 75 − b A1
[2 marks]
Markscheme
(6×1 + 13×2 + 26×3 + (75−b)×4 + b×5)/120 = 3.65 (M1)(A1)
Note: Award (M1) for attempt to substitute into mean formula, LHS expression is sufficient for the M mark. Award (A1) for correct
substitutions in one variable OR in two variables, followed by evidence of solving simultaneously with a + b = 75.
(b =) 28 A1
[3 marks]
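The equation above is linear in b, so it can be solved directly once a is replaced by 75 − b. The sketch below keeps everything in integers (the mean 3.65 is written as 365/100) so that no rounding is involved; variable names are illustrative.

```python
total_students = 120
known_counts = {1: 6, 2: 13, 3: 26}          # grade -> number of students

remaining = total_students - sum(known_counts.values())   # a + b = 75
target_sum = 365 * total_students // 100                  # mean 3.65 -> total of grades 438
known_sum = sum(grade * n for grade, n in known_counts.items())

# 4a + 5b = target_sum - known_sum with a = remaining - b, so 4*remaining + b = target_sum - known_sum.
b = target_sum - known_sum - 4 * remaining
a = remaining - b

print(a, b)  # 47 28
```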
Markscheme
84 A1
[2 marks]
She recorded the weights of eggs, in grams, from a random selection of geese. The data is shown in the table.
In order to test her claim, Arriane performs a t-test at a 10% level of significance. It is assumed that the weights of eggs are normally
distributed and the samples have equal variances.
Markscheme
EITHER
H0 : The population mean weight of eggs from (her/the) black geese is equal to/the same as the population mean weight of
OR
H0 : The population mean weight of eggs from (her/the) black geese is not less than the population mean weight of eggs
from (her/the) white geese. A1
Note: Reference to the "population mean weight" must be explicit for the A1 to be awarded. The term “population” can be
implied by use of “all” or “on average” or “generally” when relating to the weight of eggs e.g. “the mean weight of eggs for all
(her/the) black geese”.
Award A0 if reference is made to the mean weights from the sample or the table.
Award A0 for a null hypothesis written in symbolic form.
[1 mark]
Markscheme
[2 marks]
(c) State whether the result of the test supports Arriane’s claim. Justify your reasoning. [2]
Markscheme
0.177 > 0.1 R1
Note: Accept p > 0.1 or p > significance level provided p is explicitly seen in part (b). Award A1 only if reference is specifically
made to Arriane's claim.
Do not award R0A1.
[2 marks]
[2]
Markscheme
A2
[2 marks]
Markscheme
$\frac{32}{36}$ ($\frac{8}{9}$, 0.888888…, 88.9%) (A1)
[1 mark]
Markscheme
$\frac{11}{32}$ (0.34375, 34.4%) A1
[2 marks]
Markscheme
$\frac{1 \times 1 + 3 \times 2 + 5 \times 3 + \dots + 11 \times 6}{36}$ (M1)
$= \frac{161}{36}$ ($4\frac{17}{36}$, 4.47, 4.47222…) A1
[2 marks]
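As an arithmetic check on the expected-value working above, here is a minimal sketch. It assumes the elided terms follow the visible pattern, so that outcome k (1 to 6) carries probability (2k − 1)/36; that pattern is inferred from the terms shown and is not stated in the markscheme.

```python
from fractions import Fraction

# Assumed pattern: outcome k has probability (2k - 1)/36, inferred from the
# visible terms 1x1, 3x2, 5x3, ..., 11x6 in the markscheme working.
weights = {k: Fraction(2 * k - 1, 36) for k in range(1, 7)}

# Sanity check: the assumed probabilities sum to 1.
total_probability = sum(weights.values())

# Expected value = sum of outcome x probability.
expected_value = sum(k * p for k, p in weights.items())

print(total_probability, expected_value, float(expected_value))
```

Exact fractions are used so the result can be compared directly with the fractional answer in the markscheme.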
Markscheme
14 A1
[1 mark]
Markscheme
$\frac{14 + 15 + \dots}{10}$ (M1)
= 13.1 A1
[2 marks]
Markscheme
2. 21 (2. 21133 …) A1
[1 mark]
To test the model, they record the number of copies sold each weekday during a particular week. This data is shown in the table.
A goodness of fit test at the 5% significance level is used on this data to determine whether the vendor’s model is suitable.
The critical value for the test is 9. 49 and the hypotheses are
(a) Find an estimate for how many copies the vendor expects to sell each day. [1]
Markscheme
($\frac{74 + 97 + 91 + 86 + 112}{5}$) = 92 A1
[1 mark]
(b.i) Write down the degrees of freedom for this test. [1]
Markscheme
4 A1
[1 mark]
(b.ii) Write down the conclusion to the test. Give a reason for your answer. [4]
Markscheme
$\chi^2_{calc}$ = 8.54 (8.54347…) OR p-value = 0.0736 (0.0735802…) A2
Note: Do not award R0A1. Accept “accept” or “do not reject” in place of “insufficient evidence to reject”.
Award the R1 for comparing their p-value with 0.05 or their $\chi^2$ value with 9.49 and then FT their final conclusion.
[4 marks]
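The statistic and p-value quoted in part (b.ii) can be reproduced without a GDC. The sketch below assumes the vendor's model predicts equal sales on each weekday, which is consistent with the expected value of 92 in part (a) (the hypotheses themselves did not survive in this copy). The helper `chi2_sf_even_df` is a hypothetical name; it uses the closed-form upper tail of the chi-squared distribution, which only applies when the degrees of freedom are even.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable.

    Uses the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!,
    which is valid for even degrees of freedom only.
    """
    if df % 2 != 0:
        raise ValueError("this closed form needs an even number of degrees of freedom")
    half = x / 2
    return math.exp(-half) * sum(half**k / math.factorial(k) for k in range(df // 2))

observed = [74, 97, 91, 86, 112]            # copies sold, from part (a)
expected = sum(observed) / len(observed)    # assumed uniform model: 92 per day
chi2_calc = sum((o - expected) ** 2 / expected for o in observed)
df = len(observed) - 1                      # 4, as in part (b.i)
p_value = chi2_sf_even_df(chi2_calc, df)

# Compare with the critical value given in the question.
reject_h0 = chi2_calc > 9.49

print(round(chi2_calc, 5), round(p_value, 7), reject_h0)
```

Both comparisons (statistic against 9.49, or p-value against 0.05) lead to the same decision, which is why the markscheme accepts either for the R1.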
Markscheme
(let μc = population mean for chinchilla rabbits, μs = population mean for sable rabbits)
H0 : μc = μs A1
H1 : μc > μs A1
Note: Accept an equivalent statement in words, must include mean and reference to “population mean” / “mean for all
chinchilla rabbits” for the first A1 to be awarded.
Do not accept an imprecise “the means are equal”.
[2 marks]
Markscheme
[2 marks]
(c) Write down the conclusion to the test. Give a reason for your answer. [2]
Markscheme
0. 0408 < 0. 05 . R1
(there is sufficient evidence to suggest that chinchilla rabbits are heavier than sable rabbits)
Note: Do not award R0A1. Accept ‘accept H1’.
[2 marks]
(a.i) the minimum number of sick days taken during the year. [1]
Markscheme
2 A1
[1 mark]
Markscheme
6 A1
[1 mark]
Markscheme
8 A1
[1 mark]
(b) Paul claims that this box and whisker diagram can be used to infer that the percentage of employees who took
fewer than six sick days is smaller than the percentage of employees who took more than eleven sick days.
EITHER
OR
The diagram is not explicit enough to show what is happening at the quartiles regarding 6 and 11 / we do not have the data
points R1
OR
THEN
[2 marks]
They took a random sample of six typical apartments along a train line in the city. Xavie obtained the data shown in the following
table.
Markscheme
$r_s$ = −1 A1
[1 mark]
Markscheme
[2 marks]
(b.ii) Use your value of r to state which two of the following would best describe the correlation between the variables.
[2]
Markscheme
Due to the demand of the question, do not accept “negative (from the graph)” if their r value is positive.
[2 marks]
The relationship between the variables can be modelled by the regression equation y = ax + b .
(c.i) Write down the value of a. [1]
Markscheme
[1 mark]
Markscheme
b = 3. 19 (b = 3. 19150 …) A1
[1 mark]
(c.iii) According to this model, state in context what the value of b represents. [1]
Markscheme
b represents the (typical) price of an apartment in the centre (of the city) A1
Note: To award the A1, some reference to “centre” or “zero distance from the city” needs to be seen.
[1 mark]
(d) Xavie uses the regression equation to estimate the price of a typical apartment located 19. 6 km from the city
centre.
Markscheme
= 1. 25 (1. 24704 …) A1
[3 marks]
(d.ii) State two reasons that Xavie might use to justify the validity of this estimate. [2]
Markscheme
interpolation R1
strong correlation. R1
[2 marks]
To verify whether this relationship applies in a different direction from the city centre, Xavie considers two locations, A and B, both
an equal distance from the city centre. They take a random sample of seven apartments from each location and record the prices
(in millions of dollars) in the following tables.
Xavie conducts a t-test, at the 5 % level of significance, to see if the mean apartment price in location A is different to the mean
apartment price in location B. They assume the population variances are the same.
Markscheme
μA ≠ μB A1
[1 mark]
Markscheme
Award A1 for an answer of p = 0. 0265 (0. 0265017 …) , from use of unpooled GDC settings.
[2 marks]
(g) State the conclusion of the test. Justify your answer. [2]
Markscheme
0. 0223977 … < 0. 05 R1
[2 marks]
(h) State one additional assumption Xavie has made about the distributions to conduct this test. [1]
Markscheme
(the two populations are) normally distributed A1
Note: Do not accept “independent” as that applies to the samples, not the populations.
[1 mark]
The manufacturer claims the probability of switch A failing within one month of being fitted is 0. 1 and the probability of the
cheaper switch B failing within one month is 0. 3. Whether or not a switch fails is independent of the state of the other switch.
If both switches fail, the generator needs to shut down to replace the switches. Both switches are replaced after a month of use
(whether they have failed or not) or whenever the generator needs to be shut down.
The following tree diagram shows the probabilities of a switch failing within one month of them both being replaced, assuming
the manufacturer’s claim is correct.
Markscheme
a = 0.9, b = 0.3 and c = 0.7 A2
[2 marks]
(b) Hence find the probability that the generator needs to shut down within one month of the switches being
replaced. [1]
Markscheme
(0. 1 × 0. 3 =) 0. 03 A1
[1 mark]
The owner of the generator is suspicious of the switch manufacturer’s claims, so they look back through the past 200 occasions
when the switches were replaced. The records show whether no switches, one switch or two switches had failed.
The data the owner collected are shown in the following table.
(c) Show that the expected value of no switches failing in the generator, during the last 200 occasions when the
switches were replaced, is 126. [2]
Markscheme
P (no fail)= 0. 63 A1
multiplying by 200 M1
= 126 AG
Note: Award A0M0 for a flawed approach to find P(no fail) = 0.63, e.g. $\frac{126}{200} = 0.63$, which is reverse engineering.
[2 marks]
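The expected frequencies needed for the goodness of fit test follow from the tree diagram probabilities. The sketch below uses only the manufacturer's claimed failure probabilities and the 200 occasions given in the question; the values for one and two failures are derived here and are not quoted in the markscheme.

```python
p_a_fail = 0.1     # claimed probability that switch A fails within a month
p_b_fail = 0.3     # claimed probability that switch B fails within a month
n_occasions = 200  # replacement occasions in the owner's records

# The switches are independent, so branch probabilities multiply.
p_none = (1 - p_a_fail) * (1 - p_b_fail)
p_two = p_a_fail * p_b_fail
p_one = 1 - p_none - p_two   # exactly one of the two switches fails

expected = {
    "no switches": n_occasions * p_none,
    "one switch": n_occasions * p_one,
    "two switches": n_occasions * p_two,
}

for outcome, frequency in expected.items():
    print(outcome, round(frequency, 2))
```

This runs forward from the probabilities to the frequency, which is the direction the "show that" in part (c) requires.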
(d) Perform a $\chi^2$ goodness of fit test at the 5% significance level to test whether the manufacturer’s claims are correct
Markscheme
EITHER
OR
THEN
(A1)
Note: Award A1 for df = 2 seen anywhere and may be awarded independent of the M1 mark.
The df cannot be implied from chi sq statistic = 3. 40989
0. 182 > 0. 05 R1
hence insufficient evidence to reject H0 (that the manufacturer's claims are correct) A1
Note: The R1A1 can be awarded as follow through within part (d) from their (explicitly labelled) incorrect p-value.
Accept either a conclusion to not reject the null hypothesis or the manufacturers claims are correct.
[7 marks]
Markscheme
A3
Note: Award A1 for 1 in correct place, A1 for 3, 2 and 4 correct, A1 for 35, 20 and 5 correct. Award at most A0A1A1 if the rectangle
is omitted. Condone the omission of the 430, as explicitly asked for in part (b).
[3 marks]
(b) Calculate the number of students who did not have headlice or influenza or chickenpox. [2]
Markscheme
35 + 4 + 20 + 3 + 1 + 2 + 5 (M1)
430 A1
[2 marks]
Markscheme
$\frac{42}{500}$ ($\frac{21}{250}$, 0.084, 8.4%) A2
[2 marks]
Markscheme
$\frac{5}{42}$ (0.119047…, 11.9%) A2
The first A1 can be awarded for an attempt to use conditional probability with 0. 084 on the denominator.
[2 marks]
Diego is a teacher in the school. He believes that the number of students, n, who have had influenza during the first t days of the
school year, can be modelled by the function
$n(t) = 250 - 240(2)^{kt}$, $k \in \mathbb{R}$.
(d) Use Diego’s model to calculate the number of students who started the school year with influenza. [2]
Markscheme
10 A1
[2 marks]
It is known that 130 students have had influenza during the first 10 days of the school year.
Markscheme
k = −0. 1 A1
[2 marks]
(f ) Using this model, calculate how many days it will take for 200 students to have had influenza since the start of the
school year. [2]
Markscheme
$200 = 250 - 240(2)^{-0.1t}$ (M1)
[2 marks]
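Parts (d) to (f) can be checked numerically. The sketch below takes the model as n(t) = 250 − 240·2^(kt), solves for k from the 130 students at day 10, and then solves for the day on which 200 students is reached. The final answer line for part (f) did not survive in this copy, so the value of t computed here (roughly 22.6 days) is a calculation from the model, not a quoted markscheme answer. The function name `infected` is illustrative.

```python
import math

def infected(t, k):
    """Diego's model: number of students who have had influenza after t days."""
    return 250 - 240 * 2 ** (k * t)

# Part (e): 130 = 250 - 240 * 2**(10k)  =>  2**(10k) = 120/240
k = math.log2((250 - 130) / 240) / 10

# Part (d): students with influenza at t = 0
initial = infected(0, k)

# Part (f): 200 = 250 - 240 * 2**(kt)  =>  kt = log2(50/240)
t_200 = math.log2((250 - 200) / 240) / k

# The model approaches 250 from below as t grows, so it never reaches 300.
end_of_year = infected(365, k)

print(initial, k, round(t_200, 2), end_of_year)
```

Because k is negative, the term 240·2^(kt) decays towards zero, which is the reason the model cannot account for 300 students.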
By the last day of the school year, it is known that 300 students have had influenza.
Markscheme
EITHER
model does not predict n to go above 250 / reach 300 A1
OR
$250 - 240 \times 2^{-0.1 \times 365} = 250$ so does not reach 300 A1
OR
there is no solution to n(t) = 300 A1
OR
correct sketch graph, with 250 and/or 300 labelled, and a supporting comment A1
THEN
hence Diego’s model is not appropriate.
Note: Do not credit reasoning based on selecting arbitrary high values of t and finding the associated n value.
[1 mark]
Markscheme
P (X < 160) OR labelled sketch of region OR calc syntax with correct bounds (M1)
Note: Accept either zero or a large negative value as the lower bound.
[2 marks]
Markscheme
P (160 < X < 170) OR labelled sketch of region OR calc syntax with correct bounds (M1)
Note: Award A0A2 for answers of 0. 045 and 0. 41 both given to 2 sf.
[2 marks]
[2]
Markscheme
P (X > h) = 0. 27 OR labelled sketch of region OR calc syntax with correct bounds (M1)
[2 marks]
Janneke selects a random sample of 200 Dutch women from Amsterdam and measures their heights. She wants to determine
whether this sample could have been chosen from a normally distributed population with mean of 170. 7 cm and standard
deviation of 6. 3 cm.
She performs a $\chi^2$ goodness of fit test at the 5% significance level. She begins by creating the following frequency table.
(c.i) a . [1]
Markscheme
82. 21 A1
[1 mark]
(c.ii) b. [1]
Markscheme
94. 86 A1
Note: Follow through from an incorrect part (c)(i) if fourth value is found by subtracting first three values from 200. Award at
most A0A1 if both answers are not given to four significant figures.
[1 mark]
H0 : the heights are drawn from a normally distributed population with mean 170. 7 cm and standard deviation 6. 3 cm
H1 : the heights are not drawn from a normally distributed population with mean 170. 7 cm and standard deviation 6. 3 cm
(d) Write down the degrees of freedom for this test. [1]
Markscheme
3 A1
[1 mark]
(e) Perform the $\chi^2$ goodness of fit test and state your conclusion, justifying your reasoning. [4]
Markscheme
p-value = 0.616 (0.615583…) OR $\chi^2$ = 1.80 (1.79702…) A2
EITHER
fail to reject the null hypothesis A1
OR
the heights are normally distributed with mean 170. 7 cm and standard deviation 6. 3 cm A1
The R1A1 can be awarded as follow through within part (e) from their (explicitly labelled) p-value or
$\chi^2$-value. Accept comparison in words.
[4 marks]
Gundega claims that, on average, Latvian women are taller than Dutch women.
Random samples of 10 Latvian women and 10 Dutch women are chosen, and their heights are measured.
Gundega performs a t-test at the 5 % significance level. It is assumed that the populations are normally distributed and have equal
variances.
(f ) Write down the null and alternative hypotheses for this test. [2]
Markscheme
EITHER
H0 : μL = μD A1
H1 : μL > μD A1
OR
H0 : The (population) mean height of Latvian women is equal to the (population) mean height of Dutch women A1
H1 : The (population) mean height of Latvian women is greater than the (population) mean height of Dutch women A1
Note: Award at most A0A1 if the hypotheses explicitly refer to the “sample” and not the population. For H0 : m1 = m2 and
H1 : m1 > m2 award A0A1.
[2 marks]
(g) Perform the t-test and state the conclusion, justifying your reasoning. [4]
Markscheme
Note: In this question the p-value is the same 3 sf value for unpooled GDC settings so will be awarded A2.
If using a two-tailed test, the answer is p-value= 0. 654 (0. 653589 …); award A1 if alternative hypothesis was correct or A2
if it follows through correctly from their alternative hypothesis (i.e. two-tailed test was penalized in part (f )).
0. 673205 > 0. 05 R1
fail to reject the null hypothesis (Gundega is not correct) A1
The R1A1 can be awarded as follow through within part (g) from their (explicitly labelled) p-value. Accept comparison in
words.
[4 marks]
Markscheme
75 (minutes) A1
[1 mark]
(a.ii) Calculate an estimate of the mean running time of the 200 movies. [2]
Markscheme
attempt to substitute values in the mean formula with at least one mid-interval value multiplied by a corresponding
frequency (M1)
[2 marks]
(b) Use the cumulative frequency curve to estimate the interquartile range. [2]
Markscheme
91.5 OR 84 seen (A1)
Note: These values may be seen in the working for part (c).
[2 marks]
“Star Feud” is a movie in the data set and its running time is 100 minutes.
(c) Use your answer to part (b) to estimate whether “Star Feud’s” running time is an outlier for this data. Justify your
answer. [3]
Markscheme
102. 75 > 100 OR 100 − 91. 5 < 11. 25 OR 100 − 11. 25 < 91. 5 R1
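The comparison in part (c) uses the usual 1.5 × IQR rule. A minimal sketch, taking the quartiles as 84 and 91.5 (the values consistent with the working 100 − 91.5 < 11.25 shown above):

```python
q1, q3 = 84, 91.5            # quartiles read from the cumulative frequency curve
iqr = q3 - q1                # interquartile range
upper_fence = q3 + 1.5 * iqr # values above this are treated as outliers

running_time = 100           # "Star Feud"
is_outlier = running_time > upper_fence

print(iqr, upper_fence, is_outlier)
```

The three forms of the inequality accepted for the R1 are rearrangements of the same comparison between 100 and the upper fence.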
It is believed that the running times of family movies follow a normal distribution with mean 88 minutes and standard deviation
6. 75 minutes.
It is decided to perform a $\chi^2$ goodness of fit test on the data to determine whether this sample of 200 movies could have plausibly
(d) Write down the null and the alternative hypotheses for the test. [2]
Markscheme
Note: Award A1 for each correct hypothesis that includes a reference to normal distribution with a mean of 88 and a standard
deviation of 6.75 (or variance of $6.75^2$). “Correlation”, “independence”, “association”, and “relationship” are incorrect.
Award at most A0A1 for correctly worded hypotheses that include a reference to a normal distribution but omit the
distribution’s parameters in one or both hypotheses. Award A0A1 for correct hypotheses that are reversed.
[2 marks]
Markscheme
$T \sim N(88, 6.75^2)$
[4 marks]
(e.ii) Hence, perform the test to a 5 % significance level, clearly stating the conclusion in context. [4]
Markscheme
d f = 4 (A1)
(p =) 0. 0166 (= 0. 0166282 …) A1
0. 0166 < 0. 05
(Reject H0, there is sufficient evidence to say that) the data has not been drawn from the ($N(88, 6.75^2)$) distribution. A1
The conclusion to part (e)(ii) MUST follow through from their hypotheses seen in part (d); if hypotheses are
incorrect/reversed etc., the answer to part (e)(ii) must reflect this in order for the A1 to be credited.
[4 marks]
(a.i) Write down the mid-interval value of 140 ≤ h < 160 . [1]
Markscheme
150 (cm) A1
[1 mark]
(a.ii) Calculate an estimate of the mean height of the 200 students. [2]
Markscheme
attempt to substitute values in the mean formula with at least one mid-interval value multiplied by a corresponding
frequency (M1)
[2 marks]
This table is used to create the following cumulative frequency graph.
(b) Use the cumulative frequency curve to estimate the interquartile range. [2]
Markscheme
Note: These values may be seen in the working for part (c).
[2 marks]
Laszlo is a student in the data set and his height is 204 cm.
(c) Use your answer to part (b) to estimate whether Laszlo’s height is an outlier for this data. Justify your answer. [3]
Markscheme
205. 5 > 204 OR 204 − 183 < 22. 5 OR 204 − 22. 5 < 183 R1
[3 marks]
It is believed that the heights of university students follow a normal distribution with mean 176 cm and standard deviation
13. 5 cm.
It is decided to perform a $\chi^2$ goodness of fit test on the data to determine whether this sample of 200 students could have
(d) Write down the null and the alternative hypotheses for the test. [2]
Markscheme
Note: Award A1 for each correct hypothesis that includes a reference to normal distribution with a mean of 176 and a
standard deviation of 13.5 (or variance of $13.5^2$). “Correlation”, “independence”, “association”, and “relationship” are
incorrect.
Award at most A0A1 for correctly worded hypotheses that include a reference to a normal distribution but omit the
distribution’s parameters in one or both hypotheses. Award A0A1 for correct hypotheses that are reversed.
[2 marks]
Markscheme
$h \sim N(176, 13.5^2)$
[4 marks]
(e.ii) Hence, perform the test to a 5 % significance level, clearly stating the conclusion in context. [4]
Markscheme
d f = 4 (A1)
(p =) 0. 0166 (= 0. 0166282 …) A1
0. 0166 < 0. 05
Note: Accept p value of 0. 0165 (= 0. 0164693 …) from using a and b to 3 sf.
(Reject H0, there is sufficient evidence to say that) the data has not been drawn from the ($N(176, 13.5^2)$) distribution. A1
The conclusion to part (e)(ii) MUST follow through from their hypotheses seen in part (d); if hypotheses are
incorrect/reversed etc., the answer to part (e)(ii) must reflect this in order for the A1 to be credited.
[4 marks]
Year °C (y) 8. 73 9. 22 9. 10 9. 12 9. 13 9. 45 9. 76
Tami creates a linear model for this data by finding the equation of the straight line passing through the points with coordinates
(1708, 8. 73) and (1958, 9. 45).
(a) Calculate the gradient of the straight line that passes through these two points. [2]
Markscheme
$\frac{9.45 - 8.73}{1958 - 1708}$ (M1)
= 0.00288 ($\frac{9}{3125}$) A1
[2 marks]
(b.i) Interpret the meaning of the gradient in the context of the question. [1]
Markscheme
[1 mark]
Markscheme
°C / year OR degrees C per year A1
[1 mark]
(c) Find the equation of this line giving your answer in the form y = mx + c . [2]
Markscheme
or
equation is y = 0. 00288x + 3. 81 A1
[2 marks]
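Tami's two-point line can be reproduced directly from the coordinates given in the question. In the sketch below the year-2000 estimate is computed from the unrounded coefficients; the markscheme line for part (d) is missing from this copy, so that figure is a calculation and not a quoted answer.

```python
# The two points Tami uses: (year, mean annual temperature in degrees C)
x1, y1 = 1708, 8.73
x2, y2 = 1958, 9.45

gradient = (y2 - y1) / (x2 - x1)     # degrees C per year
intercept = y1 - gradient * x1       # c in y = mx + c

# Part (d): substitute x = 2000 into the model.
estimate_2000 = gradient * 2000 + intercept

print(round(gradient, 5), round(intercept, 2), round(estimate_2000, 2))
```

Substituting the second point instead of the first gives the same intercept, which is a useful check on the working.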
(d) Use Tami’s model to estimate the mean annual temperature in the year 2000. [2]
Markscheme
[2 marks]
Markscheme
Note: Award (M1)A0 for answers that show the correct method, but are presented incorrectly (e.g. no “y = ” or truncated values
etc.). Accept 4. 465 as the correct answer to 4 sf.
[2 marks]
(e.ii) Find the value of r, the Pearson’s product-moment correlation coefficient. [1]
Markscheme
[1 mark]
(f ) Use Thandizo’s model to estimate the mean annual temperature in the year 2000. [2]
Markscheme
[2 marks]
Thandizo uses his regression line to predict the year when the mean annual temperature will first exceed 15 °C.
(g) State two reasons why Thandizo’s prediction may not be valid. [2]
Markscheme
cannot (always reliably) make a prediction of x from a value of y, when using a y on x line / regression line is not x on y A1
extrapolation A1
[2 marks]
This remedy is to be used by 115 patients, and it is assumed that the 82% claim is true.
(a) Find the probability that exactly 90 of these patients will be cured. [3]
Markscheme
Note: Award (M1)A1A0 for an answer of 0. 054 with or without working shown.
[3 marks]
(b) Find the probability that at least 95 of these patients will be cured. [2]
Markscheme
[2 marks]
(c) Find the variance in the possible number of patients that will be cured. [2]
Markscheme
115 × 0. 82 × 0. 18
[2 marks]
The probability that at least n patients will be cured is less than 30%.
Markscheme
METHOD 1
including 0. 3 or 0. 7 (M1)
n = 98 A1
METHOD 2
using binomcdf in GDC for at least two different values of n greater than 90 (M1)
EITHER
(P(X < 97) =) 0. 696683 … AND (P(X < 98) =) 0. 778249 … (seen) (A1)
OR
(P(X ≥ 97) =) 0.303316… AND (P(X ≥ 98) =) 0.221750… (seen) (A1)
THEN
n = 98 A1
[3 marks]
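The binomial parts of this question can be checked with the standard library. The sketch below assumes X ~ B(115, 0.82) for the number of patients cured and reads the final part as asking for the least n with P(X ≥ n) < 0.3 (the question line itself is missing from this copy). The helper names `pmf` and `at_least` are illustrative.

```python
from math import comb

N, P = 115, 0.82   # number of patients and claimed cure probability

def pmf(k):
    """P(X = k) for X ~ B(N, P)."""
    return comb(N, k) * P**k * (1 - P) ** (N - k)

def at_least(k):
    """P(X >= k) for X ~ B(N, P)."""
    return sum(pmf(i) for i in range(k, N + 1))

p_exactly_90 = pmf(90)          # part (a)
p_at_least_95 = at_least(95)    # part (b)
variance = N * P * (1 - P)      # part (c)

# Assumed reading of the last part: least n with P(X >= n) < 0.3
least_n = next(k for k in range(N + 1) if at_least(k) < 0.3)

print(p_exactly_90, p_at_least_95, variance, least_n)
```

Evaluating `at_least(97)` and `at_least(98)` gives the pair of tail probabilities on either side of 0.3 that METHOD 2 expects to see.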
A clinic is interested to see if the mean recovery time of their patients who tried the new remedy is less than that of their patients
who continued with an older remedy. The clinic randomly selects some of their patients and records their recovery time in days.
The results are shown in the table below.
The data is assumed to follow a normal distribution and the population variance is the same for the two groups. A t-test is used to
compare the means of the two groups at the 10% significance level.
(e) State the appropriate null and alternative hypotheses for this t-test. [2]
Markscheme
H0 : μ1 = μ2 (H0 : μ1 − μ2 = 0) A1
H1 : μ1 < μ2 (H1 : μ1 − μ2 < 0) A1
Note: Accept an equivalent statement in words, must include mean and reference to “population mean”, e.g. “mean for all
patients on old remedy”, for the first A1 to be awarded.
[2 marks]
Markscheme
Note: Allow 0. 062 as final answer. Award A1 for an answer of 0. 06. Award A1 for an answer of 0. 0527756 … from use of
unpooled setting.
Follow through from an incorrect alternative hypothesis as long as their p-value matches their alternative hypothesis.
[2 marks]
(g) State the conclusion for this test. Give a reason for your answer. [2]
Markscheme
0. 0620 < 0. 1 R1
Note: Do not award R0A1. Accept “p-value is less than 0. 1” provided an answer was seen in part (f ).
[2 marks]
Markscheme
the probability of obtaining results (at least as extreme) as those observed given that the null hypothesis is true A1
[1 mark]
Markscheme
continuous A1
[1 mark]
Elsie’s data for 160 people who visited the library on that particular day is shown in the following table.
Markscheme
160 − 50 − 62 − 14 − 8 (M1)
(k =) 26 A1
[2 marks]
Markscheme
20 ≤ T < 40 A1
[1 mark]
(c.ii) Write down the mid-interval value for this class. [1]
Markscheme
30 A1
[1 mark]
(d) Use Elsie’s data to calculate an estimate of the mean time that people spent in the library. [2]
Markscheme
33. 5 minutes A2
Note: FT from their value of k and their mid-interval value. Follow through from part (c)(ii) but only if mid-interval value lies in
their interval.
[2 marks]
(e) Using the table, write down the maximum possible number of people who spent 35 minutes or less in the library
on that day. [1]
Markscheme
112 A1
[1 mark]
(f ) Find the probability a visitor spends at least 60 minutes in the library. [2]
Markscheme
$\frac{22}{160}$ [0.138, 0.1375, 13.75%, $\frac{11}{80}$] A1A1
[2 marks]
The following box and whisker diagram shows the times, in minutes, that the 160 visitors spent in the library.
(g) Write down the median time spent in the library. [1]
Markscheme
26 minutes A1
[1 mark]
50 − 16 (M1)
34 minutes A1
[2 marks]
(i) Hence show that the longest time that a person spent in the library is not an outlier. [3]
Markscheme
50 + 1. 5 × 34
= 101 A1
not an outlier AG
Note: Award R1 for their correct comparison. Follow through from their part (h). Award R0 if their conclusion is “it is an outlier”,
this contradicts Elsie’s belief.
[3 marks]
Elsie believes the box and whisker diagram indicates that the times spent in the library are not normally distributed.
(j) Identify one feature of the box and whisker diagram which might support Elsie’s belief. [1]
Markscheme
EITHER
OR
[1 mark]
56. [Maximum mark: 16]
At Mirabooka Primary School, a survey found that 68% of students have a dog and 36% of students have a cat. 14% of students
have both a dog and a cat.
This information can be represented in the following Venn diagram, where m, n, p and q represent the percentage of students
within each region.
(a.i) m . [1]
Markscheme
(m =) 54% A1
Note: Based on their n, follow through for parts (i) and (iii), but only if it does not contradict the given information. Follow
through for part (iv) but only if the total is 100%.
[1 mark]
(a.ii) n . [1]
Markscheme
(n =) 14% A1
Note: Based on their n, follow through for parts (i) and (iii), but only if it does not contradict the given information. Follow
through for part (iv) but only if the total is 100%.
[1 mark]
(a.iii) p . [1]
Markscheme
(p =) 22% A1
Note: Based on their n, follow through for parts (i) and (iii), but only if it does not contradict the given information. Follow
through for part (iv) but only if the total is 100%.
[1 mark]
(a.iv) q . [1]
Markscheme
(q =) 10% A1
Note: Based on their n, follow through for parts (i) and (iii), but only if it does not contradict the given information. Follow
through for part (iv) but only if the total is 100%.
[1 mark]
(b) Find the percentage of students who have a dog or a cat or both. [1]
Markscheme
90 (%) A1
[1 mark]
Markscheme
0.54 ($\frac{54}{100}$, $\frac{27}{50}$, 54%) A1
[1 mark]
(c.ii) has a dog given that they do not have a cat. [2]
Markscheme
$\frac{54}{64}$ (0.844, $\frac{27}{32}$, 84.4%, 0.84375) A1A1
Note: Award A1 for a correct denominator (0. 64 or 64 seen), A1 for the correct final answer.
[2 marks]
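The Venn diagram regions and the conditional probability in part (c.ii) follow from the three survey percentages. A minimal sketch, using the region labels m, n, p and q from the question:

```python
from fractions import Fraction

dog, cat, both = 68, 36, 14   # survey percentages from the question

m = dog - both                # dog only
n = both                      # dog and cat
p = cat - both                # cat only
q = 100 - (m + n + p)         # neither

dog_or_cat = m + n + p        # part (b)

# Part (c.ii): P(dog | no cat) = P(dog and no cat) / P(no cat)
p_dog_given_no_cat = Fraction(m, m + q)

print(m, n, p, q, dog_or_cat, p_dog_given_no_cat, float(p_dog_given_no_cat))
```

The denominator m + q is the 64% of students without a cat, which is the value the note credits with the first A1.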
Each year, one student is chosen randomly to be the school captain of Mirabooka Primary School.
Tim is using a binomial distribution to make predictions about how many of the next 10 school captains will own a dog. He
assumes that the percentages found in the survey will remain constant for future years and that the events “being a school captain”
and “having a dog” are independent.
Use Tim’s model to find the probability that in the next 10 years
Markscheme
X~B(10, 0. 68)
[2 marks]
Markscheme
[2 marks]
Markscheme
$(0.68)^9 \times 0.32$ (M1)
$2 \times ((0.68)^9 \times 0.32)$
(e) State why John should not use the binomial distribution to find the probability that 5 of these students have a
dog. [1]
Markscheme
EITHER
OR
OR
[1 mark]
Markscheme
$\frac{370 + 472}{2}$ (M1)
Note: This (M1) can also be awarded for either a correct Q3 or a correct Q1 in part (a)(ii).
Q3 = 421 A1
[2 marks]
Markscheme
[2 marks]
(b) Determine if the Netherlands’ score is an outlier for this data. Justify your answer. [3]
Markscheme
(Q3 + 1.5 (IQR) =) 421 + (1.5 × 103) (M1)
= 575.5
[3 marks]
Chester is investigating the relationship between the highest-scoring countries’ Eurovision score and their population size to
determine whether population size can reasonably be used to predict a country’s score.
The populations of the countries, to the nearest million, are shown in the table.
Chester finds that, for this data, the Pearson’s product moment correlation coefficient is r = 0. 249.
(c) State whether it would be appropriate for Chester to use the equation of a regression line for y on x to predict a
country’s Eurovision score. Justify your answer. [2]
Markscheme
[2 marks]
Chester then decides to find the Spearman’s rank correlation coefficient for this data, and creates a table of ranks.
(d.i) a . [1]
Markscheme
6 A1
[1 mark]
(d.ii) b . [1]
Markscheme
4. 5 A1
[1 mark]
(d.iii) c . [1]
Markscheme
4. 5 A1
[1 mark]
Markscheme
[2 marks]
Markscheme
EITHER
there is a (positive) association between the population size and the score A1
OR
there is a (positive) linear correlation between the ranks of the population size and the ranks of the scores (when compared
with the PMCC of 0. 249). A1
[1 mark]
(f ) When calculating the ranks, Chester incorrectly read the Netherlands’ score as 478. Explain why the value of the
Spearman’s rank correlation $r_s$ does not change despite this error. [1]
Markscheme
lowering the top score by 20 does not change its rank so $r_s$ is unchanged R1
Note: Accept “this would not alter the rank” or “Netherlands still top rank” or similar. Condone any statement that clearly
implies the ranks have not changed, for example: “The Netherlands still has the highest score.”
[1 mark]
The number of passengers that arrive to board this flight is assumed to follow a binomial distribution with a probability of 0. 9.
(a) The airline sells 74 tickets for this flight. Find the probability that more than 72 passengers arrive to board the
flight. [3]
Markscheme
T ~B(74, 0. 9) OR n = 74 (M1)
Note: Using the distribution B(74, 0. 1), to work with the 10% that do not arrive for the flight, here and throughout this
question, is a valid approach.
[3 marks]
(b.i) Write down the expected number of passengers who will arrive to board the flight if 72 tickets are sold. [2]
Markscheme
72 × 0. 9 (M1)
64. 8 A1
[2 marks]
(b.ii) Find the maximum number of tickets that could be sold if the expected number of passengers who arrive to board
the flight must be less than or equal to 72. [2]
Markscheme
n × 0. 9 = 72 (M1)
80 A1
[2 marks]
Each passenger pays $150 for a ticket. If too many passengers arrive, then the airline will give $300 in compensation to each
passenger that cannot board.
(c) Find, to the nearest integer, the expected increase or decrease in the money made by the airline if they decide to
sell 74 tickets rather than 72. [8]
Markscheme
METHOD 1
EITHER
Note: Award A1A1 for each row correct. Award A1 for one correct entry and A1 for the remaining entries correct.
OR
expected compensation is
= 11098. 73 … (= $11099) A1
THEN
METHOD 2
METHOD 3
let D be the change in income when selling 74 tickets.
(A1)(A1)
Note: Award A1 for one error, however award A1A1 if there is no explicit mention that T = 73 would result in D = 0 and the
other two are correct.
= $299 A1
[8 marks]
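The figures that survive in part (c) (11098.73… and $299) can be reproduced with a short expected-value calculation. The sketch below assumes the flight has 72 seats; the stem stating the capacity is missing from this copy, but 72 is consistent with parts (a) and (b) and with the note that T = 73 gives D = 0. The names `pmf` and `expected_income` are illustrative.

```python
from math import comb

P_ARRIVE = 0.9        # probability that a ticketed passenger arrives
SEATS = 72            # assumption: capacity implied by parts (a) and (b)
TICKET = 150          # price per ticket in dollars
COMPENSATION = 300    # paid to each passenger who cannot board

def pmf(n, k):
    """P(exactly k of n ticketed passengers arrive)."""
    return comb(n, k) * P_ARRIVE**k * (1 - P_ARRIVE) ** (n - k)

def expected_income(tickets_sold):
    """Ticket revenue minus expected compensation for bumped passengers."""
    revenue = TICKET * tickets_sold
    expected_payout = sum(
        COMPENSATION * (k - SEATS) * pmf(tickets_sold, k)
        for k in range(SEATS + 1, tickets_sold + 1)
    )
    return revenue - expected_payout

income_72 = expected_income(72)
income_74 = expected_income(74)
change = income_74 - income_72

print(income_72, round(income_74, 2), round(change))
```

Compensation only arises when 73 or 74 passengers arrive, so the expected payout is small compared with the extra $300 of ticket revenue.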
Markscheme
0. 58 (s) A1
[1 mark]
Markscheme
0. 7 − 0. 42 (A1)(M1)
Note: Award A1 for correct quartiles seen, M1 for subtraction of their quartiles.
0. 28 (s) A1
[3 marks]
(b) Find the estimated number of teenagers who have a reaction time greater than 0. 4 seconds. [2]
Markscheme
[2 marks]
(c) Determine the 90th percentile of the reaction times from the cumulative frequency graph. [2]
Markscheme
(90% × 40 =) 36 OR 4 (A1)
0. 8 s A1
[2 marks]
Mackenzie created the cumulative frequency graph using the following grouped frequency table.
(a =) 6 A1
[1 mark]
Markscheme
(b =) 4 A1
[1 mark]
(e) Write down the modal class from the table. [1]
Markscheme
0. 6 < t ≤ 0. 8 A1
[1 mark]
(f ) Use your graphic display calculator to find an estimate of the mean reaction time. [2]
Markscheme
0. 55 s A2
[2 marks]
Upon completion of the experiment, Mackenzie realized that some values were grouped incorrectly in the frequency table. Some
reaction times recorded in the interval 0 < t ≤ 0. 2 should have been recorded in the interval 0. 2 < t ≤ 0. 4.
(g) Suggest how, if at all, the estimated mean and estimated median reaction times will change if the errors are
corrected. Justify your response. [4]
Markscheme
because the incorrect reaction times are moving from a lower interval to a higher interval which will increase the numerator
of the mean calculation R1
[4 marks]
A student from the group is chosen at random. Calculate the probability that the student
Markscheme
$\frac{560}{1280}$ ($\frac{7}{16}$, 0.4375) A1A1
[2 marks]
Markscheme
$\frac{72}{1280}$ ($\frac{9}{160}$, 0.05625) A1A1
[2 marks]
(a.iii) prefers a laptop given that they are 17–18 years old. [2]
Markscheme
$\frac{153}{348}$ ($\frac{51}{116}$, 0.439655…) A1A1
[2 marks]
Markscheme
$\frac{848}{1280}$ ($\frac{53}{80}$, 0.6625) A1A1
Note: Award A1 for correct denominator (1280) seen, (M1) for correct calculation of the numerator, A1 for the correct answer.
[3 marks]
A $\chi^2$ test for independence was performed on the collected data at the 1% significance level. The critical value for the test is
13.277.
Markscheme
Note: Award A1 for both hypotheses correct. Do not accept “not correlated” or “not related” in place of “independent”.
[1 mark]
Markscheme
4 A1
[1 mark]
($\chi^2$ =) 23.3 (23.3258…) A2
[2 marks]
Markscheme
[1 mark]
(d.iii) State the conclusion for the test in context. Give a reason for your answer. [2]
Markscheme
EITHER
OR
0. 000109 < 0. 01 R1
THEN
(there is sufficient evidence to accept H1 that) preferred device and age group are not independent A1
Note: For the final A1 the answer must be in context. Do not award A1R0.
[2 marks]
Markscheme
$X \sim N(10, 3^2)$
[2 marks]
Markscheme
[1 mark]
(b) Find the probability that Arianne throws two consecutive darts that land more than 15 cm from O. [2]
Markscheme
[2 marks]
In a competition a player has three darts to throw on each turn. A point is scored if a player throws all three darts to land within a
central area around O. When Arianne throws a dart the probability that it lands within this area is 0. 8143.
(c) Find the probability that Arianne does not score a point on a turn of three darts. [2]
Markscheme
$1 - (0.8143)^3$ (M1)
[2 marks]
In the competition Arianne has ten turns, each with three darts.
(d.i) Find the probability that Arianne scores at least 5 points in the competition. [3]
Markscheme
METHOD 1
METHOD 2
[3 marks]
(d.ii) Find the probability that Arianne scores at least 5 points and less than 8 points. [2]
Markscheme
Note: Award M1 for a correct probability statement or indication of correct lower and upper bounds, 5 and 7.
[2 marks]
(d.iii) Given that Arianne scores at least 5 points, find the probability that Arianne scores less than 8 points. [2]
Markscheme
$\frac{P(5 \le Y < 8)}{P(Y \ge 5)}$ ($= \frac{0.627788\ldots}{0.716650\ldots}$) (M1)
[2 marks]
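The two probabilities in the conditional-probability working can be reproduced from a binomial model. The sketch below assumes Y ~ B(10, 0.8143³), i.e. a point is scored on a turn only when all three darts land in the central area and the ten turns are independent. The helper name `pmf` is illustrative.

```python
from math import comb

p_point = 0.8143 ** 3   # all three darts in the central area on one turn
TURNS = 10

def pmf(k):
    """P(Y = k) for Y ~ B(TURNS, p_point), the number of points scored."""
    return comb(TURNS, k) * p_point**k * (1 - p_point) ** (TURNS - k)

p_no_point = 1 - p_point                                  # part (c)
p_at_least_5 = sum(pmf(k) for k in range(5, TURNS + 1))   # part (d.i)
p_5_to_7 = sum(pmf(k) for k in range(5, 8))               # part (d.ii)
p_conditional = p_5_to_7 / p_at_least_5                   # part (d.iii)

print(p_no_point, p_at_least_5, p_5_to_7, p_conditional)
```

Since the event "at least 5 and less than 8" is contained in "at least 5", the conditional probability is simply the ratio of the two binomial sums.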
Under this assumption, find, correct to four decimal places, the probability that a bicycle chosen at random travelling at 20 km h$^{-1}$
manages to stop
Markscheme
0. 0151 A1
[2 marks]
Markscheme
0. 0228 A1
[1 mark]
1000 randomly selected bicycles are tested and their stopping distances when travelling at 20 km h⁻¹ are measured.
Find, correct to four significant figures, the expected number of bicycles tested that stop between
(b.i) 6.5 m and 6.75 m. [2]
Markscheme
451.7 A1
[2 marks]
Markscheme
510.5 A1
[1 mark]
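The expected count in part (b.i) can be sketched as follows. It assumes the stopping distances follow the normal model quoted in the goodness of fit question (mean 6.76 m, standard deviation 0.12 m); the variable names are illustrative.

```python
from statistics import NormalDist

# Assumption: stopping distance (m) at 20 km/h is N(6.76, 0.12^2).
stopping = NormalDist(mu=6.76, sigma=0.12)
n_bicycles = 1000

# Probability that one bicycle stops between 6.5 m and 6.75 m.
p_between = stopping.cdf(6.75) - stopping.cdf(6.5)

# Expected number out of 1000 is n times that probability.
expected_between = n_bicycles * p_between

print(expected_between)
```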
The measured stopping distances of the 1000 bicycles are given in the table.
It is decided to perform a χ² goodness of fit test at the 5% level of significance to decide whether the stopping distances of
bicycles travelling at 20 km h⁻¹ can be modelled by a normal distribution with mean 6.76 m and standard deviation 0.12 m.
Markscheme
Note: Award A1 for correct H₀, including reference to the mean and standard deviation. Award A1 for the negation of their H₀.
[2 marks]
Markscheme
(e) State the conclusion of the test. Give a reason for your answer. [2]
Markscheme
0.05 < 0.0727 R1
[2 marks]
(a) State which of the two sampling methods, systematic or quota, Jason has used. [1]
Markscheme
Quota sampling A1
[1 mark]
Jason constructed the following box and whisker diagram to show the number of hours students in the sample took to read this
book.
(b) Write down the median time to read the book. [1]
Markscheme
10 (hours) A1
[1 mark]
Markscheme
15 − 7 (M1)
8 A1
[2 marks]
Mackenzie, a member of the sample, took 25 hours to read the novel. Jason believes Mackenzie’s time is not an outlier.
Markscheme
15 + 1.5 × 8
27 A1
Jason is correct A1
Note: Do not award R0A1. Follow through within this part from their 27, but only if their value is supported by a valid attempt
or clearly and correctly explains what their value represents.
[4 marks]
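The outlier check can be sketched as below, taking the quartiles 7 and 15 from the interquartile range working (15 − 7) and using the upper boundary Q3 + 1.5 × IQR that appears in the markscheme.

```python
q1, q3 = 7, 15          # quartiles, read from the working 15 − 7
iqr = q3 - q1           # interquartile range

# Upper boundary for outliers: Q3 + 1.5 × IQR
upper_fence = q3 + 1.5 * iqr

mackenzie_hours = 25
is_outlier = mackenzie_hours > upper_fence

print(iqr, upper_fence, is_outlier)
```

Since 25 is below the boundary of 27, Mackenzie's time is not an outlier, which supports Jason's claim.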
For each student interviewed, Jason recorded the time taken to read The Old Man and the Sea (x), measured in hours, and paired this with
their percentage score on the final exam (y). These data are represented on the scatter diagram.
“negative” seen A1
[1 mark]
Jason correctly calculates the equation of the regression line y on x for these students to be y = −1.54x + 98.8.
He uses the equation to estimate the percentage score on the final exam for a student who read the book in 1.5 hours.
Markscheme
y = −1.54 × 1.5 + 98.8
[2 marks]
(g) State whether it is valid to use the regression line y on x for Jason’s estimate. Give a reason for your answer. [2]
Markscheme
not reliable A1
Note: Do not award A1R0. Only accept reasoning that includes reference to the range of the data. Do not accept a contextual
reason such as 1.5 hours is too short to read the book.
[2 marks]
Jason found a website that rated the ‘top 50’ classic books. He randomly chose eight of these classic books and recorded the
number of pages. For example, Book H is rated 44th and has 281 pages. These data are shown in the table.
Jason intends to analyse the data using Spearman’s rank correlation coefficient, r_s.
(h) Copy and complete the information in the following table. [2]
Markscheme
A1A1
Note: Do not award A1 for correct ranks for ‘number of pages’. Award A1 for correct ranks for ‘top 50 rating’.
[2 marks]
Markscheme
[2 marks]
Markscheme
EITHER
there is a (strong/moderate) positive association between the number of pages and the top 50 rating. A1
OR
there is a (strong/moderate) agreement between the rank order of number of pages and the rank order top 50 rating. A1
OR
there is a (strong/moderate) positive (linear) correlation between the rank order of number of pages and the rank order top 50 rating. A1
Markscheme
A1A1
Note: Award A1 for a normal curve with mean labelled 6.1 or μ, A1 for indication of SD (0.5): marks on horizontal axis at 5.6
and/or 6.6 OR μ − 0.5 and/or μ + 0.5 on the correct side and approximately correct position.
[2 marks]
(b) Find the proportion of male Persian cats weighing between 5.5 kg and 6.5 kg. [2]
Markscheme
X ~ N(6.1, 0.5²)
[2 marks]
(c) Determine the expected number of cats in this group that have a weight of less than 5.3 kg. [3]
Markscheme
0.0547992… × 80 (M1)
= 4.38 (4.38393…) A1
[3 marks]
(d) It is found that 12 of the cats weigh more than x kg. Estimate the value of x. [3]
Markscheme
0.15 OR 0.85 (A1)
6.62 (6.61821…) A1
[3 marks]
(e) Ten of the cats are chosen at random. Find the probability that exactly one of them weighs over 6.25 kg. [4]
Markscheme
[4 marks]
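Parts (b) to (e) all follow from the model X ~ N(6.1, 0.5²), and a sketch of the calculations is given below. The group size of 80 is taken from the working for part (c). For part (e), whose working is not shown, the sketch assumes a binomial model B(10, P(X > 6.25)), which treats the ten selections as independent.

```python
from math import comb
from statistics import NormalDist

weight = NormalDist(mu=6.1, sigma=0.5)  # weight (kg) of a male Persian cat
group_size = 80                         # from the working in part (c)

# (b) proportion weighing between 5.5 kg and 6.5 kg
p_between = weight.cdf(6.5) - weight.cdf(5.5)

# (c) expected number weighing less than 5.3 kg
expected_light = group_size * weight.cdf(5.3)

# (d) 12 of the 80 cats weigh more than x, so P(X < x) = 1 - 12/80 = 0.85
x = weight.inv_cdf(1 - 12 / group_size)

# (e) exactly one of ten cats weighs over 6.25 kg (binomial model assumed)
p_heavy = 1 - weight.cdf(6.25)
p_exactly_one = comb(10, 1) * p_heavy * (1 - p_heavy) ** 9

print(p_between, expected_light, x, p_exactly_one)
```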
They test every patient who comes to the centre on a particular day.
Markscheme
[1 mark]
It is intended that if a patient has the disease, they test “positive”, and if a patient does not have the disease, they test “negative”.
However, the tests are not perfect, and only 99% of people who have the disease test positive. Also, 2% of people who do not
have the disease test positive.
(b.i) a. [1]
Markscheme
95% A1
[1 mark]
(b.ii) b. [1]
Markscheme
1% A1
[1 mark]
(b.iii) c. [1]
Markscheme
2% A1
[1 mark]
(b.iv) d. [1]
Markscheme
98% A1
[1 mark]
Use the tree diagram to find the probability that a patient selected at random
(c.i) will not have the disease and will test positive. [2]
Markscheme
0.95 × 0.02 (M1)
0.019 A1
[2 marks]
Markscheme
0.05 × 0.01 + 0.95 × 0.98 (M1)(M1)
Note: Award M1 for summing two products and M1 for correct products seen.
[3 marks]
(c.iii) has the disease given that they tested negative. [3]
Markscheme
(0.05 × 0.01) / (0.05 × 0.01 + 0.95 × 0.98) A1
[3 marks]
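The tree diagram calculations in part (c) can be sketched as follows. The prevalence 0.05 is inferred from a = 95% and from the markscheme product 0.05 × 0.01; the variable names are illustrative.

```python
p_disease = 0.05            # inferred from a = 95% (patients without the disease)
p_pos_given_disease = 0.99  # 99% of people with the disease test positive
p_pos_given_healthy = 0.02  # 2% of people without the disease test positive

p_healthy = 1 - p_disease

# (c.i) no disease AND positive test
p_healthy_and_pos = p_healthy * p_pos_given_healthy

# (c.ii) negative test: sum over both branches of the tree
p_disease_and_neg = p_disease * (1 - p_pos_given_disease)
p_healthy_and_neg = p_healthy * (1 - p_pos_given_healthy)
p_negative = p_disease_and_neg + p_healthy_and_neg

# (c.iii) disease GIVEN negative test: conditional probability
p_disease_given_neg = p_disease_and_neg / p_negative

print(p_healthy_and_pos, p_negative, p_disease_given_neg)
```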
(d) The medical centre finds the actual number of positive results in their sample is different than predicted by the
tree diagram. Explain why this might be the case. [1]
Markscheme
EITHER
sample may not be representative of population A1
OR
sample is not randomly selected A1
OR
unrealistic to think expected and observed values will be exactly equal A1
[1 mark]
The staff at the medical centre looked at the care received by all visiting patients on a randomly chosen day. All the patients
received at least one of these services: they had medical tests (M), were seen by a nurse (N), or were seen by a doctor (D). It was
found that:
18 had medical tests and were seen by a doctor but were not seen by a nurse;
11 patients were seen by a nurse and had medical tests but were not seen by a doctor;
2 patients were seen by a doctor without being seen by a nurse and without having medical tests.
(e) Draw a Venn diagram to illustrate this information, placing all relevant information on the diagram. [3]
Markscheme
A1A1A1
Note: Award A1 for rectangle and 3 labelled circles and 9 in centre region; A1 for 2, 40, 24 ; A1 for 18, 1, and 11.
[3 marks]
(f) Find the total number of patients who visited the centre during this day. [2]
Markscheme
18 + 9 + 1 + 11 + 2 + 40 + 24 (M1)
105 A1
Note: Follow through from the entries on their Venn diagram in part (e). Working required for FT.
[2 marks]