Analysis of Data
April 2018
F
1 The World Health Organisation (WHO) collects data and predicts the average life
expectancy for someone born at a particular time in a particular country. The data
below are for 193 countries and refer to the year 2009.
The data have been organised into a grouped frequency table as shown below.
(a) (i) On the grid below, draw a histogram for these data.
[Grid: horizontal axis labelled L, marked from 40 to 90]
[4 marks]
(ii) Using your histogram, or otherwise, estimate how many of these countries have a life
expectancy at birth greater than 77 years.
[2 marks]
(b) (i) From the data given in the table above, estimate a value for the median life expectancy.
[3 marks]
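Note: one common way to estimate a median from a grouped frequency table is linear interpolation within the class that contains the middle value. The sketch below illustrates that idea only. The function name `grouped_median` and the class boundaries and frequencies in `example_table` are made up for illustration and are not the WHO data from the table.

```python
def grouped_median(classes):
    """Estimate the median by linear interpolation.

    classes: list of (lower, upper, frequency) tuples in ascending order.
    """
    total = sum(freq for _, _, freq in classes)
    if total == 0:
        raise ValueError("the table has no data")
    target = total / 2  # position of the median in the cumulative count
    cumulative = 0
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= target:
            # Assume values are spread evenly across the median class.
            fraction = (target - cumulative) / freq
            return lower + fraction * (upper - lower)
        cumulative += freq
    raise ValueError("median class not found")


# Hypothetical table, NOT the data from the question.
example_table = [(40, 50, 10), (50, 60, 20), (60, 70, 40), (70, 80, 30)]
example_median = grouped_median(example_table)
print(example_median)
```

Reading the median from a cumulative frequency curve is an equally valid approach and may give a slightly different estimate.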
(ii) For these data, the WHO gives the mean life expectancy for someone born in 2009 as
68 years. Explain why this differs from your median value found in part (b)(i).
[1 mark]
2 The table shows the winner and the length of the winning jump for the Olympic
long jump between 1948 and 2012
(a) (i) Calculate the mean distance jumped for the men’s long jump winners from 1948 to
2012
[1 mark]
(a) (ii) Calculate the standard deviation of the distances jumped for the men’s long jump
winners from 1948 to 2012
[2 marks]
(a) (iii) An outlier is any value more than two standard deviations from the mean. Using the data
relating to the men’s long jump winners from 1948 to 2012, identify any outliers, showing
calculations to support your reasoning.
[3 marks]
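Note: the outlier rule stated above (more than two standard deviations from the mean) can be checked as in the sketch below. It assumes the population form of the standard deviation (`statistics.pstdev`); the function name `find_outliers` and the values in `example_values` are made up for illustration and are not the long jump distances from the table.

```python
from statistics import mean, pstdev


def find_outliers(values):
    """Return the values lying more than two standard deviations from the mean."""
    centre = mean(values)
    spread = pstdev(values)  # assumption: population standard deviation
    lower_limit = centre - 2 * spread
    upper_limit = centre + 2 * spread
    return [v for v in values if v < lower_limit or v > upper_limit]


# Hypothetical values, NOT the data from the question.
example_values = [10, 11, 12, 12, 13, 13, 14, 15, 30]
example_outliers = find_outliers(example_values)
print(example_outliers)
```

If the sample form of the standard deviation is wanted instead, `statistics.stdev` could be swapped in; the limits would then be slightly wider.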
(b) In the men’s long jump, the shortest winning jump was by Ellery Clark in 1896. The
longest winning jump was in 1968 by Bob Beamon, who jumped 40.2% further than
Ellery Clark. How far did Ellery Clark jump?
[3 marks]
(c) The mean distance jumped for the women’s long jump winners from 1948 to 2012 is 6.80
metres and the standard deviation for this period is 0.411
Use these values and your answers from part (a) to compare the winning distances
jumped by men and women in the long jump competition since 1948.
[2 marks]
3 The grouped frequency table below and the histogram opposite show some of the details
for the finishing times of the top 50 runners in the 2013 London Marathon.
(a) (i) Use the information given to complete the table below.
(b) (i) Calculate an estimate for the mean finishing time of these 50 runners.
[3 marks]
[Histogram: vertical axis labelled frequency density]
(b) (ii) The mean finishing time for all the runners in the London Marathon was 4:32
Estimate how many of these 50 runners finished in less than half the mean finishing
time.
[3 marks]
4 The Great North Run is a half marathon that takes place in the North East of England
every year.
The table shows the times taken, to complete the 2010 race, by the 120 members of the
‘all-GNR’ club. These are the runners who have taken part every year since the race
began.
(a) Draw a cumulative frequency diagram on the grid on the opposite page to show the data.
You may use the spare column in the table above for any calculation required.
[4 marks]
[Grid: vertical axis, cumulative frequency; horizontal axis, time, t minutes]
(c) The times of the ‘all-GNR’ club runners were also recorded in 2005.
The data are shown as a box and whisker diagram below.
To the graph below, add another box and whisker diagram representing the data for
2010.
[3 marks]
(d) Use the box and whisker diagrams to compare the performances of the ‘all-GNR’ club
runners in 2005 and 2010.
Comment 2
5 The table shows the average life expectancy for people born in 2011 in 11 European
countries and 11 African countries.
(a) Show the data on an ordered back-to-back stem and leaf diagram.
[5 marks]
(b) Use one measure of location and one measure of spread to compare average life
expectancy in these European and African countries.
The table gives the mass of the log whilst drying in the garage.
Number of days since being cut: 0, 3, 7, 10, 14, 17, 21
(a) What was the mass of the log on the day it was cut?
[1 mark]
Answer ................................................................... g
The airline needs to conduct an in-flight survey to find out what passengers think about
the food served on the plane.
Interviews will be carried out when all seats are occupied and it is planned to conduct
80 interviews in total.
Method A
Select at random
2 passengers from first class
12 passengers from business class
66 passengers from economy class.
Method B
Choose a simple random sample of 80 different passengers from those on the
plane.
Answer
(ii) Show how the numbers 2, 12 and 66 were calculated for the sample sizes in this case.
[2 marks]
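Note: stratified sample sizes are usually found by giving each class the same share of the sample that it has of the whole population. The sketch below shows that calculation. The seat numbers in `example_seats` are hypothetical, chosen only so that the arithmetic is easy to follow; they are not taken from the question, and `stratified_sizes` is an illustrative name.

```python
def stratified_sizes(strata, sample_size):
    """Share a sample between strata in proportion to their sizes.

    strata: dict mapping stratum name to the number of members.
    """
    total = sum(strata.values())
    if total == 0:
        raise ValueError("the population is empty")
    return {
        name: round(sample_size * count / total)
        for name, count in strata.items()
    }


# Hypothetical seat numbers, NOT the figures from the question.
example_seats = {"first": 10, "business": 60, "economy": 330}
example_sizes = stratified_sizes(example_seats, 80)
print(example_sizes)
```

With other figures, rounding can leave the sizes adding up to one more or one fewer than the intended total, in which case one stratum has to be adjusted by hand.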
(b) For Method B, describe how random numbers could be used to select the sample.
[3 marks]
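Note: a simple random sample can be chosen by numbering every passenger, generating random numbers in that range, and ignoring any repeats until enough different passengers have been picked. The sketch below follows those steps. The number of passengers (400) is a hypothetical figure and `simple_random_sample` is an illustrative name; neither comes from the question.

```python
import random


def simple_random_sample(population_size, sample_size, seed=None):
    """Pick sample_size different numbers from 1 to population_size."""
    if sample_size > population_size:
        raise ValueError("the sample cannot be larger than the population")
    rng = random.Random(seed)
    chosen = []
    seen = set()
    while len(chosen) < sample_size:
        number = rng.randint(1, population_size)  # both ends included
        if number not in seen:  # ignore repeated numbers
            seen.add(number)
            chosen.append(number)
    return chosen


# Hypothetical plane with 400 numbered passengers.
example_sample = simple_random_sample(400, 80, seed=1)
print(sorted(example_sample))
```

In practice `random.sample(range(1, 401), 80)` would do the same job in one call; the loop is written out here to mirror the steps of the description.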
(i) the mean time for the 100 metres race winners;
[1 mark]
(ii) the standard deviation of the time for the 100 metres race winners.
[2 marks]
(b) An outlier is any value more than two standard deviations from the mean.
Identify any outliers, showing calculations to support your reasoning.
[3 marks]
(c) The cumulative frequency graph opposite shows the times taken by 23 male runners
in the 100 metres race in a competition organised by a local club.
(i) the interquartile range for the time taken by these runners. Give your answer to the
nearest hundredth of a second.
[3 marks]
(ii) the number of runners who ran slower than 12.95 seconds.
[1 mark]
10 (a) Give a reason why a random sampling method might not be appropriate in this case.
[1 mark]
10 (b) The manager suggests selecting the first 50 tutors from an alphabetical list of the 400.
Answer
10 (c) The assistant manager suggests selecting a sample of 50, stratified by subject.
How many Maths tutors would there be in this sample?
[3 marks]
Answer
Advantage
Disadvantage
Tick a box.
2 – 4 years
5 – 7 years
8 – 10 years
10 – 12 years
Over 12 years
Criticism 1
Criticism 2
END OF QUESTIONS