Chapter 6 Investigating Data
Which capital city in Australia has the highest average
temperature? Does Melbourne have higher rainfall than
Sydney?
To answer these questions, sets of data need to be collected
and then compared by looking at the shape of their displays
or by analysing their measures of location and spread.
New Century Maths Advanced for the Australian Curriculum 10+10A
ISBN 9780170194662
Photo: Shutterstock.com/Gordon Bell
Chapter 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Investigating data
SkillCheck
Worksheet
StartUp assignment 5
MAT10SPWK10032
Skillsheet
Statistical measures
13
3C
18
5C
14
2C
15
4C
18
7C
d
23
3C
20
iv the mode
16
15
12
10
MAT10SPSS10012
Worksheet: Statistical match-up (MAT10SPWK10033)
[Frequency histogram: Frequency (0 to 8) against Score (9 to 15)]
[Dot plot: Score, from 41 to 47]
[Stem-and-leaf plot: stems 0 to 2]
Score      0  1  2  3  4  5
Frequency  2  5  8  4  3  1
30
26
19
41 36
16
32
a Find:
i the median
ii the mean
iii the range.
Photo: 123rf/Lance Bellers
Technology worksheet: Excel worksheet: Skewness (MAT10SPCT00005)
Technology worksheet: Excel spreadsheet: Skewness (MAT10SPCT00035)
A statistical distribution is the way the scores of a data set are arranged, especially when graphed.
When looking at histograms, dot plots and stem-and-leaf plots, an overall pattern can be seen from the shape of the display.
The shape of a statistical distribution shows how the data is spread and can be seen by drawing a curve around the graph or display.
A distribution is symmetrical if the data is evenly spread or balanced about the centre.
[Figures: a stem-and-leaf plot (stems 3 to 9) and a dot plot of Temperatures in April (15 to 24), each with a curve drawn around it to show the shape of the distribution.]
A distribution is skewed if most of the data is bunched or clustered at one end of the distribution
and the other end has a tail.
[Figures: two skewed distributions with the tail labelled in each, including a stem-and-leaf plot with stems 0 to 7.]
A distribution is bimodal if it has two peaks. The higher peak is the mode, while the other peak
indicates another score that has a high frequency.
For example, this frequency histogram has two peaks at 2 and 7 so it is bimodal. The mode,
however, is 7.
Example
a [Dot plot: Score]
b [Stem-and-leaf plot: stems 10 to 16]
Solution
a i The shape is positively skewed (the tail points towards the higher scores).
ii 15 is an outlier and clustering occurs at 4 and 5.
b i The shape is symmetrical (the data is balanced about the stem of 13).
ii There are no outliers but clustering occurs in the 13s.
Exercise 6-01
See Example 1
[Figures for question 1, parts a to h: frequency histograms, dot plots and stem-and-leaf plots, with axes including Score, Number of goals scored, Temperature (°C) and Marks obtained in a Maths quiz.]
These are the final round scores for players in a golf tournament.
66 70 67 72 75 72 70 74 75 72 74 72 73 71 71 69 70
72 69 75 73 69 75 73 69 69 67 74 72 72 73 71 73 77
a Arrange the data into a frequency table and construct a frequency histogram.
b Are there any outliers?
71
68
71
72
74
72
The stem-and-leaf plot shows the number of hours that students spend on their computers
during the week.
[Stem-and-leaf plot: stems 0 to 4]
[Stem-and-leaf plot: stems 13 to 24]
Lowest score (or lower extreme)
First quartile (Q1 or QL)
Second quartile (Q2 or median)
Third quartile (Q3 or QU)
Highest score (or upper extreme)
The first quartile Q1, also called the lower quartile QL, is the value that divides the lower 25% of scores from the upper 75% of scores. 1/4 of the scores lie below Q1.
The second quartile Q2 is the value that divides the lower 50% of scores from the upper 50% of scores, so it is also the median. 1/2 of the scores lie below Q2.
The third quartile Q3, also called the upper quartile QU, is the value that divides the lower 75% of scores from the upper 25% of scores. 3/4 of the scores lie below Q3 and 1/4 of the scores lie above it.
Summary
Finding the quartiles of a data set
sort the scores in order, find the median and call it Q2
find the median of the bottom half of the scores and call it Q1 (or QL)
find the median of the top half of scores and call it Q3 (or QU).
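The three steps above can be sketched in Python. This is only a minimal sketch, assuming the "median of each half" method from the summary (when the number of scores is odd, the middle score is left out of both halves). The function name `quartiles` is made up for illustration, and calculators or spreadsheets may use a slightly different quartile rule. The data is a set of 20 scores used in this section.

```python
from statistics import median

def quartiles(scores):
    """Return (Q1, Q2, Q3) using the median-of-each-half method."""
    data = sorted(scores)             # step 1: sort the scores in order
    n = len(data)
    lower = data[: n // 2]            # bottom half (middle score excluded if n is odd)
    upper = data[(n + 1) // 2 :]      # top half
    return median(lower), median(data), median(upper)

scores = [48, 54, 63, 65, 68, 70, 70, 72, 75, 76,
          79, 79, 80, 82, 82, 84, 85, 93, 96, 97]
q1, q2, q3 = quartiles(scores)
print(q1, q2, q3)  # 69.0 77.5 83.0
```

Because each half has an even number of scores here, each quartile is the average of two middle scores.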
Example
Solution
a Arranging the 20 scores in ascending order, we have:
48 54 63 65 68 70 70 72 75 76 79 79 80 82 82 84 85 93 96 97
Q2 (median) = (76 + 79)/2 = 77.5
Q1 = (68 + 70)/2 = 69
Q3 = (82 + 84)/2 = 83
When finding the quartiles, first find the median, then the lower and upper quartiles.
Q1 (lower quartile) = 69; Q2 (median) = 77.5; Q3 (upper quartile) = 83
b Arranging the 11 scores in ascending order, we have:
2
Median
Q2 = 7
Lower quartile
Q1 = 4
10
Upper quartile
Q3= 9
11
13
Lower quartile
11 + 13
Q1 =
2
= 12
14
15
15
16
Median
Q2 = 15
16
18
19
20
23
Upper quartile
18 + 19
Q3 =
2
= 18.5
Worksheet
Interquartile range
MAT10SPWK10034
Video tutorial
Interquartile range
MAT10SPVT10003
Summary
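Interquartile range (IQR) = upper quartile − lower quartile
= Q3 − Q1
[Diagram: the interquartile range covers the middle 50% of scores, from the lower quartile Q1 to the upper quartile Q3, with the median Q2 between them; 25% of scores lie below Q1 and 25% lie above Q3.]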
The interquartile range ignores very low or very high scores (outliers), so sometimes it is better
than the range as a measure of spread.
Example
Solution
First arrange the scores in order:
6 12 17 19 21 22 23 25 26 28 28 29 30 31 72
Lower quartile Q1 = 19; median Q2 = 25; upper quartile Q3 = 29
a Range = 72 − 6 = 66
b Interquartile range = Q3 − Q1 = 29 − 19 = 10
c The interquartile range is the better measure of spread as the outlier of 72 is excluded.
The score of 72 has affected the range, making it very big.
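The effect of the outlier can be checked with a short Python sketch. It assumes the same "median of each half" quartile method used in this chapter; the function name `spread` is hypothetical. The sketch calculates the range and interquartile range of the 15 scores with and without the score of 72.

```python
from statistics import median

def spread(scores):
    """Return (range, interquartile range) for a list of scores."""
    data = sorted(scores)
    n = len(data)
    q1 = median(data[: n // 2])         # median of the bottom half
    q3 = median(data[(n + 1) // 2 :])   # median of the top half
    return data[-1] - data[0], q3 - q1

scores = [6, 12, 17, 19, 21, 22, 23, 25, 26, 28, 28, 29, 30, 31, 72]
with_outlier = spread(scores)
without_outlier = spread([x for x in scores if x != 72])
print(with_outlier)     # (66, 10)
print(without_outlier)  # (25, 9)
```

Removing 72 changes the range from 66 to 25, but the interquartile range only changes from 10 to 9.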
Example
Stem | Leaf
4 | 0 1 3
5 | 2 5 6 9
6 | 2 8
7 | 0 3 3 4 7 9
8 | 3 4 5 6 8
9 | 0 3 4 5
Solution
a There are 14 scores, so the median is between the 7th and 8th scores.
Median, Q2 = (4 + 4)/2 = 4
Q1 is the median of the lower half of scores.
Q1 = 2
Q3 is the median of the upper half of scores.
Q3 = 4
∴ IQR = Q3 − Q1 = 4 − 2 = 2
b There are 24 scores, so the median is between the 12th and 13th scores.
Median, Q2 = (73 + 74)/2 = 73.5
Lower quartile, Q1 = (56 + 59)/2 = 57.5
Upper quartile, Q3 = (85 + 86)/2 = 85.5
∴ IQR = 85.5 − 57.5 = 28
Exercise 6-02
1
See Example 2
9
7
28 20 23
35 30 34
25
35
38
37 38
31
Calculate the range and the interquartile range of each data set in question 1.
5
2
6
0
6
3
7
5
8
2
9
1
9
0
10
6
14
4
14
3
15
8
16
4
30
34
See Example 3
15
38
123
58
Getty Images/Peter Harrison
See Example 4
10
11
9
3
9
52
53
10 11 12 13 14 15 16 17
c Stem
3
4
5
6
7
Leaf
2 7
0 3
2 4
3 4
2
e Stem
10
11
12
13
14
Leaf
35 5 6 6
01 2
34 6 7 8
47
1
3
5
7
5
6
d Stem
1
2
3
4
5
Leaf
3 5
0 1
5 8
1 3
4
8
3
9
48
49
91
50
51
75 72
68
The number of goals per game scored by the Sydney Swifts netball team during 2013 were:
55
35
49
53
51
55
42
48
63
43
48
48
62
a Find:
i the range
Alamy/Zev Radovan
In prehistoric times, when the number of people and animals was recorded in pictures and
symbols on the walls of caves, a simple form of statistics was being used.
Before 3000 BCE, ancient Babylonians used clay tablets to record crop yields and trade data,
and around 2650 BCE the Egyptians surveyed the population and wealth of their country
before building the pyramids. Forms of statistics were also used in the Bible in the Book of
Numbers and the First Book of Chronicles. Numerical records existed in China before
2000 BCE, and the Greeks (to help collect taxes) held a census in 594 BCE. The Roman Empire
was the first government to collect information about the population. In 1086 a census was
conducted in England. The information obtained in this census was recorded in the
Domesday Book.
Use your library or the Internet to find out more about the Domesday Book. Write a one-page report suitable for a classroom presentation.
Stage 5.3
Worksheet
Statistical calculations
MAT10SPWK10209
Summary
The standard deviation is a measure of the spread of a set of scores.
The symbol for standard deviation is σ or σn (σ is the lower-case Greek letter sigma).
Its value is an average of how different each score is from the mean.
Standard deviation has a complex formula so it is best calculated using the calculator's statistics mode. It is a better measure of spread than the range and interquartile range because its value depends on every score in the data set.
Stage 5.3
Example
Calculate, correct to two decimal places, the standard deviation of each set of data.
a The daily maximum temperature (in °C) in Campbelltown for two weeks in January.
45.0 24.5 24.8 29.1 35.0 26.9 31.8
33.8 32.9 23.6 22.1 29.2 27.1 32.7
b
x  2  3  4  5  6  7  8  9  10
f  2  1  3  3  2  5  6  4  2
Solution
Follow the instructions for the statistics mode (SD or STAT) of your calculator as shown in
the tables below.
a
Operation | Casio scientific | Sharp scientific
Start statistics mode. | MODE STAT 1-VAR | MODE STAT
Clear the statistical memory. | SHIFT 1 Edit, Del-A | 2ndF DEL
Enter data. | 45.0 = 24.5 = , etc. to enter in column; AC to leave table | 45.0 M+ 24.5 M+ , etc.
Calculate the standard deviation (σx = 5.75). | SHIFT 1 Var | RCL σx
Return to normal (COMP) mode. | MODE COMP | MODE 0
σ ≈ 5.75
b
Operation | Casio scientific | Sharp scientific
Start statistics mode. | MODE STAT 1-VAR | MODE STAT
Clear the statistical memory. | SHIFT 1 Edit, Del-A | 2ndF DEL
Enter data. | 2 = 3 = , etc. to enter in x column; 2 = 1 = , etc. to enter in FREQ column; AC to leave table | Enter each score with its frequency (2 and 2, 3 and 1, etc.) using the 2ndF, STO and M+ keys
Calculate the standard deviation (σx = 2.26). | SHIFT 1 Var | RCL σx
Return to normal (COMP) mode. | MODE COMP | MODE 0
σ ≈ 2.26
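The calculator result for the frequency table in part b can be checked with a short Python sketch. It assumes the population standard deviation (dividing by the number of scores n), which is the value the chapter calls σ; the function name `sd_from_table` is hypothetical.

```python
from math import sqrt

def sd_from_table(scores, freqs):
    """Population standard deviation of data given as a frequency table."""
    n = sum(freqs)                                          # total number of scores
    mean = sum(x * f for x, f in zip(scores, freqs)) / n    # mean of all scores
    # Average of the squared differences from the mean, weighted by frequency
    variance = sum(f * (x - mean) ** 2 for x, f in zip(scores, freqs)) / n
    return sqrt(variance)

x = [2, 3, 4, 5, 6, 7, 8, 9, 10]
f = [2, 1, 3, 3, 2, 5, 6, 4, 2]
print(round(sd_from_table(x, f), 2))  # 2.26
```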
Exercise 6-03
Standard deviation
Note: In this exercise, express all means and standard deviations correct to two decimal places.
1 Calculate the standard deviation of each set of data.
a 5 4 7 8 2 9 10
b 20 23 28 24 19 25 26 24 23
c
x  10  11  12  13  14  15
f   2   5   9   8   3   1
d [Frequency histogram: Frequency (0 to 8) against Score]
See Example 5
[Dot plot: Number of DVDs watched/week, from 3 to 9]
An English class of Year 10 students scored the following marks for their speeches.
12 15 15 14 14 13 16 13 16 18 12 10 11 12 18 12 7 14 10 13
c What effect does removing the outlier have on the standard deviation?
For the three statistical distributions A, B and C shown, which one has:
a the greatest standard deviation?
b the smallest standard deviation?
[Frequency histograms A, B and C: Frequency (0 to 8) against Score]
[Dot plot: Marks, from 6 to 10]
[Stem-and-leaf plot: stems 2 to 8]
ii decreases?
The training times (in seconds) of a sprinter over 100 m are as follows.
11.2 11.0 10.9 12.3 11.8 11.1 11.4 11.6 11.0
59.8 58.4 56.7 60.0 55.8 57.4 58.0
An error was made in recording these times and 2 s needs to be added to each of these times.
Which of the following is true? Select the correct answer A, B, C or D.
A the standard deviation will increase and the mean will stay the same
B the standard deviation will decrease and the mean will increase
C the standard deviation will stay the same and the mean will increase
D the standard deviation and the mean are unchanged
The formula for the standard deviation of a set of scores is $\sigma = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n}}$, where $x$ is each score, $\bar{x}$ is the mean and $n$ is the number of scores.
The steps for calculating standard deviation are as follows.
Calculate the mean $\bar{x}$.
For every score in the data set, find the difference between the score and the mean, then square this difference: $(x - \bar{x})^2$.
Calculate the average of these squared deviations by adding them and dividing their sum by the number of scores: $\dfrac{\sum (x - \bar{x})^2}{n}$.
Calculate the square root of this average: $\sqrt{\dfrac{\sum (x - \bar{x})^2}{n}}$.
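The four steps above can be followed line by line in a short Python sketch. It is an illustration only: the function name `standard_deviation` is hypothetical, and the data is the set of Campbelltown temperatures from earlier in this chapter, for which the calculator gave a standard deviation of 5.75. Python's built-in `statistics.pstdev` (population standard deviation) is used as a cross-check.

```python
from math import sqrt
from statistics import pstdev

def standard_deviation(scores):
    """Population standard deviation, following the four steps in the text."""
    n = len(scores)
    mean = sum(scores) / n                              # step 1: the mean
    squared_devs = [(x - mean) ** 2 for x in scores]    # step 2: squared differences
    average = sum(squared_devs) / n                     # step 3: average of the squares
    return sqrt(average)                                # step 4: square root

temps = [45.0, 24.5, 24.8, 29.1, 35.0, 26.9, 31.8,
         33.8, 32.9, 23.6, 22.1, 29.2, 27.1, 32.7]
sd = standard_deviation(temps)
print(round(sd, 2))             # 5.75
print(round(pstdev(temps), 2))  # 5.75
```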
We will now use this method to calculate the standard deviation of this set of scores.
4 5 6 7 2 8 6 5 2
1 Calculate the mean of these scores.
2 Copy and complete the table below by finding, for each score, its difference from the mean and the square of this difference.
Score, $x$ | 4 | 5 | 6 | 7 | 2 | 8 | 6 | 5 | 2
$x - \bar{x}$ | −1 | 0 | | | | | | |
$(x - \bar{x})^2$ | | | | | | | | |
3 Find the mean of the squared deviations calculated in the bottom row of the table.
4 The standard deviation is the square root of this mean. Calculate the standard deviation
correct to two decimal places.
5 Check your answer by calculating the standard deviation using your calculator's statistics mode and comparing both answers.
6 Use the standard deviation formula to calculate the standard deviation of each set of scores.
a 5 4 7 8 2 9 10
b 20 23 28 24 19 25 26 24 23
Check your results by using your calculator.
7 The standard deviation is never negative. Explain why.
8 If the scores of a set of data are all the same, what is the standard deviation? Explain.
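[Diagram: a bell-shaped normal curve of Frequency, centred on $\bar{x}$ (the mean), with 68% of the scores lying between $\bar{x} - \sigma$ and $\bar{x} + \sigma$; the horizontal axis is marked from $\bar{x} - 3\sigma$ to $\bar{x} + 3\sigma$.]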
Measure and analyse the heights of the students at your school. Do the data follow a normal curve?
Example
The heights (in cm) of the girls and boys in a Year 10 PE class at Baramvale High were
measured.
Girls: 163 155 171 162 165 158 172 166 163 150 160 181 160 156
Boys: 174 167 164 175 189 145 165 166 165 168 167 171 169 172 168
a Calculate, correct to two decimal places, the mean and standard deviation for:
i the girls
ii the boys
Solution
Stage 5.3
Example
The ages of the children using a jumping castle and visiting a petting zoo are shown.
Jumping castle: 3 3 4 5 5 6 8 10 18
Petting zoo: 3 4 5 6 6 7 8 8 10
Solution
a For the jumping castle:
i Range = 18 − 3 = 15
ii IQR = 9 − 3.5 = 5.5
iii σn ≈ 4.48
For the petting zoo:
i Range = 10 − 3 = 7
ii IQR = 8 − 4.5 = 3.5
iii σn ≈ 2.05
b The jumping castle data has an outlier, 18, that affects the range and standard deviation. The interquartile range is the best measure for this data set.
The petting zoo data does not have an outlier, so the standard deviation is the best measure for this data set.
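The three measures of spread in this solution can be reproduced with a short Python sketch. It assumes the chapter's "median of each half" method for quartiles and the population standard deviation (`statistics.pstdev`); the function name `measures_of_spread` is hypothetical. The data sets are the jumping castle ages (3, 3, 4, 5, 5, 6, 8, 10, 18) and the petting zoo ages (3, 4, 5, 6, 6, 7, 8, 8, 10).

```python
from statistics import median, pstdev

def measures_of_spread(scores):
    """Return the range, interquartile range and standard deviation."""
    data = sorted(scores)
    n = len(data)
    q1 = median(data[: n // 2])         # median of the bottom half
    q3 = median(data[(n + 1) // 2 :])   # median of the top half
    return {
        "range": data[-1] - data[0],
        "iqr": q3 - q1,
        "sd": round(pstdev(data), 2),   # population standard deviation
    }

jumping_castle = [3, 3, 4, 5, 5, 6, 8, 10, 18]
petting_zoo = [3, 4, 5, 6, 6, 7, 8, 8, 10]
print(measures_of_spread(jumping_castle))  # {'range': 15, 'iqr': 5.5, 'sd': 4.48}
print(measures_of_spread(petting_zoo))     # {'range': 7, 'iqr': 3.5, 'sd': 2.05}
```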
See Example 6
Exercise 6-04
Note: In this exercise, express all means and standard deviations correct to two decimal places.
1 The pulse rates (in beats/minute) of a sample of men and women were taken at a suburban shopping centre.
Men: 68 72 75 73 81 77 69 68 79 83 65 59 60 72 70
Women: 82 61 79 77 75 68 86 81 72 77 78 81 90 83 73
a Find the mean and standard deviation of each group.
b Is there a significant difference between the mean and standard deviation for men and
women? Give reasons.
2
The reaction times (in seconds) for the dominant and non-dominant hands of a group of
athletes were measured.
Dominant hand:
0.41
0.29
0.35
0.42
0.42
0.43
0.39
0.61
0.34
0.75
0.34
0.38
0.47
0.34
0.32
0.29
Non-dominant hand: 0.46
0.34
0.38
0.39
0.39
0.39
0.51
0.50
0.40
2.60
0.34
0.39
0.51
0.35
0.37
0.31
0.38
0.30
0.47
0.32
a Find the mean and standard deviation for each data set.
b Is there a significant difference between the results? Explain your answer.
c i What are the outliers for the reaction time of the dominant hand?
ii Find the mean and standard deviation without the outliers.
iii What effect does removing the outliers have on the mean and standard deviation?
d Find the mean and standard deviation of the reaction time for the non-dominant hand
without the outlier.
e On which group has the removal of outliers had the greater effect on the mean and
standard deviation? Justify your answer.
3
[Back-to-back stem-and-leaf plot: Western Tigers and Barrington City, stems 7 to 15]
Vatha's and Ana's times for running 100 m time trials are given below.
Vatha: 13.0 13.5 14.2 13.7 13.2 14.7 13.5 14.3
Ana: 14.2 13.2 15.1 13.8 14.2 15.2 13.9 13.5
a Find the mean and standard deviation for each runner.
b Which runner is more consistent? Give reasons.
The dot plots show the test results of a class before and after using a tutorial website.
[Dot plots: Marks (up to 10) before and after using the tutorial website]
The marks obtained by students in Maths and Science exams are given below.
Maths:
40 72 76 74 60 64 64 59 74 84 62 84 66
71 68 78 63 57 55 73 80 67 86 57 87 62
Science: 42 54 61 72 76 54 65 80 39 74 82 54 57
64 75 68 76 81 40 37 43 58 68 67 49 54
See Example 7
64
52
63
62
The points scored per match by the Roosters and the Dragons during an NRL season were:
Roosters: 10 16 8 50 22 38 34 30 16 12 18 38 12 20 18 36 40 28 42 28 56 22 22 24
Dragons: 10 6 17 25 19 13 10 18 14 32 0 14 14 16 10 0 22 18 20 26 18 18 22 19
a For each team, find:
i the range
b By comparing the means and the measures of spread, decide which was the better team.
Mental skills 6
a 32 × 5        b 14 × 5        c 48 × 5        d 18 × 50
e 52 × 50       f 36 × 25       g 28 × 5        h 12 × 25
i 12 × 15       j 22 × 35       k 90 ÷ 5        l 170 ÷ 5
m 230 ÷ 5       n 1300 ÷ 50     o 900 ÷ 50      p 300 ÷ 25
q 1000 ÷ 25     r 360 ÷ 45      s 210 ÷ 15      t 360 ÷ 15
6-05 Boxplots
Video tutorial: Box-and-whisker plots (MAT10SPVT10004)
Video tutorial: Statistics (MAT10SPVT00002)
Worksheet: Five number summaries (MAT10SPWK10035)
Puzzle sheet: Mode, median and mean (MAT10SPPS00044)
Technology: GeoGebra: Boxplot and dot plot (MAT10SPTC00002)
Technology worksheet: Excel worksheet: Five number summary (MAT10SPCT00002)
Technology worksheet: Excel spreadsheet: Five number summary (MAT10SPCT00032)
A boxplot (or box-and-whisker plot) displays the quartiles of a set of data and the lowest and highest scores (lower and upper extremes).
[Diagram: a boxplot with a whisker from the lowest score (lower extreme) to the lower quartile Q1, a box from Q1 to the upper quartile Q3 with the median Q2 marked inside it, and a whisker from Q3 to the highest score (upper extreme); the length of the box is the interquartile range.]
The box represents the middle 50% of scores and the interquartile range, while the whiskers represent the lowest and highest 25% of scores.
[Diagram: bottom 25% (left whisker), middle 50% (box), top 25% (right whisker)]
Summary
Example
The number of hours per week that Nick worked at the Big Chicken over summer were:
5
10
12
15
Solution
a First arrange the scores in order.
Lower extreme = 3
Lower quartile = (4 + 5)/2 = 4.5
Median = 7
Upper quartile = (8 + 10)/2 = 9
Upper extreme = 15
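The five values listed in a solution like this (lower extreme, lower quartile, median, upper quartile, upper extreme) can be found with a short Python sketch. It assumes the chapter's "median of each half" method for quartiles; the function name `five_number_summary` is hypothetical, and the data is the set of 20 scores from the quartiles example earlier in this chapter.

```python
from statistics import median

def five_number_summary(scores):
    """Return (lowest score, Q1, median, Q3, highest score)."""
    data = sorted(scores)
    n = len(data)
    q1 = median(data[: n // 2])         # median of the bottom half
    q3 = median(data[(n + 1) // 2 :])   # median of the top half
    return data[0], q1, median(data), q3, data[-1]

scores = [48, 54, 63, 65, 68, 70, 70, 72, 75, 76,
          79, 79, 80, 82, 82, 84, 85, 93, 96, 97]
summary = five_number_summary(scores)
print(summary)  # (48, 69.0, 77.5, 83.0, 97)
```

These five values are exactly the points needed to draw the boxplot: the ends of the whiskers, the ends of the box and the line inside the box.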
[Boxplot: Hours worked, on a scale from 0 to 18, with the lower extreme, Q1, median, Q3 and upper extreme marked]
Example
[Boxplot: Science test marks, on a scale from 10 to 100]
ii 40 and 60?
Solution
a Range = highest score − lowest score
= 95 − 25
= 70
b Median = 60
c Interquartile range = Q3 − Q1
= 75 − 40
= 35
d i 25 is the lowest score and 75 is Q3, so 75% × 80 = 60 students had a mark between 25 and 75.
ii 40 is Q1 and 60 is the median, so 25% × 80 = 20 students had a mark between 40 and 60.
e 75 is the third quartile so 25% × 80 = 20 students scored more than 75.
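The reasoning in parts d and e relies on each section of a boxplot holding 25% of the scores. A minimal Python sketch of that idea follows; the names `FIVE_NUMBER_LABELS` and `expected_count` are hypothetical, and the result is the expected number of scores rather than an exact count read from the data.

```python
# The five points marked on a boxplot, in order from left to right
FIVE_NUMBER_LABELS = ["lowest", "Q1", "median", "Q3", "highest"]

def expected_count(total, lower_label, upper_label):
    """Expected number of scores between two marked points on a boxplot."""
    # Each gap between neighbouring points holds one quarter (25%) of the scores
    quarters = (FIVE_NUMBER_LABELS.index(upper_label)
                - FIVE_NUMBER_LABELS.index(lower_label))
    return total * quarters / 4

print(expected_count(80, "lowest", "Q3"))   # 60.0
print(expected_count(80, "Q1", "median"))   # 20.0
print(expected_count(80, "Q3", "highest"))  # 20.0
```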
Exercise 6-05
Boxplots
1
The number of orders taken per hour at Bramavale Pizza on a weekend were:
3 5 1 2 4 6 8 10 7 6 12 15 10 3 5 18 5 8 9 10
See Example 8
The daily amount of snow (in cm) that fell at Thredbo during one ski season was:
2 20 5 12 5 5 2 40 5 50 7 10 1 40 2 13 2 30 2 5 2 35 2 2 12 6
266 149
94
15
65
19
24
34
67
28
This boxplot represents the number of hours worked in one week by the staff at a
supermarket.
[Boxplot: Hours worked, on a scale from 20 to 32]
See Example 9
The ages of 16 people waiting at a bus stop are displayed by the boxplot below.
[Boxplot: ages, on a scale from 15 to 40]
ii 15 to 40?
The box-and-whisker plot shows the number of points per game scored by Ben in 28
basketball games during the season.
[Boxplot: Points scored per game, on a scale from 10 to 30]
For each set of data, find the five-number summary and draw a boxplot.
a
Stem | Leaf
2 | 0 2 3 5
3 | 3 7
4 | 4 6 7 8 8 9 9
5 | 0 1 1 5 6
6 | 0 3 3 8 8
7 | 2 5 6
8 | 5 5 7 8
b [Dot plot: Score, from 10 to 20]
c Stem
3
4
5
6
7
8
9
Leaf
0 7
2 6
1 2
0 4
2 3
3 4
5
6
5 9
7 7 9
5 6 8
The results of a general knowledge quiz (out of 15) taken by Year 10 students are displayed by
the dot plot.
[Dot plot: Marks, with the scale showing 9 to 15]
a Find the five-number summary for the dot plot and then draw a box-and-whisker plot.
b Describe the shape of the dot plot and compare it to the shape of the boxplot.
c What is the outlier?
d Find the five-number summary for the data in the dot plot without the outlier and draw
a boxplot.
e Compare the two boxplots. How are they:
i similar?
ii different?
Technology Boxplots
In this activity we will use GeoGebra to draw boxplots.
1 Close the Algebra window so that only the graphics
window is showing.
4 To move the screen view, hold down the Ctrl key on your keyboard and use your mouse to
drag the screen across. Your boxplot should look exactly like the one below.
Worksheet
Box-and-whisker plots
MAT10SPWK10036
8 In the input panel, enter the following formula for the results for 10B.
BoxPlot[10, 2, {77, 63, 63, 35, 51, 42, 54, 55, 71, 43, 41, 41, 40, 76, 72}]
Note: 10 means the box-and-whisker plot for 10B will be above the one for 10A (i.e. not
drawn on top of each other). You will now have two boxplots to compare.
Worksheet
Data 1
MAT10SPWK00032
Animated example
Analysing data
MAT10SPAE00002
Technology worksheet
Excel worksheet:
Parallel box plots
MAT10SPCT00004
Technology worksheet
Excel spreadsheet:
Parallel box plots
MAT10SPCT00034
Example
10
Two sprinters run the following times (in seconds) over 100 metres.
Sam: 10.5 11.0 9.9 10.7 10.5 10.0 11.2 11.5 10.3 10.9
Jesse: 11.4 10.1 9.8 10.8 11.4 10.7 10.3 11.1 11.6 11.0
Photo: Alamy/moodboard
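Solution
a Sam: 9.9 10.0 10.3 10.5 10.5 10.7 10.9 11.0 11.2 11.5
Lowest score = 9.9, Q1 = 10.3, Q2 = (10.5 + 10.7)/2 = 10.6, Q3 = 11.0, highest score = 11.5
Jesse: 9.8 10.1 10.3 10.7 10.8 11.0 11.1 11.4 11.4 11.6
Lowest score = 9.8, Q1 = 10.3, Q2 = (10.8 + 11.0)/2 = 10.9, Q3 = 11.4, highest score = 11.6
[Parallel boxplots: Sam and Jesse, Time (seconds), on a scale from 9.5 to 12.0]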
Exercise 6-06
Parallel boxplots
1 The parallel boxplot shows the amount of sleep that Year 8 and Year 10 students usually
get on a school night.
[Parallel boxplots: Year 10 and Year 8, Time (hours), on a scale from 5 to 14]
ii the median
b What percentage of students usually had at most 8 hours of sleep on a school night in:
i Year 8?
ii Year 10?
c 40 students in both Year 8 and Year 10 were surveyed. How many students usually had at
least 10 hours of sleep in:
i Year 8?
ii Year 10?
2 The number of points scored by the Adelaide Thunderbirds and the Sydney Swifts during the
2013 netball season are shown in the parallel box-and-whisker plot.
Thunderbirds
Swifts
45.5
39
35
45.5
30
40
50
61
49
72
63
55
50
60
Points scored
70
80
AAP/Jenny Evans
10K
10N
0 1
5 6
Marks
9 10
5 The monthly mean maximum temperatures for four Australian capital cities are shown in the
boxplots below.
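[Parallel boxplots: Monthly mean maximum temperature (°C), on a scale from 12 to 30. The five values marked on each boxplot are listed below.]
City | Lowest | Q1 | Median | Q3 | Highest
Brisbane | 21.1 | 23.7 | 26.9 | 28.4 | 30.4
Sydney | 17.6 | 20.4 | 23.5 | 25.3 | 26.1
Melbourne | 14.4 | 16.1 | 21.4 | 24.7 | 27.4
Hobart | 12.5 | 14.6 | 18.6 | 21.6 | 23.7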
a Find the median, range and interquartile range for each city.
b Which capital city had the most spread in temperature?
c Which capital city had the highest mean monthly temperatures? Justify your answer.
d Which city is warmer, Sydney or Melbourne? Give reasons.
e Which city was more consistent, Sydney or Melbourne? Give reasons.
6 The number of text messages received by a group of students in one hour are as follows.
Male:
Female:
2
4
0
5
3
6
0
3
1
7
2
5
5
8
6
7
2
4
1
2
3
4
2
5
3
10
7
4
See Example 10
4
3
167
163
171
150
169
186
a Find the five-number summary for each group and draw a parallel boxplot to display
the data.
b Find the range and interquartile range for each group.
c How does the spread of heights of male students compare with the spread of heights of
female students?
8 Students at a university were asked whether their frequency of exercise was high or low and
then had their pulse taken. The results are as follows.
Low: 90 78 80 84 70 66 92 80 80 77 64 88
High: 96 71 68 56 64 60 50 76 78 49 68 74
a Find a five-number summary for each group and then draw parallel boxplots to show the
information.
b Find the range and interquartile range for each group.
c Compare the spread between the two groups. Are there significant differences between them?
d Which group had the lower pulse rates?
9 The average monthly temperatures for Sydney and Brisbane in 2012 are as follows.
Sydney: 26.1 25.8 24.7 23.6 20.9 17.7 17.6 19.9 22.5 23.3 24.1 26.0
Brisbane: 28.7 29.8 28.2 26.5 24.0 21.1 21.4 23.3 25.5 27.3 28.2 30.4
a Find the five-number summary for each city and draw a parallel boxplot.
b Find the range and interquartile range for each city.
c Which city had more consistent average monthly temperatures? Give reasons.
10 These box-and-whisker plots show the numbers of points scored by two basketball players
during the season.
[Parallel boxplots: Simone and Amal, Points scored, on a scale from 4 to 16]
a Which player has the highest point score for a single game?
b What is the range of the points scored by each player?
c By just looking at the range, which player would seem to be more consistent? Justify your answer.
d Find the median score of each player.
e Find the interquartile range for each player.
f Which player is more consistent?
g Estimate the percentage of games in which Simone scored 9 or 10 points.
Worksheet
Comparing city
temperatures
MAT10SPWK10037
Example
11
The back-to-back stem-and-leaf plot shows the results in Year 10 Maths and Science tests.
[Back-to-back stem-and-leaf plot: Maths (leaves on the left) and Science (leaves on the right), stems 3 to 9]
a Find the mean mark (correct to one decimal place) for each subject.
b Find the median for each subject.
c Find the range and interquartile range for each subject.
d For each subject:
i describe the shape
e In which subject have the students performed better? Justify your answer.
Solution
a Mean for Maths = 1919/30 ≈ 64.0
Mean for Science = 2151/30 = 71.7
d i The results for Maths are symmetrical, while the results for Science are negatively skewed.
ii There is some clustering for the Maths results in the 60s and in Science the clustering
occurs in the 70s and 80s.
e The students have performed better in Science as the mean and median for it are greater
than the mean and median for Maths. The range for Maths is greater than the range for
Science, but the interquartile range is less than that of Science.
Example
12
The number of text messages received by a group of teenagers are displayed in the frequency
histogram and the boxplot below.
[Frequency histogram: Frequency (0 to 10) against Number of text messages/hour (0 to 10)]
[Boxplot: Number of text messages/hour, on a scale up to 10]
a How many teenagers received more than 6 text messages per hour?
b Find:
i the mode
ii the median
iii the range
iv the interquartile range.
c The shape of the distribution is positively skewed. How is this shown by:
i the frequency histogram
ii the boxplot?
d According to the boxplot, what percentage of teenagers received 2 or more text messages?
e What information is better seen on:
i the frequency histogram
ii the boxplot?
Solution
a Number of teenagers receiving more than 6 text messages = 3 + 2 + 1 + 1 = 7
b i Mode = 3
ii Median = 4
iii Range = 10 − 0 = 10
iv Interquartile range = 6 − 2 = 4
c i The tail of the frequency histogram leans towards the higher scores.
ii The length of the boxplot to the right of the median (Q2) is greater than its length to the left of the median.
d Q1 = 2, so 75% of teenagers received 2 or more text messages/hour.
e i The mode and information regarding the number of text messages received by teenagers can be determined from the frequency histogram.
ii The median, quartiles and interquartile range are easily determined from the boxplot.
Exercise 6-07
See Example 11
[Back-to-back stem-and-leaf plot: Boys (leaves on the left) and Girls (leaves on the right), stems 0 to 7]
e Who generally carries more cash, boys or girls? Justify your answer.
Frequency
2 The back-to-back histogram shows the number of goals scored by two football teams
during a season.
[Back-to-back histogram: Frequency (0 to 7) of Goals scored for Scorpions and Vale United]
ii Vale United?
[Displays for Sydney and Perth: Temperature (°C), on a scale from 20 to 42]
a Find the mean, median and modal temperatures for each city.
b Find the range and interquartile range of temperatures for each city.
c Describe the distribution shape of the temperatures for each city and identify any outliers
and clusters.
d Compare the temperatures in Sydney and Perth. Comment on measures of location (the
mean, median and mode), and measures of spread (range and interquartile range).
4 The results for two quizzes taken by a Year 10 History class are shown below.
[Back-to-back histogram: Score (1 to 10) against Frequency (0 to 10) for Quiz 1 and Quiz 2]
e Describe the distribution for each quiz, identifying any clusters and outliers.
f Are there significant differences between the results of the two quizzes? Justify your answer.
5 A survey to determine the number of people per
household was conducted in several shopping centres.
The results are shown in the frequency histogram and
boxplot on the right.
a How many households had 3 or more people?
b Find the:
i mode
ii median
iii range
iv interquartile range.
See Example 12
[Frequency histogram: Frequency (0 to 28) against People per household]
[Boxplot: People per household]
6 The dot plot and box-and-whisker plot show the number of hours that Year 10 students spent
watching TV during one week.
[Dot plot and box-and-whisker plot: Hours spent watching TV per week, axis from 10 to 28]
b Find the:
i mode
ii range
ii the boxplot?
d Which display of data, the dot plot or boxplot, can be used to find:
i the mode?
ii the median?
iii the number of students who watched TV for 25 hours?
iv the interquartile range?
7 The speeds of cars were monitored along a main road in two different suburbs. The results are
shown in the back-to-back stem-and-leaf plot and the parallel boxplots.
Sunbeam Valley
8
9 8 8 7 4 3 3 3 2 0
9 9 6 5 5 4 4 3 3 2 2 1 1 0 0 0
2 0 0
Bentleys Beach
5
6
7
8
9
0 0 1 2 3 5 5 7 8 9
0 0 2 2 3 3 5 5 5 6 6
0 2 3 4 5 5 5 8
0
[Parallel boxplots: Speed (km/h), axis from 50 to 90, for Sunbeam Valley and Bentleys Beach]
a Find the range, median and interquartile range for each suburb.
b What is the shape of the distribution for each suburb?
c Are there any clusters or outliers in either suburb?
d According to the boxplot, what percentage of drivers in Bentleys Beach drive faster than all
drivers in Sunbeam Valley?
e In which suburb do drivers generally drive faster? Give a possible reason for your answer.
8 Lamissa and Anneka each shot arrows at a target 50 m away during an archery contest. They
scored 10 for a bull's-eye down to 1 for the outer ring. Their results are displayed in the
back-to-back histogram and the parallel box-and-whisker plots below.
[Back-to-back histogram and parallel box-and-whisker plots: Score per arrow, for Lamissa and Anneka]
9 9 9
7 6
8 8
5 5
7 4
5 4
7 5
4
3
4
8
3
2
3
Women
7 5 4
3 1 0
1 0 0
2 0 0
2 1 0
1
2
3
4
5
Men
0 6
0 2
0 2
1 3
0 1
7
3
4
4
3
9
4
5
6
4
9
4
6
6
7
5
6
6
7
5
7
6
7 7
7 8
7 7
8
8 8
9
Women
Men
10
20
30
40
Number of sit-ups per minute
50
60
a Why would a dot plot be an inappropriate way to display the data shown above?
b What is the median number of sit-ups per minute completed by each group?
c Find the range and interquartile range for each group.
d Describe the shape of the distributions for women and for men.
e Which group has more spread in the number of sit-ups completed per minute? Give
reasons for your answer.
10 The results of a Maths test given to four Year 10 classes are shown below.
[Parallel boxplots: Test results, axis from 30 to 90, for 10 Green, 10 Red, 10 Blue and 10 Yellow]
ii 10 Blue?
ii negatively skewed?
iii symmetrical?
d Which class had the best test results overall? Give reasons.
Puzzle sheet
Bivariate data is data that measures two variables, such as a person's height and arm span
(distance between outstretched arms). Bivariate data is represented by an ordered pair of values
that can be graphed on a scatter plot for analysis.
A scatter plot is a graph of points on a number plane. Each point represents the values of the two
different variables and the resulting graph may show a pattern that may be linear or non-linear. If
there is a pattern, then a relationship may exist between the two variables.
Example 13
The heights and arm spans of a group of students are shown in the table.
Height, H (cm)    162 182 153 145 172 163 150 142 183 145 192 171
Arm span, S (cm)  158 185 145 143 174 165 151 141 181 158 191 178
Solution
a [Scatter plot of the heights and arm spans, both axes from 140 to 200]
Summary
The strength of a relationship between two variables can be described as:
Example 14
Describe the strength and direction of the relationship shown in each scatter plot.
[Scatter plots a to f]
Solution
a weak positive relationship
b perfect negative relationship
c no relationship
d strong negative relationship
e perfect positive relationship
f weak negative relationship
Exercise 6-08 Scatter plots
See Example 13
1 The heights and handspans of a group of students are shown in the table.
Height, H (cm)    168  175  175  156  160  173  171  180  185  175  182  180
Handspan, S (cm)  20.0 21.1 17.6 16.5 17.5 19.0 20.8 22.5 25.0 23.0 20.2 21.1
See Example 14
Describe the strength and direction of the relationship shown in each scatter plot.
a
b
c
3
4
Describe the strength and direction between the variables height, H and handspan, S in question 1.
The height and stride length measurements of some students are shown in the table below.
Height, H (cm)         174  160  158  180  169  172  171  171  148  190  166  173
Stride length, L (cm)  72.2 64.0 66.4 74.7 70   71.5 70.9 71.2 61.4 78.9 68.0 71.9
Points scored for, F      568 579 559 497 597 545 445 481 405 506 449 448 462 497 409 431
Points scored against, A  369 361 438 403 445 536 441 447 438 551 477 488 626 609 575 674
Year 10 students were surveyed on the number of hours in a week they spent doing homework
and the number of hours they spent on the computer. The results are shown in the table.
Homework, H
Computer, C
2 15 12 5
25 30 18 35
4
6
2 4 15 14 5
30 20 22 6 40
2
8
5
3
20 4
20 30
2
5
11
8
A survey was conducted to see whether there was a relationship between height and the age of
students in a high school. The results are in the table below.
Age, A (years)  14  16  15  13  11  14  17  15  12  11  14  16  13  18
Height, H (cm)  162 174 182 162 132 173 187 160 154 145 165 171 151 181
Stage 5.3
Worksheet
Line of best fit
MAT10SPWK10210
Worksheet
Data 2
Summary
A line of best fit:
between the points on the scatter plot, within the range of data (this is called interpolation,
pronounced in-terp-o-lay-shun), or
beyond the points on the scatter plot, outside the range of data (this is called extrapolation,
pronounced ex-trap-o-lay-shun).
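The difference between interpolation and extrapolation can be sketched in a few lines of Python. This is only an illustration: the function name `classify_prediction` is hypothetical, and the data are the arm spans measured in Example 15.

```python
def classify_prediction(x, observed_x):
    """Return 'interpolation' if x lies within the range of the observed
    data, otherwise 'extrapolation'."""
    lowest, highest = min(observed_x), max(observed_x)
    return "interpolation" if lowest <= x <= highest else "extrapolation"


# Arm spans (cm) of the 12 students in Example 15.
arm_spans = [177, 179, 162, 182, 181, 171, 161, 176, 175, 190, 168, 165]

print(classify_prediction(185, arm_spans))  # 185 is inside the range 161 to 190
print(classify_prediction(200, arm_spans))  # 200 is outside the range 161 to 190
```

A prediction made by extrapolation is generally less reliable, because nothing in the data shows that the pattern continues outside the measured range.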
Stage 5.3
Example 15
The arm span and right foot size of 12 Year 10 students were measured.
Arm span, S (cm)         177 179 162 182 181 171 161 176 175 190 168 165
Right foot size, F (cm)  25  26  24  28  27  25  23  25  24  30  24  24
a Graph the points on a scatter plot and construct a line of best fit.
b Find the equation of the line of best fit.
c Use the equation to estimate the foot size of a student with an arm span of 173 cm.
d Use the graph to interpolate the foot size of a Year 10 student with an arm span of 185 cm.
e Use the graph to extrapolate the arm span of a Year 10 student who has a foot size of 31 cm.
Solution
a [Scatter plot with line of best fit: horizontal axis Arm span, S (cm), from 150 to 210; vertical axis from 10 to 40]
b Use the point–gradient formula $y - y_1 = m(x - x_1)$ to find the equation of the line.
Using two points on the line, (150, 20) and (181, 27):
$m = \dfrac{y_2 - y_1}{x_2 - x_1} = \dfrac{27 - 20}{181 - 150} = \dfrac{7}{31} \approx 0.226$
Using the point (150, 20):
$y - 20 = 0.226(x - 150)$
$y - 20 = 0.226x - 33.9$
$y = 0.226x - 13.9$
Replacing x and y by S and F respectively:
$F = 0.226S - 13.9$
c When $S = 173$ cm,
$F = 0.226 \times 173 - 13.9 = 25.198$ cm.
A Year 10 student with an arm span of 173 cm would have a foot size of 25.198 cm.
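The working in parts b and c can be checked with a short Python sketch. The function names `line_through` and `foot_size` are hypothetical, and the gradient is rounded to three decimal places only to mirror the working above.

```python
def line_through(p1, p2, places=3):
    """Gradient and y-intercept of the line through two points,
    with the gradient rounded as in the worked solution."""
    (x1, y1), (x2, y2) = p1, p2
    m = round((y2 - y1) / (x2 - x1), places)  # m = (y2 - y1) / (x2 - x1)
    c = y1 - m * x1                           # from y - y1 = m(x - x1)
    return m, c


# The two points chosen on the line of best fit in Example 15.
m, c = line_through((150, 20), (181, 27))


def foot_size(arm_span):
    """Estimate foot size F (cm) from arm span S (cm) using F = mS + c."""
    return m * arm_span + c


print(f"F = {m}S + ({c:.1f})")
print(f"Estimated foot size for S = 173 cm: {foot_size(173):.3f} cm")
```

Choosing a different pair of points on the line would give a slightly different equation, so answers read from a hand-drawn line of best fit are estimates.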
Stage 5.3
[Scatter plot with line of best fit: horizontal axis Arm span, S (cm), from 150 to 210; vertical axis from 10 to 40]
Exercise 6-09
1
Forensic scientists can estimate people's heights from the lengths of their bones, such as the tibia,
femur, humerus and radius. The table below gives the heights of females and the length of their radius.
See Example 15
Length of radius, r (cm)  25.2 22  20.4 23.5 24.3 21.4
Height, H (cm)            173  158 152  167  169  156
[Scatter plot grid: horizontal axis Length of radius, r (cm), from 19 to 28; vertical axis Height, H (cm), from 140 to 190]
a Plot the points on a scatter plot as shown and construct a line of best fit.
b Find the equation of the line of best fit.
c Use your equation to find the height of a female whose radius is 25 cm long.
d If the radius is 27 cm in length, use the line of best fit to predict the height of the female.
Stage 5.3
The heights and shoe sizes of a group of Year 11s were measured and recorded below.
Height, H (cm) 175
174
177 180
179
176
170
Shoe size, S
10
10
11
9.5
7.5
10.5
12
175 179
9
180
178
183
178 173
11.5 12.5
11
12.5
12
179
9.5 10.5
174
9
a Graph the points on a scatter plot and construct a line of best fit.
b Find the equation of the line of best fit.
c Use the equation to estimate the shoe size (to the nearest 0.5) of a student whose height is 172 cm.
d Use the graph to interpolate the shoe size of a student who is 181 cm tall.
e Use the graph to extrapolate the shoe size of a student with height 185 cm.
3
The air temperature, T (°C), was measured at various heights, h (m), above sea level.
Height, h (m)
500
1000
2000
2500
4000
5900
7500
10 000
Temperature, T (C)
20
14
5
13
20
35
50
a Graph the points on a scatter plot and construct a line of best fit.
b Find the equation of the line of best fit.
c Use the equation to estimate the temperature at a height of 1500 m.
d Use the graph to find the height above sea level for a temperature of 10 C.
4
The results obtained by 18 Year 10 students in Maths and Science exams are shown below.
Maths
Science
59 52 72 85 75 45 65 64 62 58 78 90 40 70 50 45 82 50
65 54 67 83 75 39 59 64 60 56 80 95 38 65 48 48 85 51
a Graph the points on a scatter plot and construct a line of best fit.
b Simone missed the Science test but obtained 80 in her Maths exam. Use the line of best fit
to predict Simone's Science result.
c If Mario obtained 96 in the Science exam, predict what result he might have achieved in the
Maths exam.
5
Angela is measuring the amount by which a spring is stretched when different masses are hung
from the spring for a Science experiment. Her results are as follows.
Mass, M (g)             10  20   25   30   35 40   50
Spring stretch, S (cm)  5.9 11.2 12.3 14.8 17 22.4 25.2
a Graph the points on a scatter plot and construct a line of best fit.
b Use the line of best fit to predict the length the spring stretches for a mass of 45 g.
c What mass would have to be attached to stretch the spring 28 cm?
d Are there limitations to using the line of best fit to predict the length of stretch in the spring
by different masses?
6
The men's 100 m world record times for 1964 to 2009 are given in the table below.
Year      1964  1968 1983 1988 1991 1994 1996 1999 2005 2006 2007 2008 2009
Time (s)  10.06 9.95 9.93 9.92 9.86 9.85 9.84 9.79 9.77 9.76 9.74 9.69 9.58
Stage 5.3
Length of femur  40  42.9 44.2 46.1 46.8 47  48.4 50.3 51.2 57.2
Height           162 165  164  173  174  178 179  182  186  200
1 Enter the data from the table into a spreadsheet. Type 'Length of femur' in cell A1 and
'Height' in B1.
2 To graph a scatter plot, select all the values in cells B1 to K2, and under the Insert menu,
select Scatter and Scatter with Straight Lines and Markers.
3 To draw the line of best fit, select one of the points on the scatter plot and right-click. Select
Add Trendline, Linear and Display Equation on chart, then Close.
4 Check your answers to questions 1–3 from Exercise 6-09 using a spreadsheet.
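A spreadsheet's linear trendline is a least-squares line, so its equation can also be reproduced without a spreadsheet. The Python sketch below is one way to do this for the femur data above; the function name `least_squares_line` is hypothetical.

```python
def least_squares_line(xs, ys):
    """Gradient and y-intercept of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x  # the line passes through the mean point
    return gradient, intercept


# Data from the table above.
femur = [40, 42.9, 44.2, 46.1, 46.8, 47, 48.4, 50.3, 51.2, 57.2]
height = [162, 165, 164, 173, 174, 178, 179, 182, 186, 200]

gradient, intercept = least_squares_line(femur, height)
print(f"Height = {gradient:.2f} x Length of femur + {intercept:.2f}")
```

A line of best fit drawn by eye will usually have a slightly different equation from the calculated one, so small differences when checking answers are expected.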
Example 16
This table shows the average household size between 1961 and 2011, according to the Census.
Year                    1961 1966 1971 1976 1981 1986 1991 1996 2001 2006 2011
Average household size  3.6  3.5  3.3  3.1  3.0  2.9  2.8  2.6  2.6  2.6  2.6
Solution
Year is the independent variable.
a [Time series graph: Average number of persons per household (0 to 4.0) against Year (1960 to 2020)]
b The average household size decreased from 3.6 in 1961 to 2.6 in 1996 and since then
there has been little or no change.
c 2.4–2.6 people per household.
Exercise 6-10
1
The number of people employed per month at SUPA SAVE SUPERMARKET from
November 2009 to February 2012 is displayed in the time series graph below.
[Time series graph: Number of employees (0 to 40) by month, from November 2009 to February 2012]
ii December 2010?
b In which month of the year were the most people employed by the supermarket? Suggest a
reason why.
c In which month of the year were the least number of people employed? Suggest a reason why.
d Describe how the number of people employed by the supermarket changes from November
2009 to February 2012.
The population figures for Australia from 1960 to 2010 are given in the table below.
See Example 16
Year                   1960  1965  1970  1975  1980  1985  1990  1995  2000  2005  2010
Population (millions)  10.28 11.39 12.51 13.89 14.70 15.76 17.07 18.07 19.15 20.39 22.3
ii 2045.
The table below shows the fatalities on NSW roads from 1950 to 2010.
Year        1950 1960 1970 1980 1990 2000 2010
Fatalities  634  978  1309 1303 797  603  405
The annual mean maximum temperatures for Sydney from 1990–2001 and from 2001–2012
are given in the tables below.
Year              1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001
Temperature (°C)  22.3 22.8 21.5 22.3 22.6 21.8 22.1 22.4 22.7 22.1 22.7 23.1
Year              2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012
Temperature (°C)  23.1 23.1 22.7 23.4 23.4 23.1 22.7 22.1 22.1 22.6 22.6 22.7
ii 2001 to 2012.
ii 2001 to 2012?
Year                         2002  2003  2004  2005  2006  2007  2008 2009  2010  2011  2012
Annual emissions (Mt CO2-e)  509.5 514.5 529.2 530.2 539.8 546.5 554  542.8 551.8 553.2 551.9
ii 2025?
The time series graph below shows the monthly amount of passenger traffic on Australian
domestic commercial airlines.
[Time series graph: Passenger movements (millions), from 3.0 to 5.5, by month from June 2008 to June 2013]
Source: Australian Government, Department of Infrastructure and Transport, http://www.bitre.gov.au/statistics/aviation/domestic.aspx#summary
a Describe the trend in domestic passenger traffic for June 2008 to June 2013.
b What was the approximate amount of passenger traffic per month in:
i June 2008?
ii June 2010?
iv June 2013?
c What was the percentage increase in domestic passenger movements from June 2008 to
June 2013?
Example 17
Solution
The newspaper is reporting on its own readership and so may be biased. It also states that
its Travel liftout has a higher readership than its issue readership.
Example 18
The weights (in kg) of a large group of 18–20-year-olds attending university are:
57 58 62  84 64 74 57  55 56  90
68 63 49  66 63 65 60  60 46  70
85 60 70  41 73 75 67  63 70  85
51 49 75  77 87 54 60  75 58  68
55 65 66  57 85 75 56  60 62  75
74 58 51  62 50 55 71  57 58  100
72 58 103 64 52 55 80  96 45  87
81 80 48  54 65 54 59  50 78  60
74 70 64  59 72 78 104 63 102 95
a How many students were in the group?
b Randomly select four groups of 10 and for each sample calculate:
i the mean
ii the median
c Use your results to estimate the mean, median and interquartile range of the population
from your four samples.
d Compare your estimates to the mean, median and interquartile range of the population.
Solution
a There were 90 students in the group.
b Randomly select four samples of 10 from the population.
Sample 1: 90 63 75 48 74 85 51 96 60 78
Sample 2: 62 75 103 64 65 54 55 54 60 75
Sample 3: 68 70 57 52 78 74 60 63 58 87
Sample 4: 72 54 52 80 45 87 49 77 54 58
The statistics for each group are:
Sample 1: $\bar{x} = 72$, median = 74.5, interquartile range = 25
Sample 2: $\bar{x} = 66.7$, median = 63, interquartile range = 20
Sample 3: $\bar{x} = 66.7$, median = 65.5, interquartile range = 16
Sample 4: $\bar{x} = 62.8$, median = 56, interquartile range = 25
63
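The sample statistics in Example 18 can be checked with a short Python sketch. It assumes the four samples of 10 weights are as listed in the solution, and that quartiles are found as the medians of the lower and upper halves of the ordered scores; the function name `interquartile_range` is hypothetical.

```python
from statistics import mean, median


def interquartile_range(scores):
    """IQR using the medians of the lower and upper halves of the ordered
    scores (the middle score is left out when the number of scores is odd).
    Assumes at least two scores."""
    ordered = sorted(scores)
    half = len(ordered) // 2
    lower_half = ordered[:half]
    upper_half = ordered[-half:]
    return median(upper_half) - median(lower_half)


# The four random samples of 10 weights (kg) from Example 18.
samples = {
    "Sample 1": [90, 63, 75, 48, 74, 85, 51, 96, 60, 78],
    "Sample 2": [62, 75, 103, 64, 65, 54, 55, 54, 60, 75],
    "Sample 3": [68, 70, 57, 52, 78, 74, 60, 63, 58, 87],
    "Sample 4": [72, 54, 52, 80, 45, 87, 49, 77, 54, 58],
}

for name, weights in samples.items():
    print(name, round(mean(weights), 1), median(weights),
          interquartile_range(weights))
```

Other software may use a different rule for quartiles, so its interquartile ranges can differ slightly from those found by this method.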
Exercise 6-11
1
A TV network surveys 300 people in shopping centres between 9 a.m. and 11 a.m. to get
feedback on its new game show.
a How may this survey be biased?
See Example 17
b Suggest a better method for obtaining feedback about its game show.
2
A report about hot-water systems recommended a heat pump system. The report stated that
people in Queensland who had the heat pump hot-water system saved 30% of their electricity
bill per quarter. The company is using this information in their advertising of the product in
NSW and Victoria.
Should people in NSW and Victoria install this type of hot-water system? Give reasons.
A report on petrol pricing was conducted by two companies. The following graphs, showing
the price of petrol for the same 12-week period, were used to present their findings on the
price of petrol.
[Graph, Company A: petrol price (cents/litre) by week, from 27 December to 12 February]
[Graph, Company B, titled 'Petrol pricing: Company B': petrol price (cents/litre), vertical axis from 134 to 146, by week from 27 December to 12 February]
ii Company B?
b How could both graphs be improved to give a true picture of changing petrol prices?
[Graph: Fuel consumption (L/100 km), vertical axis from 3.6 to 4.8, for Ford Fiesta, Volvo, BMW and Hyundai]
A company manufactures a product. After 3 months, they conduct a survey and customers are
asked to rate the product as Excellent, Good or Satisfactory. Is the survey biased? Justify your
answer.
A market research company working for a car manufacturer needs to determine the most
popular car colours.
a Give an example of a biased question for this survey.
b What other information should the market research company use, apart from the survey, to
determine the most popular colour car?
See Example 18
a Randomly select four samples of 10 weights from the population shown in Example 18, and
for each sample calculate:
i the mean
ii the median
b Use your results to estimate the mean, median and interquartile range of the population
from your four samples.
c How do the statistics of your samples compare to the mean, median and interquartile range
of the population?
d How do the estimated statistics compare to the population statistics?
8
ii 15
iii 20
b Do the sample statistics become more accurate and move closer to the population statistics
as sample size increases?
Stage 5.3
The graph compares the number of passenger vehicles per 1000 people in Australia in 1955
and 2013.
[Column graph: number of passenger vehicles per 1000 people (0 to 1000), for 1955 and 2013]
Source: www.abs.gov.au
a How many passenger vehicles per 1000 people were there in 1955?
b What was the percentage increase in the rate between 1955 and 2013?
2
Visit the Australian Bureau of Statistics (ABS) website www.abs.gov.au and search for Motor
vehicle census.
a What was the total number of vehicles registered last year?
b How many passenger vehicles were registered last year?
c What was the average annual growth rate over the last five years?
This graph compares the types of commuter transport used by Australians in 2009 and 2012.
[Column graph: MAIN FORM OF TRANSPORT USED TO GET TO WORK OR FULL-TIME STUDY, 2009 AND 2012; percentage (0 to 100) for Passenger vehicle, Public transport, Walk, Bicycle, Motorbike and Other]
Source: www.abs.gov.au
a What percentage of people used a passenger vehicle to get to work or full-time study in
2012 and what change has occurred from 2009 to 2012?
b What percentage of Australians used public transport in 2012? Visit the ABS website to see
whether this has changed this year.
c What reasons are there for Australians not using public transport? (Search Car nation at the
ABS website)
d List three advantages of using public transport.
Stage 5.3
e Use the Internet to compare Australia's transport use with transport use in other countries
(for example, China, Indonesia, Japan and USA).
4
These graphs compare the types of commuter transport used by state, territory and capital city in 2011.
[Bar graph: ALL METHODS(a) OF TRAVEL TO WORK(b) BY STATE AND TERRITORIES, 2011; percentage (0 to 100) using Passenger vehicle, Public transport, Walk and Bicycle in NSW, Vic., Qld, SA, WA, Tas., NT and ACT]
a Which state had the highest proportion of people using passenger vehicles to travel to work?
b What percentage of people in South Australia used public transport?
c Which capital city had the highest public transport use and which city had the lowest? Give
possible reasons for your answer.
[Bar graph: ALL METHODS(a) OF TRAVEL TO WORK(b) BY CAPITAL CITY, 2011; percentage (0 to 100) using Passenger vehicle, Public transport, Walk and Bicycle in Sydney, Melbourne, Brisbane, Adelaide, Perth, Hobart, Darwin and Canberra]
d The percentage of people using public transport in capital cities is higher than the
percentage of people using public transport in the State. Give a possible reason for this.
5
e Which state and which capital city had the lowest percentage of people using passenger vehicles?
Summarise your answers to questions 1 to 4 in a brief report about passenger vehicle use in
Australia. Using your results, indicate what action governments (Federal, State and local) should
take in terms of building roads, accident research and consideration of the environment.
Is Australia becoming a warmer continent? Investigate this by looking at data from the
Australian Bureau of Statistics and the Australian Bureau of Meteorology (www.bom.gov.au).
Investigate tobacco and alcohol use by teenagers in Australia. Include tables and graphs in
your report. Refer to the National Drug Strategy Household Survey
(www.nationaldrugstrategy.gov.au) and NSW Health (www.healthinsite.gov.au), and search
alcohol and teenage statistics in Australia on the Internet.
Stage 5.3
Power plus
1
The strength and direction of the relationship between two variables can be measured by
the correlation coefficient (r).
a Between which two values does the correlation coefficient lie?
b What is the strength and direction of the relationship if the correlation coefficient is zero?
c Write a possible value for the correlation coefficient for each relationship described.
i perfect positive
ii weak negative
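As an illustration of how a correlation coefficient is found, the Python sketch below calculates r for the arm span and foot size data of Example 15. It assumes that r means Pearson's correlation coefficient; the function name `correlation` is hypothetical.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson's correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Arm span S (cm) and right foot size F (cm) from Example 15.
arm_span = [177, 179, 162, 182, 181, 171, 161, 176, 175, 190, 168, 165]
foot_size = [25, 26, 24, 28, 27, 25, 23, 25, 24, 30, 24, 24]

r = correlation(arm_span, foot_size)
print(f"r = {r:.2f}")
```

The sign of r gives the direction of the relationship, and how far r is from zero indicates its strength.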
Two variables may have a strong relationship, but this does not mean that a change in one
variable causes a change in the other. Which of the following pairs of variables have a
causal relationship?
a height and weight of people
b the time that it takes to walk to school and the distance from home to school
c the number of children per household and the number of mobile phones per household
d the age of people and their reaction time
e the price of petrol and the amount of petrol sold
f the interest rate of loans and the number of new housing loans
The following scores are the test results on a History exam for a class of 20 students.
13 14 16 12 14 16 18 13 15 10
9 15 13 14 13 10 8 14 16 14
a Find the mean, median and mode.
b Find the range and interquartile range.
c An error was made in recording the scores and 4 marks need to be added to each
score. What effect will this have on the statistics calculated in parts a and b?
Chapter 6 review
n Language of maths
Puzzle sheet
Data crossword
bivariate data
boxplot
cluster
dependent variable
interquartile range
MAT10SPPS10039
mean
measure of location
measure of spread
median
Quiz
mode
negatively skewed
outlier
positively skewed
quartile
range
scatter plot
skewed
standard deviation
strong
symmetrical
weak
Statistics
MAT10SPQZ00002
n Topic overview
Copy and complete this mind map of the topic, adding detail to its branches and using pictures,
symbols and colour where needed. Ask your teacher to check your work.
[Mind map: Investigating data, with branches Shape of a distribution, Standard deviation, Boxplots, Comparing data sets, Scatter plots, and Bivariate data involving time]
Chapter 6 revision
1 For each statistical distribution:
i describe the shape
10 11 12 13 14 15 16 17
Stem
3
4
5
6
7
8
9
Leaf
0 1
1 3
0 4
3 7
0 1
4
8
2
4 4 5 6
5 7 8
8
15
23
28 20
20
18
30
21 18
d
Score
10
11
12
13
14
15
Stem
3
4
5
6
7
Leaf
0 1 2
3 5 8 8 9 9 9
4 5 6 6 8
0 1 3 7
2
Frequency
3
8
15
18
10
5
3 The reaction times (in seconds) of a sample of truck drivers were measured.
0.34 0.35 0.34 0.37 0.42 0.45 0.43 0.29 0.38 0.40 0.37 0.62
Stage 5.3
a Find, correct to two decimal places, the mean and standard deviation.
b Find the range and interquartile range.
c Which is the best measure of spread for this set of data? Justify your answer.
4 The Health exam results for a class of PE students are shown here.
Girls: 83 78 63 84 65 51 76 69 42 84 60 64
Boys:
65 34 75 68 56 63 79 55 68 52 49 85
a Find the mean and standard deviation of both groups.
b Which group performed better in the exam? Give reasons.
92
64
73
58
32
54
See Exercise 6-05
5 The number of goals scored by the Under-18s Vale soccer team are:
2
0 0
4
2
1
1 2
3
2
3 7
4
3
1
0 4
2
a Find the range and interquartile range of the scores.
b Find the five-number summary for the data.
c Draw a box-and-whisker plot for the data.
6 The pulse rates of students were taken before and after exercising. The results were:
Before exercise: 78 80 66 70 56 64 68 65 50 76 80 70 70 59
After exercise: 141 140 89 95 110 126 84 82 90 88 146 98 96 92
a Find the five-number summary for the pulse rates before and after exercise and construct
a parallel boxplot to display the two sets of data.
b Find the range and interquartile range for:
i before exercising
ii after exercising.
c Compare the two sets of pulse rates. Are there significant differences between them?
Justify your answer.
7 The speeds of cars (in km/h) were monitored between 1:00 and 1:30 p.m. on a main road.
The results are shown in the stem-and-leaf plot and boxplot below.
Stem
7
8
9
10
11
12
Leaf
0 3
0 2
0 0
0 0
0 1
6
7
2
1
4
9
3 5 6 8 8 9
3 5 5 7 8 9 9 9
4 6
99.5
70
80
90
100
Speed (km/h)
110
120
130
8 Eleven boxes containing 60 oranges each were placed in cold storage for different periods.
After storage, the number of good oranges in each box was counted.
Weeks in storage (W)
Number of good
oranges (N)
12
14
10
11
58
50
33
40
28
50
52
38
35
55
33
Stage 5.3
Height (cm)  152 160 179 180 185 174 165 150 145 142 155 153 175 155
Weight (kg)  50  65  72  77  81  77  65  57  48  53  61  67  72  56
a Graph the points on a scatter plot and construct a line of best fit.
b Find the equation of the line of best fit.
c Use the equation to estimate the weight of a student who is 170 cm tall.
d Use the graph to interpolate the weight of a student with height 163 cm.
e Use the graph to extrapolate the height of a student who weighs:
i 85 kg
ii 45 kg
10 The mean maximum temperatures in Blacktown, NSW for the month of January from 2004 to
2013 are shown in the table. (Source: Bureau of Meteorology)
Year (t)             2004 2005 2006 2007 2008 2009 2010 2011 2012 2013
Temperature (T, °C)  30.6 29.1 29.0 30.1 28.5 31.7 30.6 29.9 27.4 30.0