Chapter 12 - Data analysis and probability
This chapter at a glance
Stage 4
After completing this chapter, you should be able to:
state whether a sample or a census should be used to collect information in
a given situation
detect bias in the selection of a sample
use spreadsheets to tabulate and graph data
generate a random sample by using a table of random numbers, a calculator
and a spreadsheet
use sampling techniques to generate a random sample
make predictions from a sample that may apply to the whole population
construct and critically analyse questionnaires
calculate the mean, median, mode and range of a small set of scores
calculate the mean of a set of scores by using a calculator
calculate the mean, median, mode and range of data presented in a frequency
distribution table, frequency histogram, frequency polygon, dot plot or
stem-and-leaf plot
solve simple problems involving the mean, median, mode and range
compare two sets of data by finding the mean, the median, the mode and
the range of both sets
choose the best average in a given situation
design and conduct a statistical investigation
use the language of probability to refer to the likelihood that an event will occur
list the sample space for a particular event
find the probability that an event will occur
identify the complement of an event and determine its probability
solve simple probability problems.
430 Mathscape 8
Statistics is a branch of mathematics concerned with the collection, organisation and analysis
of numerical information called data. Like much of mathematics, statistics enables us to
better understand events that are happening around us. Statistical analysis is used in many
industries to analyse past events and from this make predictions about future trends, such as
the demand for certain products. State governments and local councils use statistical data to
determine whether new roads, schools and hospitals need to be built in certain areas, and
whether there is sufficient public transport to cater for the needs of a community.
Insurance companies set their annual premiums according to the risk they are taking to insure
something or someone. For example, statistics show us that there is a much higher chance
of a driver 18–25 years of age being involved in an accident than an experienced driver.
Therefore, insurance companies charge young people more to insure their car because of the
greater risk that the insurer will have to pay out at some time in the future.
People in some occupations present facts in a way that is intended to mislead or to give a
false impression. We need a sound understanding of statistics in order to gauge the truth about
such events as they are reported to us.
Example 1
State whether a census or a sample should be used to gather information about:
a the yearly exam results of all Year 8 students at a school
b the average number of toothpicks in a box
Solutions
a The population of Year 8 is relatively small and the exam results of all students must be
recorded so that individual reports can then be issued. A census is needed in this situation.
b It is not practical to open every box of toothpicks and count the contents. A sample would
be chosen in this situation.
Exercise 12.1
1 State whether a census or sample has been used in each of the following situations.
a The items available for sale in a store were listed in a stocktake.
b Every student and teacher in a school voted for a new school captain and vice-captain.
c The exam marks for a group of Geography students were recorded for their reports.
d Every fifth person entering a night club was searched by a police sniffer dog.
e All bags were X-rayed at an airport to search for dangerous weapons.
f A woman bought 5 tickets in a $2 lottery.
g A group of teachers were selected to participate in an evaluation of the school’s
principal.
■ Consolidation
2 State whether a census or a sample should be used to determine:
a the average pulse rates of joggers in a large suburban park
b the weights of the forwards in a particular high school rugby scrum
c the reaction times of drivers in NSW
d the number of adults who have quit smoking in the last year
e the number of people who voted for each party in an election
f the number of Australian roads that have potholes
g the number of students at a particular school
h the average length of time that Sydney traffic lights show red
i the number of people who travel to work by public transport
j the average shoe size of Australian men and women
k the number of Australians who have spent time in gaol
l the number of parking spaces in a certain parking station
m the average number of M & Ms in a box
n the number of police officers in NSW
o the number of stars in the Milky Way
p the number of road accidents that have occurred at a particular intersection in the last
5 years
3 State whether it would be more appropriate to gather information about each situation using
a sample or a census. Give reasons for your answers.
a the population of Kabul just after the defeat of the Taliban in late 2001
b the level of ultraviolet-ray protection offered by a particular line of T-shirts
c the number of books sold at a book store during the month of May
d the travel methods of all students in your school
e the IQ of every person in Australia
f the level of relief provided to asthma sufferers by a new drug
g the types of accidents occurring in NSW schools
h the total water usage by a particular household over a one-month period
i the level of allergic reaction to tick bite by residents of NSW
4 Name a situation in which the collection of all possible data in the form of a census would be:
a too expensive b physically impossible c too time-consuming
■ Further applications
5 In general, statisticians believe that a sample should have at least √n elements to be
representative of a well-defined population with n elements. This rule of thumb is best
applied only to large populations. The larger the population, the better the sample should
reflect its overall characteristics. Use this estimate of sample size to answer the following
questions.
a What is the minimum number of people who should be surveyed to gather information
from the following populations? Answer to the nearest 10 people.
i 900 ii 5000 iii 30 000 iv 100 000
b A sample of 140 people was selected at random from a local community. The people
were then surveyed to find the community’s views on the need to increase the number
of buses in the area. How many people probably live in the area being surveyed?
Answer to the nearest 1000 people.
c If the population of Australia is approximately 18 million, find the minimum number
of people who should be surveyed in order to predict the winner of an upcoming federal
election. Answer to the nearest 100 people.
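The rule of thumb in question 5 can be sketched in a few lines of Python. This is only an illustration, and it assumes the rule is read as "a sample of at least √n from a population of n". The function names and the populations used are hypothetical and are not taken from the exercise.

```python
import math

def min_sample_size(population: int) -> float:
    """Rule-of-thumb minimum sample size: the square root of the population."""
    return math.sqrt(population)

def implied_population(sample_size: int) -> int:
    """Invert the rule: a sample of s suggests a population of about s squared."""
    return sample_size ** 2

# Hypothetical populations, chosen only to show the pattern.
for n in (400, 2500, 1_000_000):
    print(n, "->", round(min_sample_size(n)))

print(implied_population(50))  # 2500
```

Because the square root grows slowly, a very large population needs a sample that is only a small fraction of its size.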
Consider, for example, what might happen if a large pharmaceutical company deliberately
changed the data gathered from experiments on a new drug. If they falsified their findings or
exaggerated the effectiveness of the drug, then the lives of many people around the world
could be put in danger. It is very important that data is gathered honestly and accurately in a
census, and randomly when a sample is being used, so that any conclusions drawn are reliable.
Example 1
Comment on any possible bias in each of the following.
a A survey is conducted at a large suburban bus stop between 7:00 am and 9:00 am to find
out whether people prefer to use public transport rather than drive in peak-hour traffic.
b A telephone poll is conducted between 10:00 am and 2:00 pm to determine whether more
men or women would attend a new gym if it were to open in the area.
Solutions
a The sample is biased towards people who already use public transport. Although some
members of the sample may not have a car, the majority are more likely to have made a
deliberate decision to use public transport.
b The sample is biased towards women, who are more likely than men to be at home during
these hours.
Example 2
A freeway extension is being planned that will pass through two residential suburbs. If you
wanted to show the government that people were not in favour of this proposal, who should
you choose as your survey sample? Why?
Solution
The people who live in the suburbs through which the road is to be built should be surveyed
because they are not likely to want the extra noise and pollution that comes with a freeway
extension.
Example 3
The school principal wants to determine whether the parents of the boys at his school would
be in favour of the school becoming co-educational in two years’ time by admitting girls in
Years 11 and 12. He sent a questionnaire to the parents of the 1000 students asking for their
opinion, but received only 10 replies. Of these, 8 parents were in favour of the proposal. Is it
appropriate that the principal base his decision on this response? Explain your answer.
Solution
No, the sample is far too small for the principal to base his decision on it. There is every
likelihood that it may be biased.
Exercise 12.2
1 The following techniques are used by various groups within the community to gather
information. State at least one problem with each of these methods.
a door to door interview b telephone poll c postal questionnaire
2 A group of students at a co-educational high school intend to conduct a survey among their
fellow students to determine the most popular school activity. Explain how each of the
following student samples may lead to a biased result.
a the students in your own friendship group
b the students in Year 8
c the girls
d the members of the debating club
3 A newspaper poll asked the question ‘Do you think that today’s teenagers have less respect
for adults than teenagers in previous generations?’ A telephone number was provided for
people to call and vote ‘Yes’ and another telephone number was provided for those who
wanted to vote ‘No’. Is this poll likely to lead to a biased outcome? Why?
■ Consolidation
4 In groups, discuss each of the following situations and comment on any bias that may occur.
a Senior students are asked whether they should be able to park inside the school.
b The police commissioner is interviewed about rising crime rates in the state.
c Year 8 students are asked to decide whether homework should be banned for all
students in the school.
d People are interviewed in a shopping centre about the impact of the GST on prices.
e Residents living next to a smoke stack on the M5 freeway are asked to comment on the
link between air pollution and health problems such as asthma.
f A survey is conducted in peak hour at the toll booths on the Sydney Harbour Bridge to
determine whether tolls on major roads are necessary to raise money for the state or
whether they should be removed.
g The leader of the federal opposition is asked to comment on the national unemployment rate.
5 Who would you choose as your sample if you were trying to ensure support for each of the
following causes?
a reduce the amount of logging in national forests
b prevent the building of a nuclear reactor
c end medical experiments on animals
d ban smoking in restaurants
e remove STD rates on telephone calls in the outer suburbs of Sydney
f remove the GST on books
g increase pensions for the elderly
h increase the amount of backburning to help prevent bushfires
i stop the slaughter of whales
j stop bank closures in small country towns
7 A die was rolled 200 times and the results were recorded. The average of these scores is 3.5
(1 decimal place).
62265 24425 33522 34451 24554 22242 33645 34341 25555 22412
33452 33555 44133 23354 66256 35343 63123 42356 54544 32553
35352 31452 32126 25413 32513 45622 36234 36332 22134 52251
32262 46433 14441 43644 44554 14254 62223 35135 55225 24443
a Is the data quantitative or categorical? Why?
b Find the average of:
i the first 5 scores ii the last 5 scores iii any set of 5 scores
c Find the average of:
i the first 15 scores ii the last 15 scores iii any set of 15 scores
d Select any group of 40 scores and find their average.
e Explain why it is likely that the average would vary from sample to sample.
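The idea behind question 7 can also be explored by simulation. The sketch below is an illustration only: it uses simulated die rolls rather than the 200 scores listed above, and the function name, the seed and the number of trials are arbitrary choices. It draws many random samples of 5, 15 and 40 scores and reports how far apart the sample averages fall.

```python
import random
from statistics import mean

def sample_means(scores, sample_size, trials, rng):
    """Return the mean of each of `trials` random samples of `sample_size` scores."""
    return [mean(rng.sample(scores, sample_size)) for _ in range(trials)]

rng = random.Random(12)  # fixed seed so the run is repeatable
rolls = [rng.randint(1, 6) for _ in range(200)]  # simulated rolls, not the data above

for size in (5, 15, 40):
    means = sample_means(rolls, size, trials=1000, rng=rng)
    spread = max(means) - min(means)
    print(f"samples of {size}: averages range over {spread:.2f}")
```

Running it with different seeds gives a feel for how the average changes from sample to sample and how that variation depends on the sample size.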
8 The data below shows the gender of 150 Year 12 students as they filed into an auditorium
on graduation day.
fffmf mmfmf fffmf ffmff mmmff fmfmm
ffmff ffmmf mmmmf mmmfm ffmff ffmmf
fmfmf mmmff mfmff fmfmm fffff fffmf
ffmfm mfmfm mfffm mmffm ffmff fmffm
fffmf fmmfm fmfmf fffmf ffffm mfffm
a Does the data show a census or a sample? Why?
b Is the data quantitative or categorical?
c What percentage of all students are: i male? ii female?
d What percentage of:
i the first 5 students to enter are male? ii the second 5 students are male?
e Is a group of 5 students sufficient to show the male to female makeup of this year
group?
f Choose a sample of 30 students. Does it better reflect the overall balance of males and
females in the year group?
g Comment on the relationship between the size of the sample and its ability to reflect the
overall characteristics of the population.
■ Further applications
9 Comment on any possible bias in each situation.
a In a telephone survey the interviewers ring the first person listed on every tenth page.
b Wanting to evaluate students’ opinion on an aspect of school performance, the Principal
sends a questionnaire to all of the previous year’s HSC candidates. The letters are sent
to the last known address, but only 32% of the questionnaires are returned.
c A council alderman wishes to know opinions of residents in her ward. She selects five
streets that she knows and visits every tenth house in the street, asking five questions
of the person who answers the door. If there is no-one home she moves to the house
next door. She finished two shorter streets on Saturday morning and another on
Saturday afternoon. The other two streets were completed on Wednesday and Thursday
evenings.
d A TV interviewer stops passers-by in a busy shopping centre to ask their opinions on
the performance of the state government.
■ Types of questions
These are commonly used types of questions.
a Selection of a response from a small number of possibilities. This may be true–false,
yes–no, or selection from a number of possibilities. These questions often ask the subject
to tick the box that is most appropriate and are very easy for the respondent to complete.
They are ideally suited to discrete data, e.g. My planned career is in:
[ ] Business   [ ] Industry   [ ] Education
[ ] Health   [ ] Recreation   [ ] Other
but can be used for continuous data with predetermined grouping, e.g. My age on 1 January
will be:
[ ] less than 10   [ ] 10–19   [ ] 20–29
[ ] 30–39   [ ] 40–49   [ ] 50–59
b Indicating position on a scale. These questions allow respondents to respond on a
continuous scale, e.g. I enjoy French lessons.
Strongly disagree • Disagree • Neither agree nor disagree • Agree • Strongly agree
While this type of question allows for responses on a continuous scale, i.e. on a very large
array of possible points, respondents are often encouraged to choose from the stated
feelings, e.g. I enjoy French lessons. Circle a number.
1 (Strongly disagree) • 2 • 3 • 4 • 5 (Strongly agree)
c Open-ended questions allow respondents to write their own answers, and they can interpret
the question and answer it in many ways. These questions can provide very detailed data, but
the responses are much harder to interpret and categorise.
■ Types of observation
a Direct human observation may involve one or more people. They may observe openly or
be hidden from the view of the subjects, e.g. observing birds from a hide. This has the
advantage that the observer can intelligently modify the observation program as events are
noticed. It has the disadvantage that it can be very subjective (the viewer sees only what
they ‘want to see’) and very limited, as an observer can only watch one subject at a time.
b Indirect observation may make use of audio or video recordings, which allows repeated
viewing by the researchers later. It has technical difficulties, e.g. the position viewed is fixed.
c Mechanical data collection involves the use of devices to record physical data, e.g. a
thermometer attached to a computer could record a patient’s temperature on a continuous
basis. A major advantage is that the data is already on a computer ready to be processed,
and we can easily create tables, graphs, etc. to represent it.
Example
A researcher has chosen to include these questions in a questionnaire surveying part-time
work by secondary-school students.
Comment on the nature of each question and the suitability of the type of question.
a State your age in whole years. ____
b State your gender. Tick the box. [ ] female   [ ] male
c Do you do part-time work? Yes No
d How many hours of part-time work do you do each week? ____
e When do you usually work? ___________________
f Where do you work? _____________________
g What hourly pay rate do you receive?
[ ] less than $10   [ ] $10–20   [ ] more than $20
h Do you prefer your work to school? Yes No
Solutions
One difficulty with questions such as these is that they are not specific about just what the
survey hopes to discover. The researcher may, or may not, be interested in the type of work,
the companies employing the respondents, when in the year they work, etc. This makes it
difficult to decide the relevance of some of the questions.
a This seems a straightforward question and is likely to be important information for the
survey. Some people might be unsure what is meant by ‘in whole years’ and take it to
mean to the nearest whole year.
b This is relevant and the tick-the-box format makes it easy for the respondent.
c This question seems appropriate until one sees question d. If one answers ‘No’ to c then
there is no need for d. The two questions involve repetition, and there are a number of
solutions to this problem. Perhaps the simplest is to omit c. If a respondent does not do
part-time work they can write ‘0’ in answer to d.
d There is considerable ambiguity with this question! Part-time workers often work
different numbers of hours from week to week, and the respondent may be unsure which
week to mention. Many secondary students would work much longer hours during the
school holidays, especially just before Christmas or at Easter. This question seems
relevant but needs to be expressed much more clearly.
e This question may not really be relevant. Part-time workers often work a large variety of
shifts and they would find it very difficult to answer.
f This question is ambiguous and responses might include ‘at a service station’, ‘not far
from home’, ‘at the XYZ department store’ and many others. It is difficult to imagine the
researchers being able to draw conclusions from the information they are likely to receive.
g The respondent simply has to tick a box, so it is surprising that there are not more
categories. $10–$20 is quite a large range and might have been subdivided into much
smaller groups.
h This question has very doubtful relevance. There are so many different grounds of
comparison that a yes–no response is far too limiting. Each respondent probably
appreciates school for some reasons, e.g. gaining qualifications for the future,
socialising with friends, and work for others, e.g. employment experience, income.
Exercise 12.3
2 a Would you expect to get honest, reliable answers if you conducted a face to face
interview with students and asked ‘do you smoke cigarettes?’ or ‘do you drink
alcohol?’ Why?
b What would be a good way to conduct such a survey?
4 Explain what is wrong with this survey question, which is directed at a group of 12 year
olds. ‘In what ways will the GST have a detrimental effect on the business community and
on private citizens in the lower socio-economic bracket?’
■ Consolidation
5 In each of the following cases comment on the relevance of each question, and on its nature
and type.
a A survey is intended to assess the preparedness of households in dangerous areas to
withstand bushfires.
i Do you own a fire extinguisher? Yes/No
ii How many hoses do you have? ____
iii How far from your house is the nearest heavy bush?
less than 5 m / 5–14 m / 15–30 m / more than 30 m
iv What penalties do you believe should be imposed for the deliberate lighting of
bushfires?
_________________________________________________________________
7 Which of the following survey questions are inappropriate due to the use of emotive
language?
a ‘Should greedy petrol companies be forced to stop raising the price of petrol?’
b ‘Should cruel and violent sports such as fox hunting and duck shooting be banned?’
c ‘Should banks be required by law to maintain at least one branch in every country
town?’
d ‘Should irresponsible parents who leave their children in hot cars while they gamble
away all their money at the casino face police charges?’
e ‘Should people who have been unemployed for at least 12 months be required to work
for their unemployment benefits?’
f ‘Should selfish people who smoke in restaurants and ruin other people’s meals be
banned?’
■ Further applications
8 Form a small group and choose one of the following focus questions, or make up your own
question. Your task is to create a full questionnaire that would provide you with enough
information to draw a valid conclusion. You should concentrate on preparing high-quality
questions that do not suffer from the weaknesses that have been discussed in this exercise.
• Would a speciality take-away food shop offering high-quality, diet-conscious items
(low fat, low sugar, low salt), using only natural, unmodified ingredients be successful
in your local area?
• Are female drivers in the age range 18–21 years safer drivers than males in the same age
group?
• Are the majority of students in Years 11 and 12 well-informed about the entry
requirements for their chosen university courses?
• Do teenage surfers use sufficient sun-screen lotion at the beach?
• Would it be desirable to invent a new sport that incorporates many of the good features
of existing sports but few if any of the bad features?
• Do students at this school get too much homework/too many assignments?
Note: The random number function on a calculator generates decimals with 3 decimal places.
These decimals can be multiplied by 1000 in order to obtain random integers.
Note: The Fill Right command can be used together with the Fill Down command to produce
a table of random numbers.
Note: If the end of a row or column is reached, continue by moving to the next row or column.
Example 1
Use the table of random numbers above to write down 5 two-digit numbers. Begin with the 4
in row 6 column 13 then move:
a to the right b to the left c upward d downward
Solutions
a The numbers are 46, 93, 12, 97, 74.
b The numbers are 46, 20, 50, 77, 84.
c The numbers are 46, 73, 57, 29, 30.
d The numbers are 46, 21, 40, 10, 55.
Note: In b, 05 or 5 is not a two-digit number, and in c, 08 or 8 is not a two-digit number.
Example 2
Use the random number function on a calculator to create a list of 5 random two-digit
numbers.
Solution
Press the RAN or RAND key 5 times and take the last two digits in the display each time as
the numbers. For example, if the display shows 0.638, 0.124, 0.337, 0.965, 0.027, then the
random numbers would be 38, 24, 37, 65, 27.
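The step of taking the last two digits of each display can be written as a short Python function. This is a sketch only: the function name is made up for illustration, and it assumes the calculator shows exactly 3 decimal places, as in the displays quoted above.

```python
def last_two_digits(display: float) -> int:
    """Take the last two digits of a 3-decimal-place display such as 0.638."""
    thousandths = round(display * 1000)  # 0.638 -> 638
    return thousandths % 100             # 638 -> 38

displays = [0.638, 0.124, 0.337, 0.965, 0.027]
print([last_two_digits(d) for d in displays])  # [38, 24, 37, 65, 27]
```

A display such as 0.305 would give 5, which is not a two-digit number, so a result like that would be skipped and the key pressed again.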
Example 3
Show how a spreadsheet could be used to generate a list of 10 numbers between:
a 1 and 60 b 30 and 50
Solutions
Go to Format in the toolbar, click on Cells, then click on Number, and set the cells to
0 decimal places.
a Type =RAND()*60 then select Fill Down and fill down 10 cells.
b Type =RAND()*(50-30)+30 then fill down 10 cells.
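For comparison, the same idea can be sketched in Python. The helper name below is hypothetical, and rounding is used to stand in for displaying a cell with 0 decimal places, which is an assumption about how the spreadsheet shows the value.

```python
import random

def spreadsheet_style(low: float, high: float, rng: random.Random) -> int:
    """Mimic a RAND()*(high-low)+low cell displayed with 0 decimal places."""
    return round(low + rng.random() * (high - low))

rng = random.Random(2024)  # fixed seed so the run is repeatable

# Part b: 10 whole numbers between 30 and 50.
print([spreadsheet_style(30, 50, rng) for _ in range(10)])

# Part a, using a function that gives each whole number from 1 to 60 an equal chance.
print([rng.randint(1, 60) for _ in range(10)])
```

One point to watch with the rounding approach: a value below 0.5 is displayed as 0, so the formula in part a can occasionally show 0, and the two end values appear about half as often as the others. `random.randint(1, 60)` avoids both effects.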
Exercise 12.4
Use the table of random numbers on page 442 to answer questions 1–5.
1 Beginning with the 5 in row 6, column 9, write down 5 one-digit numbers by moving:
a to the right b to the left c upward d downward
2 Beginning with the 6 in row 5, column 21, write down 5 two-digit numbers by moving:
a to the right b to the left c upward d downward
3 Beginning with the 1 in row 6, column 17, write down 5 three-digit numbers by moving:
a to the right b to the left c upward d downward
4 Beginning with the 3 in row 4, column 25, write down 10 two-digit numbers by moving:
a to the right b to the left c upward d downward
■ Consolidation
5 a Start at the 2 in row 1, column 5, then move to the right to find 10 two-digit numbers
that are less than 60.
b Start at the 7 in row 4, column 13, then move downward to find 10 three-digit numbers
that are greater than 400.
6 Use the random number key on your calculator to generate 5 random numbers that have:
a 1 digit b 2 digits c 3 digits
7 Using the random number function on your calculator, simulate the ages of 10 people
between 1 year and 50 years. Record the last two digits of any random numbers that fall in
the desired range.
8 Harriet has 85 friends she would like to invite to her wedding reception; however, she can
only afford to invite 25 people. To avoid offending anyone, she decided to choose the
names at random. Here is the method that Harriet used.
• Assign each person a number from 1 to 85.
• Use the random number function on a calculator to generate 25 random numbers.
• Multiply each number by 85, then round up to the nearest whole number.
• Invite the 25 people whose names correspond to these numbers.
Did each person on the list have an equal chance of being selected?
10 Use the Fill Down and Fill Right commands to create a table of 5-digit numbers. Your
table should have 10 rows and 40 columns (i.e. 8 groups of 5-digit numbers).
■ Further applications
13 Of the 2000 students at a particular university, 8 were to be chosen at random to travel
overseas and attend an international forum on ways to increase worldwide access to
education. The vice-chancellor assigned each of the students a number from 1 to 2000,
then used the random number function on his calculator to generate random numbers,
multiplying each number by 2000. Does each student at the university have an equal chance
of being selected? Explain your answer.
Example 1
In each of the following, state whether the sample chosen is a simple random sample, a
stratified random sample or a systematic random sample.
a Every 50th bottle of cola produced in a drink factory is selected and tested for sweetness,
fizz and general taste.
b The principal knows that there are 411 girls at her school. She writes each girl’s name on
a separate piece of paper, places the names in a barrel, spins the barrel, and selects
20 names.
c From a group of 500 police officers, 300 fire fighters and 200 ambulance paramedics, a
committee is formed to enquire into the response times of emergency services personnel.
A committee of 10 police officers, 6 fire fighters and 4 ambulance paramedics is chosen
randomly from their respective services.
Solutions
a The sample is a systematic random sample because the bottles are chosen at equally
spaced intervals.
b The sample is a simple random sample because the names are chosen at random with each
girl having an equal chance of being selected.
c The sample is a stratified random sample because the members of the committee are in the
same proportion as those in the population. (1 person in every 50 has been chosen from
each group.)
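The three sampling techniques in this example can be sketched in Python as follows. The function names, the run of 1000 bottles and the numbered lists standing in for people are all assumptions made for illustration; only the group sizes come from the example.

```python
import random

def simple_random_sample(population, k, rng):
    """Every member has the same chance of being chosen."""
    return rng.sample(population, k)

def systematic_sample(population, interval):
    """Choose members at equally spaced intervals (every `interval`-th one)."""
    return population[interval - 1::interval]

def stratified_sample(strata, sample_size, rng):
    """Choose randomly within each group, in proportion to the group's size."""
    total = sum(len(members) for members in strata.values())
    chosen = {}
    for name, members in strata.items():
        # Rounding can make the shares miss the target total in other cases.
        share = round(len(members) / total * sample_size)
        chosen[name] = rng.sample(members, share)
    return chosen

rng = random.Random(5)  # fixed seed so the run is repeatable
girls = [f"girl {i}" for i in range(1, 412)]   # 411 names in the barrel
bottles = list(range(1, 1001))                 # hypothetical run of 1000 bottles
services = {
    "police": list(range(500)),
    "fire": list(range(300)),
    "ambulance": list(range(200)),
}

print(len(simple_random_sample(girls, 20, rng)))   # 20
print(systematic_sample(bottles, 50)[:3])          # [50, 100, 150]
committee = stratified_sample(services, 20, rng)
print({name: len(members) for name, members in committee.items()})
# {'police': 10, 'fire': 6, 'ambulance': 4}
```

The stratified version rounds each group's share to the nearest whole number, so when the shares do not divide evenly the totals may need to be adjusted by hand.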
Example 2
A senate committee of 20 politicians is to be formed from members of the Labor and Liberal
parties. The number of representatives from each party is to be in the same proportion as the
number of seats that each party holds in the senate. If Labor holds 64 seats and the Liberals
hold 44 seats, how many members of each party should be on the committee?
Solution
No. of Labor members = 64/108 × 20 = 11.9 (1 decimal place)
No. of Liberal members = 44/108 × 20 = 8.1 (1 decimal place)
∴ The committee should consist of 12 Labor members and 8 Liberal members.
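The same proportional calculation can be checked with a short Python sketch. The function name is hypothetical; the seat numbers and the committee size are those given in the example.

```python
def proportional_allocation(seats, committee_size):
    """Share out committee places in proportion to the seats each party holds."""
    total = sum(seats.values())
    return {party: held / total * committee_size for party, held in seats.items()}

exact = proportional_allocation({"Labor": 64, "Liberal": 44}, 20)
for party, value in exact.items():
    print(party, round(value, 1), "->", round(value))
# Labor 11.9 -> 12
# Liberal 8.1 -> 8
```

Here the rounded values add up to 20. With other seat numbers they might not, in which case one of the values would have to be adjusted.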
Exercise 12.5
1 Explain how a simple random sample of 2 people could be obtained from a group of
6 people by using a die.
2 At a school fete, 30 tickets are sold in a raffle. A wheel with the numbers from 1 to 30 is
then spun to determine the prize winner. What kind of sampling technique does this
illustrate?
3 Karen bought 5 tickets in a lottery in which 100 000 tickets are sold.
a What kind of sampling technique is used to determine lottery winners?
b What are Karen’s chances of winning first prize?
4 Gina won 6 concert tickets in a competition and decided to select at random 5 friends from
work to go along with her. To do this, Gina assigned each of her workmates a number from
1 to 42, then randomly selected a line of numbers from a table of random numbers.
Beginning at the first digit, Gina selected the numbers of the 5 people who would go to the
concert with her.
93702 78650 58027 91297 73438 75819 66753 41102 64473
a Find the numbers of the people who were selected.
b Was the procedure that she followed fair?
■ Consolidation
5 Owing to a significant fall in profits, the owner of a small business had to lay off 6 of his
36 employees. He decided that the 6 people would be chosen randomly by placing their
names in a grid as shown. A die would then be rolled twice to give a pair of co-ordinates.
The first number rolled gives the column number and the second number rolled gives the
row number. The employee whose name appears at those co-ordinates would be laid off.
6 | Emma    | David    | Chris     | Jisun   | Wendy    | Bob
5 | Chandra | Kim      | Richard   | William | Tarita   | Nigel
4 | Elise   | Caitlyn  | Trent     | Navarre | Lionel   | Jenna
3 | Laurie  | Ursula   | Mikhael   | Len     | Roger    | Merv
2 | Amber   | Taleisha | Yee Ling  | Emille  | Yuri     | Patricia
1 | Thierry | Doug     | Celestine | Perry   | Nicholle | John
  | 1       | 2        | 3         | 4       | 5        | 6
The numbers rolled in order were: 4 3 5 2 1 6 2 5 4 1 6 4.
a Write down the co-ordinates of the 6 employees.
b Which employees were laid off?
c Did each employee have an equal chance of being selected?
6 A factory produces 420 mouse traps every day. Find the number of traps tested if quality
control tests a systematic sample of 1 trap in every:
a 70 b 60 c 35 d 15
7 If 6000 computer chips are produced each day on a production line, find at what intervals
the manufacturer should test the chips in a systematic sample of:
a 120 b 75 c 50 d 15
8 Explain how we could choose a systematic sample of 25 people from a group of 350 people.
9 Every 25th light globe that passes along a production line is removed and its brightness
tested. If 4000 light globes are produced each day, find the number of globes that will be
tested.
10 The Student Representative Council at Pascal High has assigned each of the 120 Year 12
students at the school a number from 1 to 120. The Council wants to obtain a systematic
sample of 15 students from the year group in order to conduct a survey. Which students
would be selected if the first student selected is number 65?
11 Jan is a telesales consultant who wishes to choose a sample of 200 people from the
residential section of the Sydney White Pages telephone directory in order to conduct
a marketing survey.
a How could she choose the names in such a way that the sample is unbiased?
b How could she choose a biased sample?
12 In a group of 1200 workers, 700 are male and 500 are female. A stratified sample of 60
workers is to be selected based on gender.
a How many males should be selected?
b How many females should be selected?
13 A primary school has 250 girls and 150 boys. How many boys and girls should be chosen
in a stratified sample of:
a 80 children? b 48 children? c 32 children?
14 Of the 132 employees at a building site, 110 are blue-collar workers and the rest are
white-collar workers. A survey is to be conducted among the workers to gather information
about safety conditions on the site. How many blue-collar and white-collar workers should
be chosen in a stratified sample of 24 workers?
15 The table below shows the number of people in each age group at a nursing home. A group
of 25 residents is to be chosen, using a stratified sample based on age group, to enquire into
the needs of the residents.
Age group (years) 70–74 75–79 80–84 85–89 90 +
Number of people 15 24 21 12 3
a How many people live in the nursing home?
b How many people from each age group should be chosen in the sample?
16 The table below shows the number of students in each group at Rocky Mountain High. The
assistant principal wants to conduct a survey into the use of technology in the school by
using a stratified sample of 100 students.
Year group 7 8 9 10 11 12
Students 155 172 167 143 130 109
a How many students attend the school?
b How many students from each year group should be chosen to participate in the survey?
18 In a sports club raffle with 20 prizes, 250 red tickets, 150 blue tickets and 100 green tickets
were sold. How many prizes would you expect to be won by someone holding:
a a red ticket? b a blue ticket? c a green ticket?
19 A soft drink company produced 54 000 bottles of lemonade last month. Every 300th bottle
was removed from the production line and tested for taste, sweetness and fizz.
a How many bottles of lemonade were tested?
b If 3 bottles in the sample failed the quality assurance test, find the number of bottles
produced last month that were below standard.
■ Further applications
21 A bag contains 81 black marbles and a number of white marbles. Stuart chose 30 marbles
from the bag without replacement, of which 12 were white. How many marbles were
originally in the bag?
22 Comment on any bias in each of these simple random samples.
a The names of 20 people are written on separate pieces of paper, which are folded and
placed in a bag. Five names are drawn out.
b Twelve people are each assigned a number from 1 to 12. Two dice are then rolled to
determine the names of 3 people who are to move to a new air-conditioned office.
c A number from 1 to 99 is assigned to each person in a group. A coin is then tossed to
select the employees who will have to work on New Year’s day. If the coin shows
heads, then the odd-numbered employees will have to work. If the coin shows tails, then
the even-numbered employees will have to work.
This definition of the mean can also be written using the Greek letter Σ (pronounced sigma),
where Σ means ‘the sum of’.
$\bar{x} = \dfrac{\sum x}{n}$, where
• $\bar{x}$ is the mean
• $\sum x$ is the sum of the scores
• $n$ is the number of scores
The mean does not have to be one of the scores.
The total Σf at the base of the frequency column represents the number of scores, while the total
Σfx at the base of the fx column represents the sum of the scores.
Note: For some calculators, multiple scores are entered by pressing score 2ndF ’
frequency M+ .
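The two ways of finding the mean described above can be sketched in Python. This is only an illustration: the function names `mean` and `mean_from_table` and the sample scores are made up for the sketch and do not come from the exercises.

```python
def mean(scores):
    """Mean = (sum of the scores) / (number of scores)."""
    return sum(scores) / len(scores)


def mean_from_table(table):
    """Mean of a frequency distribution table.

    `table` maps each score x to its frequency f, so the
    mean is (sum of fx) / (sum of f).
    """
    total_f = sum(table.values())                     # number of scores
    total_fx = sum(x * f for x, f in table.items())   # sum of the scores
    return total_fx / total_f


# Illustrative data only: the same five scores written both ways.
scores = [4, 6, 6, 9, 10]
table = {4: 1, 6: 2, 9: 1, 10: 1}

print(mean(scores))            # 7.0
print(mean_from_table(table))  # 7.0
```

Both calls give the same answer because the frequency table is simply a shorter way of writing the same list of scores.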
Exercise 12.6
1 Find the mean of each set of scores, correct to 2 decimal places where necessary.
a 7, 3, 12, 9, 4 b 6, 12, 11, 4, 5, 10 c 14, 25, 32, 29
d 5, 18, 12, 41, 34 e 84, 37, 91, 66, 54 f 8, 1, 3, 2, 4, 3, 6, 5
g 4.3, 0.9, 6.2, 3.1, 5.7, 8.8 h 17.2, 15.9, 11.4, 3.6, 19.7 i -2, 6, -8, 1, -3, -12
■ Consolidation
3 Copy and complete each frequency distribution table, then find the mean, correct to
2 decimal places.
a
Score (x)   Frequency (f)   fx
1           4
2           1
3           6
4           2
5           7
            Σf =            Σfx =
b
Score (x)   Frequency (f)   fx
15          3
16          4
17          7
18          5
19          2
            Σf =            Σfx =
4 Use your calculator to find the mean of each data set, correct to 1 decimal place.
a Score 1 2 3 4 5 b Score 4 5 6 7 8
Frequency 12 15 10 19 22 Frequency 30 25 15 40 35
c Score 20 21 22 23 24 d Score 35 36 37 38 39
Frequency 52 38 90 46 61 Frequency 22 24 27 23 21
e Score 48 49 50 51 52 f Score 10 11 12 13 14
Frequency 60 20 80 70 40 Frequency 6 18 21 17 13
[Frequency graphs, including parts c and d (horizontal axis: Score; vertical axis: Frequency), are not reproduced here.]
8 The weekly wages of four tradesmen are $580, $610, $595 and $615. Find their mean
weekly wage.
9 Last season a netball player scored the following number of goals per game before getting
injured. What was her goal average per game?
7 9 6 9 5 10 3
10 The carriages on an inter-city train were inspected for signs of vandalism. The number of
seats that were damaged in each carriage are listed below. Find the average number of
damaged seats per carriage.
4 3 5 0 10 8
13 a The mean of a set of scores is 15 and the sum of the scores is 270. Find the number of
scores.
b A set of scores has a mean of 58.2 and a sum of 1280.4. How many scores are there?
14 a The mean of 4 scores is 27. If three of the scores are 18, 29 and 43, find:
i the sum of the scores ii the 4th score
b The mean of 5 scores is 16. If four of the scores are 15, 19, 32 and 10, find:
i the sum of the scores ii the 5th score
15 a Expressed as percentages, Nikki’s science test results so far this year are 62, 78, 65, 71
and 66. What mark must she score on the final test in order to have an average of 70%
for the year?
b The daily profits for Monday to Thursday for a bookseller are $85, $92, $80 and $96.
What must his profit be on Friday in order to raise the daily profit average to $90 for
that week?
c At the start of a new cricket season, Jacques scored 42, 39, 75, 10 and 62 in his first five
innings. How many runs must he score in the next innings to raise his batting average
to 50 runs per innings?
■ Further applications
16 The data below shows the mass in kilograms of 40 parcels that are to be sent by rail freight
to various destinations within NSW.
21 13 9 19 14 23 7 28 12 12
4 25 17 26 8 14 23 18 9 13
15 16 22 3 29 19 17 11 14 20
16 25 8 7 2 28 15 12 10 15
a Complete this frequency distribution table for the given data.
Class    Class centre (x)   Tally   Frequency (f)   fx
1–5      3                  |||     3               9
6–10     8
11–15
16–20
21–25
26–30
b Why would it be inappropriate to tabulate this data as individual scores?
c What does Σfx represent in a grouped data distribution table?
d How will this affect the calculation of the mean?
e Find the approximate mean, correct to 1 decimal place.
17 a The mean of a set of 7 scores is 15. Find the new mean when a score of 23 is added to
the set.
b The mean of a set of 36 scores is 17.5. Find the new mean when a score of 110 is added
to the set.
c The mean of a set of 21 scores is 12. Find the new mean, correct to 1 decimal place,
when a score of 42 is added to the set.
d The mean of a set of 15 scores is 7.8. Find the new mean, correct to 1 decimal place,
when a score of 26 is added to the set.
18 a The mean of a set of 11 scores is 27. After a new score is added, the mean falls to 25.
Find the score that was added.
b The mean of a set of 18 scores is 14. After one of the scores is taken out, the mean falls
to 12. Find the score that was taken out.
19 The mean of 4 consecutive scores is 18.5. Form an equation and solve it to find the scores.
When a set of scores have been arranged in ascending order, the median is:
the middle score if there is an odd number of scores
the average of the two middle scores if there is an even number of scores.
The median is not difficult to find if the number of scores is small. However, to find the median
of a large number of scores, as might be the case in a frequency distribution table, histogram or
polygon, we use the following rules.
When a set of n scores have been arranged in ascending order, the median is:
the $\left(\dfrac{n+1}{2}\right)$th score if n is odd
the average of the $\left(\dfrac{n}{2}\right)$th and $\left(\dfrac{n}{2}+1\right)$th scores if n is even.
Example 1
Find the median of each set of scores.
a 3, 5, 8, 10, 17 b 2, 5, 7, 11, 12, 15
Solutions
a There is an odd number of scores and 8 is the middle score. Therefore, the median is 8.
b There is an even number of scores and the two middle scores are 7 and 11.
The median = $\dfrac{7 + 11}{2} = 9$
Example 2
Where does the median lie in a set of:
a 49 scores? b 110 scores?
Solutions
a The number of scores is odd, ∴ the median is the $\left(\dfrac{n+1}{2}\right)$th score, where n = 49.
That is, the median is the $\left(\dfrac{49+1}{2}\right)$th, or 25th score.
b The number of scores is even, ∴ the median lies between the $\left(\dfrac{n}{2}\right)$th and $\left(\dfrac{n}{2}+1\right)$th scores,
where n = 110. That is, the median lies between the $\left(\dfrac{110}{2}\right)$th and $\left(\dfrac{110}{2}+1\right)$th, or 55th
and 56th scores.
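The position rules, and the answers to Examples 1 and 2, can be checked with a short Python sketch. The function names `median_positions` and `median` are chosen here for illustration only.

```python
def median_positions(n):
    """Return the position(s) of the median in n ordered scores.

    Positions are counted from 1, as in the rules above.
    """
    if n % 2 == 1:
        return ((n + 1) // 2,)        # one middle score
    return (n // 2, n // 2 + 1)       # two middle scores


def median(scores):
    """Median of a list of scores, using the position rules."""
    ordered = sorted(scores)          # arrange in ascending order first
    middle = [ordered[p - 1] for p in median_positions(len(ordered))]
    return sum(middle) / len(middle)  # average of one or two scores


print(median([3, 5, 8, 10, 17]))       # 8.0  (Example 1a)
print(median([2, 5, 7, 11, 12, 15]))   # 9.0  (Example 1b)
print(median_positions(49))            # (25,)     (Example 2a)
print(median_positions(110))           # (55, 56)  (Example 2b)
```

Note that the sketch always returns the median as a decimal, so 8 is printed as 8.0.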
Example 3
Find the median score in this frequency distribution table.
Score   Frequency
1       12
2       9
3       23
4       40
5       15
        Σf = 99
Solution
There are 99 scores, which is odd. Therefore the median lies in the $\left(\dfrac{99+1}{2}\right)$th position.
That is, the median is the 50th score. To find which score is the 50th, simply add the frequencies one at a time from
top to bottom as follows.
The first 12 scores are 1s.
12 + 9 = 21, ∴ the 21st score is a 2.
21 + 23 = 44, ∴ the 44th score is a 3.
44 + 40 = 84, ∴ the 84th score is a 4.
Now, the 45th, 46th, . . . 84th scores are all 4s, including the 50th score.
Therefore the median is 4.
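The method of adding the frequencies one at a time can be written as a short Python sketch. The function names `score_at_position` and `median_from_table` are made up for this illustration; the data is the table from Example 3.

```python
def score_at_position(table, position):
    """Find the score in a given position (counting from 1).

    `table` maps each score to its frequency. The frequencies are
    added from the lowest score upwards until the running total
    reaches the required position.
    """
    running_total = 0
    for score in sorted(table):
        running_total += table[score]
        if position <= running_total:
            return score
    raise ValueError("position is larger than the number of scores")


def median_from_table(table):
    """Median of a frequency distribution table."""
    n = sum(table.values())
    if n % 2 == 1:
        return score_at_position(table, (n + 1) // 2)
    lower = score_at_position(table, n // 2)
    upper = score_at_position(table, n // 2 + 1)
    return (lower + upper) / 2


example_3 = {1: 12, 2: 9, 3: 23, 4: 40, 5: 15}
print(median_from_table(example_3))   # 4
```

The running totals inside the loop are 12, 21, 44 and 84, which are the same sums shown in the solution.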
Exercise 12.7
1 Find the median in each set of scores for which the number of scores is odd.
a 2, 5, 9, 10, 16 b 1, 2, 2, 4, 5, 7, 8 c 3, 5, 5, 7, 8, 9, 9, 9, 14
2 Find the median in each set of scores for which the number of scores is even.
a 5, 6, 10, 13 b 3, 4, 4, 8, 9, 11, 11, 15 c 2, 3, 3, 3, 4, 7, 7, 9, 9, 10
3 Arrange these scores in ascending order, then find the median. Each set has an odd number
of scores.
a 7, 4, 6, 1, 3 b 25, 21, 29, 26, 3
c 14, 18, 21, 24, 19, 16, 30 d 8, 13, 5, 19, 20, 7, 11
e 5.2, 7.6, 8.3, 0.9, 6.3 f 11.2, 19.5, 1.4, 16.6, 17.9
4 Arrange these scores in ascending order, then find the median. Each set has an even number
of scores.
a 8, 11, 3, 6 b 17, 14, 15, 10, 16, 19
c 23, 26, 18, 25, 28, 15 d 8, 2, 6, 1, 8, 3, 9, 4
e 37.5, 42.7, 24.9, 55.1 f 5.9, 3.1, 5.9, 8.2, 1.8, 9.5
■ Consolidation
5 Find the position of the median in a set of:
a 13 scores b 27 scores c 45 scores
d 59 scores e 81 scores f 117 scores
7 How many scores are there in a distribution if the number of scores is odd and the median
lies in the:
a 10th position? b 26th position? c 47th position? d 93rd position?
8 How many scores are there in a distribution if the number of scores is even and the median
lies between the:
a 6th and 7th scores? b 15th and 16th scores?
c 34th and 35th scores? d 71st and 72nd scores?
e  x   f      f  x   f      g  x   f      h  x   f
   1   3         10  3         41  16        94  13
   2   5         11  8         42  9         95  17
   3   1         12  7         43  5         96  20
   4   4         13  6         44  8         97  21
   5   5         14  5         45  10        98  16
   6   2         15  7         46  2         99  9
10 Arrange these scores into a frequency distribution table, then find the median.
23 26 25 23 26 25 21 27 24 22
26 25 25 22 24 26 21 23 25 21
27 22 25 24 26 26 21 25 22 23
[Frequency graphs (horizontal axis: Score; vertical axis: Frequency) are not reproduced here.]
14 A real-estate agent sold 8 houses last week with the following values. Calculate the median
house price.
• $240 000 • $299 000 • $206 000 • $355 000
• $312 000 • $268 000 • $425 000 • $194 000
15 Students were asked to line up in ascending order of height for a sporting photograph.
Which of these students has the median height?
• Alex: 158 cm • Gary: 160 cm • Terry: 153 cm • Tran: 149 cm
• Mikhael: 156 cm • Demir: 162 cm • Timothy: 155 cm
16 Cathy’s times for recent 400 m runs are as follows: 51.2 s, 50.8 s, 52.3 s, 49.7 s, 50.0 s,
48.6 s. What was her median time?
■ Further applications
17 The mean of 5 numbers is 14. If four of the numbers are 19, 10, 7 and 18, find the median.
18 Write down a set of 5 numbers in which the median is 9 and the mean is 11.
19 Write down a set of 6 numbers in which the median and the mean are equal in value.
Example 1
Find the mode for each set of scores, where possible.
a 5, 7, 8, 8, 8, 11 b 3, 3, 5, 5, 7, 10, 16 c 1, 3, 5, 7, 9
Solutions
a The score of 8 occurs three times, which is more often than the other scores, which only
occur once. Therefore, the mode is 8.
b There are two 3s and two 5s, which is more often than the other scores, which only occur
once. Therefore, the modes are 3 and 5.
c Each score occurs once, so there is no score that occurs more than all the others. Therefore,
there is no mode.
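The three cases in Example 1 (one mode, two modes, no mode) can be sketched in Python. The function name `modes` is made up for this illustration, and the sketch assumes, as in part c, that a set has no mode when every score occurs only once.

```python
from collections import Counter


def modes(scores):
    """Return a list of the modes (an empty list if there is no mode).

    Assumption for this sketch: if every score occurs only once,
    the set is treated as having no mode.
    """
    counts = Counter(scores)              # how often each score occurs
    highest = max(counts.values())
    if highest == 1:
        return []                         # no score occurs more than once
    return sorted(s for s, c in counts.items() if c == highest)


print(modes([5, 7, 8, 8, 8, 11]))        # [8]
print(modes([3, 3, 5, 5, 7, 10, 16]))    # [3, 5]
print(modes([1, 3, 5, 7, 9]))            # []
```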
Example 2
Find the range of 7, 15, 21, 33, 45, 60.
Solution
Range = highest score − lowest score
= 60 − 7
= 53
Example 3
For the scores in this frequency distribution table, find:
a the mode b the range
Score       15   16   17   18   19   20   21
Frequency    6    9   10   14   12   11    4
Solutions
a The score with the highest frequency is 18, so 18 is the mode.
b Range = highest score − lowest score
= 21 − 15
= 6
Exercise 12.8
[Two frequency graphs (horizontal axis: Score; vertical axis: Frequency) are not reproduced here.]
6 On a given day, the sales assistants at Sam’s Shoe Store sold shoes of the following sizes.
8 12 8 10 11 9 8 10 9 12
12 11 8 8 10 9 9 8 11 7
10 8 7 9 12 10 7 8 11 8
7 8 11 10 12 8 8 9 10 8
a Organise the data into a frequency distribution table with score, tally and frequency
columns.
b Find the modal shoe size.
c Find the range of shoe sizes.
7 For one month Robyn recorded the number of hours she spent each weeknight doing
homework. The results are recorded below.
2 5 1 4 3 0 1 4
4 1 2 3 2 1 4 2
0 2 1 5 0 2 3 3
a Express the data in the form of a dot plot.
b Find the modal homework time.
c Determine the range of these homework times.
■ Further applications
The score that lies halfway between the lowest score and the median is called the lower
quartile. Similarly, the score that lies halfway between the median and the highest score is
called the upper quartile. The difference between the upper quartile and the lower quartile
is called the inter-quartile range.
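The quartile definitions above can be sketched in Python. This is a sketch under one common convention, which is an assumption here: when the number of scores is odd, the median itself is left out of both halves. The function names and the sample scores are made up for the illustration.

```python
def median_of_ordered(ordered):
    """Median of a list that is already in ascending order."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(scores):
    """Return (lower quartile, upper quartile).

    Assumption: the lower quartile is the median of the scores below
    the median, and the upper quartile is the median of the scores
    above it.
    """
    ordered = sorted(scores)
    n = len(ordered)
    lower_half = ordered[: n // 2]          # scores below the median
    upper_half = ordered[(n + 1) // 2 :]    # scores above the median
    return median_of_ordered(lower_half), median_of_ordered(upper_half)


# Illustrative data only.
scores = [12, 1, 9, 4, 6, 3, 8]
lower, upper = quartiles(scores)
print(lower, upper)      # 3 9
print(upper - lower)     # 6 (the inter-quartile range)
```

In ascending order the scores are 1, 3, 4, 6, 8, 9, 12. The median is 6, the lower quartile is the middle of 1, 3, 4 and the upper quartile is the middle of 8, 9, 12.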
8 For each set of scores below:
i arrange the scores in ascending order ii find the median
iii find the upper and lower quartiles iv find the inter-quartile range
a 8, 12, 10, 14, 9 b 10, 7, 5, 22, 29, 21, 14, 19, 24
c 33, 26, 43, 22, 24, 29, 28, 27, 39 d 17, 20, 31, 23, 19, 27, 15
Example 1
Two Year 8 students are competing with each other to see who will represent the school
in a statewide Mathematics competition. The table below shows the test and exam scores
(as percentages) of each student throughout the year.
           Test 1  Test 2  Test 3  Exam  Test 4  Test 5  Test 6  Exam
Kate       92      89      96      82    90      95      82      86
Victoria   90      90      79      84    95      97      83      82
a Find the mean, median, mode and range of each student’s scores.
b Which student should be chosen to represent the school? Why?
Solutions
a This table shows the mean, median, mode and range of each student’s scores.
           Mean   Median   Mode   Range
Kate       89     89.5     82     14
Victoria   87.5   87       90     18
b Kate has the higher mean and median scores. Victoria has the higher mode and the highest
score; however, she also has the lowest score. Kate’s results are more consistent overall,
so she should be chosen to represent the school.
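The summary table in this example can be checked with Python's built-in `statistics` module. The helper name `summary` is made up for this sketch; the scores are those given for Kate and Victoria.

```python
import statistics


def summary(scores):
    """Return the mean, median, mode and range of a list of scores."""
    return {
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "mode": statistics.mode(scores),
        "range": max(scores) - min(scores),
    }


kate = [92, 89, 96, 82, 90, 95, 82, 86]
victoria = [90, 90, 79, 84, 95, 97, 83, 82]

print(summary(kate))
print(summary(victoria))
```

The printed values agree with the table: 89, 89.5, 82 and 14 for Kate, and 87.5, 87, 90 and 18 for Victoria. `statistics.mode` is suitable here because each student has exactly one most common score.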
Exercise 12.9
1 The table shows the test scores for two boys, Dennis and Han, over the past year.
Test 1 2 3 4 5 6 7
Dennis 65 82 88 79 88 84 81
Han 83 74 66 76 83 93 64
a i Who scored the highest mark during the year?
ii Who scored the lowest mark?
b In how many tests did Dennis score a higher mark than Han?
c Which student has the higher total?
d For each student, find the:
i mean ii median iii mode
e Which student was the more consistent? Why?
2 The table shows the average maximum monthly temperatures (°C) in Hobart and Sydney.
Month Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
Hobart 21.9 21.5 20.1 17.4 14.4 12.1 11.8 13.0 15.2 16.8 18.3 20.1
Sydney 25.8 25.6 24.6 22.3 19.2 16.7 16.1 17.6 19.7 21.9 23.6 25.1
a Find the mean maximum temperature for each city.
b Find the median maximum temperature for each city.
c In which month are the maximum temperatures closest?
d Which city has the warmer climate overall?
■ Consolidation
3 The area chart shows both the annual profits and the costs of the CPC Pastry Company from
1994 to 2003.
[Area chart: Profit/cost margin at CPC Ltd. Horizontal axis: Year (1994 to 2003); vertical axis: Profit/costs (× $1000), from 0 to 800; series: Annual profit, Annual costs. The chart is not reproduced here.]
a Use the graph to estimate the company’s annual profit in i 1994 ii 2003.
b Use the graph to estimate the company’s annual costs in i 1994 ii 2003.
4 This bar graph shows the number of customers who visited some of the departments at the
Grace Jones and David Bros department stores on the first day of the mid-year sales period.
[Bar graph: Department store customers. Vertical axis: Department; horizontal axis: Number of customers (0 to 1200). Each department has two bars, given in the order of the legend: Grace Jones, David Bros.]
Ladieswear         1054   1125
Menswear            775    830
Furniture           610    564
Electrical goods    362    335
Kitchenware         243    290
Manchester          191    228
a Which department in each store had the most customers?
b How many more customers visited the kitchenware section at David Bros than at
Grace Jones?
c Is it appropriate to try and find the mean or median departments visited? Why?
d Which store had the most customers in total and by what margin?
e What percentage of customers overall shopped at David Bros?
5 The side-by-side column graph shows the number of flights flown per day by Kangaroo
Airlines and Wombat Air.
[Side-by-side column graph: Comparison of airline flights. Horizontal axis: Day of the week (Mon to Sun); vertical axis: Flights per day (0 to 20); series: Kangaroo Airlines, Wombat Air. The graph is not reproduced here.]
a Which airline and on which day does one company have 11 flights?
b Which airline has more flights on the weekend?
c What is the minimum number of flights that take place on any one day?
d On how many days does Kangaroo Airlines have more than 14 flights?
e How many more flights does Wombat Air have on Wednesday than Kangaroo Airlines?
f On which days does Wombat Air have more flights than Kangaroo Airlines?
g Calculate the mean number of flights per day for each airline.
6 This stem-and-leaf plot shows the end of year exam results for two Year 8 Mathematics
classes. There are 24 students in each class.
    8 Green | Stem | 8 Gold
          8 |  4   | 7 9
      6 3 0 |  5   | 1 2 4 5 8
    7 5 2 1 |  6   | 3 5 6 7 8 9
6 4 2 2 2 9 |  7   | 0 1 4 4 6
8 8 7 4 3 2 |  8   | 4 5 6 8
    6 5 3 0 |  9   | 0 7
a Write down the highest and lowest scores for each class.
b What is the range for each class?
c Find the mean, median and mode for each class.
d Estimate the combined mean of the two classes.
e Calculate the combined mean if the exam scores in 8 Green have a sum of 1793 and the scores in
8 Gold have a sum of 1659.
f Which class performed better on the exam? Why?
7 The ages of the men and women who took part in a radio music survey are shown in this
back-to-back stem-and-leaf plot.
a How many: Women
Men
i women were surveyed? (0)
ii men were surveyed? 10 2 122234
(5)
b Write down the age of the: 877 2 58899
(0)
i youngest woman ii youngest man 4332 3 01233
iii oldest woman iv oldest man (5)
986 3 6678
c Find the mode for each set of data. 3 2 2 0 4(0) 1 2 2 3 4
d What are the age ranges of these men and
8 8 8 7 6 5 5 4(5) 5 6 7 7
women?
e How many men are less than 40 years of age? 4 4 2 1 1 5(0) 1 3 4
f How many women are 50 years of age or more?
g Find the median ages of the men and the women.
h Calculate the mean ages of the men and the women.
8 The data below shows the waiting times per customer at two different banks between
11:00 am and 1:00 pm on Thursday.
Northpac Bank
14  7 12 11  7  8  9
11  6 21 13  3 10  7
12 13 11  6  8 11  9
11  8 12  9 14  9 13
10  7  5 13 20  7  8
International Bank
11 12 10  9 12 12  7
10 12 13 15 22  9  6
 5 14 14  9  8  7  9
11  9 12 15 12 11  9
11 13  9 13  7 13 14
■ Further applications
9 This radar chart shows the average monthly rainfall in millimetres in Sydney and Brisbane.
[Radar chart: the months Jan to Dec are arranged around the circle; the radial scale runs from 0 to 160 in steps of 40; series: Sydney, Brisbane. The chart is not reproduced here.]
a Which city experiences more rain in
i summer? ii winter?
b During which 3 months do both cities experience about the same amount of rainfall?
c Which city has the wettest month?
d Estimate the rainfall in:
i Sydney in April ii Brisbane in June
e Approximately how much more rain falls in Brisbane than in Sydney during November?
Example 1
Explain why the mode is not a good average for the scores 2, 4, 7, 11, 15, 17.
Solution
Each score occurs once only, hence there is no clear mode.
Example 2
Explain why the median is not a good average for the scores 12, 12, 12, 12, 14, 19, 21.
Solution
The median is 12, which is one of the extreme or end scores. Therefore it is not a good average.
Example 3
Explain why the mean is not a good average for the scores 3, 18, 19, 21, 24, 26.
Solution
The low score of 3, called an outlier, is so far from the other scores that it would have too great
an influence on the mean, which would then not be truly representative of the scores overall.
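The effect of the outlier in Example 3 can be seen by finding the mean with and without the score of 3, and comparing both with the median. This is a short Python sketch that uses the built-in `statistics` module; the variable names are chosen for illustration.

```python
import statistics

scores = [3, 18, 19, 21, 24, 26]
without_outlier = [s for s in scores if s != 3]   # remove the outlier

mean_with = statistics.mean(scores)
mean_without = statistics.mean(without_outlier)
median_with = statistics.median(scores)

print(mean_with)      # 18.5
print(mean_without)   # 21.6
print(median_with)    # 20.0
```

The single low score pulls the mean down from 21.6 to 18.5, which is lower than five of the six scores. The median of 20 is much less affected by the outlier.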
Exercise 12.10
1 In which of the following is the mean not a good average for each set of scores?
a 1, 2, 3, 5, 6 b 8, 8, 9, 12, 51 c 2, 23, 24, 24, 25, 27
d 17, 18, 20, 23, 25 e 45, 47, 48, 74 f 3.2, 3.3, 3.6, 3.6, 8.5
2 In which of the following is the median not a good measure of a typical score?
a 4, 4, 5, 7, 8 b 9, 9, 9, 9, 15, 17, 18 c 20, 21, 25, 27
d 3, 4, 4, 6, 6, 6, 6 e 12, 12, 12, 16 f 4.3, 4.4, 4.6, 4.6, 4.7, 4.9
3 In which of the following is the mode not a good measure of a typical score?
a 1, 2, 4, 5, 7, 8 b 10, 11, 11, 11, 13, 15 c 2, 2, 2, 2, 4, 5, 5
d 40, 41, 42, 42, 46 e 32, 33, 34, 36, 38, 39 f 7.1, 7.2, 7.3, 7.3, 7.3, 7.7
■ Consolidation
4 Each week, Tamiko’s maths teacher gives the class 15 quick questions for which the
students cannot use a calculator. Tamiko’s results so far this year are:
13 14 15 12 11 5 12 13
11 15 12 14 14 13 14 15
Which measure of location would be:
a the least appropriate? b the most appropriate?
5 During January, the MacMillan real-estate agency sold 8 houses at auction. The sale prices
were:
$225 000 $260 000 $260 000 $270 000
$282 000 $284 000 $297 000 $630 000
a Find the mean, median and modal auction prices.
b Which measure of location would be the fairest to use as the average house price?
c The agency advertised that its average sale price for the month was $313 500. Is this a
fair statement? Explain your answer.
7 While speaking to a group of parents who were considering enrolling their daughters in
the school, the principal stated that ‘the average number of students per class at this school
is 16.2’.
a Could this average be the mode? Explain your answer.
b Which measure of location do you think the principal chose to use? Why?
8 In an international diving competition, there are 7 people on the judging panel. Each person
gives the diver a score out of 10, correct to 1 decimal place. The highest and lowest scores
are removed and the remaining scores are averaged to calculate the actual score for the dive.
a How do we know that the median is not being used to calculate the diver’s score?
b Why would it be difficult to use the mode here?
c Why are the highest and lowest scores not used?
10 A bank manager examined the data on the number of people who entered the bank in each
hourly time slot during the day. Would he be more concerned with the mean, the median or
the mode of this data? Why?
■ Further applications
11 Which measure of location gives the fairest description for each of these distributions? Why?
[Four distributions, a to d, each drawn on a scale of scores from 1 to 7, are not reproduced here.]
You will need to work in small groups of 3–5 students to complete this activity.
Having completed the topics on data representation, analysis and evaluation, you should
now be able to use the skills that you have learned to conduct a statistical investigation of
your own. You should try to follow the steps below when conducting your investigation.
1 Pose a key question that you want to investigate.
• Check that your question is clear, easy to understand and unambiguous.
• Define carefully all of the terms in the question that you consider are important or
relevant to the investigation.
• Make a list of any sources that you expect to use to obtain information. This will of
course be expanded as the investigation takes shape.
• Determine whether the key question needs to be refined in any way as a result of
your planning.
2 Decide on the methods or techniques that should be used to collect the data.
• Consider any possible difficulties in obtaining data from the entire population.
Would it be more appropriate to use a census or a sample?
• If a sample is to be chosen, decide on the sampling technique that should be used—
systematic, stratified or simple random sampling.
• Ensure that any sample is randomly chosen, of a suitable size and free from bias.
• Determine whether the data should be collected by the use of a pen and paper
questionnaire, a face-to-face interview, a tape recorder, a school Intranet survey
or observation.
• Should published data be used? If so, ensure that it is reliable and not out of date.
3 Organise and display the data.
• Look for any errors that may have occurred during the collection process.
• Decide on the most appropriate ways to display the data. This may involve tables,
graphs or scatter diagrams.
• Ensure that all tables and graphs are clearly labelled.
4 Analyse the data.
• Consider the effect of any outliers.
• Use measures of location (mean, median, mode) and the range, where appropriate,
to analyse the data.
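The three sampling techniques named in step 2 can be sketched in Python, assuming that each member of the population has been given a number starting from 1. The function names and all of the numbers used below are made up for this illustration.

```python
import random


def simple_random_sample(population_size, sample_size, seed=None):
    """Choose `sample_size` different numbers at random from 1..population_size."""
    rng = random.Random(seed)
    return rng.sample(range(1, population_size + 1), sample_size)


def systematic_sample(population_size, interval, start):
    """Choose every `interval`-th member, beginning with number `start`."""
    return list(range(start, population_size + 1, interval))


def stratified_allocation(strata, sample_size):
    """Share the sample among the groups in proportion to their sizes.

    `strata` maps each group name to its size. Results are rounded to
    whole people, so in awkward cases the total may need adjusting by hand.
    """
    total = sum(strata.values())
    return {name: round(sample_size * size / total)
            for name, size in strata.items()}


print(systematic_sample(40, 8, 3))                                # [3, 11, 19, 27, 35]
print(stratified_allocation({"juniors": 300, "seniors": 200}, 50))  # {'juniors': 30, 'seniors': 20}
print(simple_random_sample(40, 5))                                # five different numbers from 1 to 40
```

The simple random sample will usually be different each time the sketch is run, unless a `seed` is supplied.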
In this exercise we will use certain terms to describe the likelihood of a particular event
occurring. In the following exercise, numbers will be used in order to obtain a more precise
description of a probability.
Example 1
Describe the likelihood of each of the following events occurring.
a scoring 101% on a Maths test
b tossing a head with a coin
c rolling a number from 1 to 6 on a die
d a music store will stock a Beatles CD
e you will eventually become Prime Minister
f at least half of your class will eventually get married
g a number greater than 4 is rolled on a die
Solutions
a Impossible
b Even chance
c Certain
d Very likely
e Very unlikely
f Likely
g Unlikely
Example 2
Comment on the following statements.
a The first three children born in a family were boys. The next child is sure to be a girl.
b Only two people can play each other in a game of chess. Therefore, it is equally likely that
each person could win the game.
Solutions
a The statement is not correct. The probability that the child is male or female is not affected
by the sex of previous children. Therefore the next child born is just as likely to be a boy
as a girl.
b The statement is not correct. Chess is essentially a game of skill, not a game of luck. The
ability of both players must be taken into account when considering the likelihood that
either player will win.
Exercise 12.11
1 Describe the likelihood of each event occurring as either impossible (I), unlikely (U),
likely (L) or certain (C).
a Christmas Day will fall on December 25 this year.
b It will rain somewhere in Australia tomorrow.
c The sun will still be visible at midnight tonight in Sydney.
d An Australian golfer will win all 4 major championships this year.
e Every student in your class will score 100% on the next Maths test.
f A second moon of the same size will be discovered orbiting Earth.
g You will eventually buy your own house or home unit.
h A card is chosen from a regular pack of 52 playing cards and it is either red or black.
i The numbers 1 to 10 are written on separate pieces of paper, placed in a bag and the
number 11 is drawn out.
■ Consolidation
3 Describe the chance of each of the following events occurring as either impossible (I), very
unlikely (VU), unlikely (U), even chance (E), likely (L), very likely (VL) or certain (C).
a It will rain on the moon today.
b At least 5 students will be absent from your school on the next school day.
c Every student in your class likes Brussels sprouts.
d A Maths exam will be held in NSW this year.
e A coin is tossed and the result is a tail.
f It will rain in Sydney some time during the next month.
g Someone will win first prize in Lotto next week.
h An odd number is showing when a die is rolled.
i A single card drawn from a pack is a King.
j At least one person you know has a pet.
k A woman is selected at random from a group of 20 men and 20 women.
l A person in your class is selected at random and they have a brother or sister.
m Your post code contains 4 digits.
n You have a computer in your home.
o An animal other than a horse will win the Melbourne Cup this year.
4 A bag contains 10 red balls and 10 white balls. Describe the probability of drawing:
a a blue ball b a red ball c a ball that is either red or white
5 20 tickets were sold for each spin of the chocolate wheel at a school fete. Use the terms in
Q3 to describe the winning chances of a person who has purchased the following number
of tickets.
a 3 b 16 c 7 d 20 e 10 f 13 g 0
6 Would you expect that the most likely outcome of an event would always occur? Explain
your answer.
7 Arrange the following events in order of probability, from the most likely to the least likely.
(Write A, B, C, . . .).
A the school principal will enter the classroom in the next minute
B the number 1, 2 or 3 will be spun on a spinner with the numbers 1 to 5 on it
C most students in your class are wearing a watch
D the next child born in Australia will be a boy
E your Maths teacher will not set homework for the next two school days
F in NSW, the sun will rise in the east tomorrow morning
G a green queen is chosen from a regular pack of 52 playing cards
■ Further applications
8 Explain the meaning of the following statements using the language of probability.
a Lightning never strikes twice.
b A particular greyhound is ‘odds on’ to win a race.
c There is a 50–50 chance that my injury will heal by the end of the week.
d I have Buckley’s chance of finishing my English assignment by Friday.
e There is a better than even chance that a patient will recover completely.
f A particular horse was a certainty beaten.
g My Maths teacher is only away once in a blue moon.
■ Experimental probability
In experimental probability, an experiment or a series of trials is conducted. The number of
times that each outcome occurs is recorded. These values are then divided by the total number
of trials to give the experimental probability for each outcome.
For example, a coin is tossed 100 times and shows heads on 53 occasions and tails on the other
47 occasions.
The experimental probabilities in this case are: P(head) = $\dfrac{53}{100}$ and P(tail) = $\dfrac{47}{100}$.
These experimental probabilities may well be different if the coin was tossed another 100 times.
There is an element of chance in such experiments.
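A coin-tossing experiment like this one can be simulated on a computer. The Python sketch below is only an illustration: the function name `toss_coin` is made up, and the computer's random number generator stands in for a real coin.

```python
import random
from fractions import Fraction


def toss_coin(trials, seed=None):
    """Simulate tossing a fair coin `trials` times.

    Returns the number of heads and the number of tails.
    """
    rng = random.Random(seed)
    heads = sum(rng.choice("HT") == "H" for _ in range(trials))
    return heads, trials - heads


trials = 100
heads, tails = toss_coin(trials)

# Experimental probability = number of times the outcome occurred / number of trials
p_head = Fraction(heads, trials)
p_tail = Fraction(tails, trials)

print("P(head) =", p_head)
print("P(tail) =", p_tail)
```

Running the sketch several times will usually give slightly different probabilities each time, which shows the element of chance in the experiment. The fractions are printed in simplest form.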
■ Theoretical probability
In theoretical probability, no trials are carried out. Instead, we theorise or think about the
probabilities that we would expect to find if such experiments were performed. For the same
example we would expect that the coin would show heads on exactly 50 occasions and tails on
the other 50 occasions. The theoretical probabilities in this case are:
P(head) = $\dfrac{50}{100} = \dfrac{1}{2}$ and P(tail) = $\dfrac{50}{100} = \dfrac{1}{2}$
If all outcomes are equally likely, then the theoretical probability that an event E
occurs is given by:
$P(E) = \dfrac{\text{number of outcomes favourable to } E}{\text{total number of possible outcomes}}$
Example 1
The numbers from 1 to 15 are written on separate pieces of paper, folded and placed in a bag.
One number is then drawn out at random. What is the probability that this number is:
a the number 13? b odd? c less than 12? d prime?
Solutions
The sample space is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15. It consists of 15 numbers
altogether.
a There is only one number 13. Therefore, P(13) = $\frac{1}{15}$.
b There are 8 odd numbers. Therefore, P(odd number) = $\frac{8}{15}$.
c There are 11 numbers less than 12. Therefore, P(number less than 12) = $\frac{11}{15}$.
d A prime number has exactly 2 factors: itself and 1.
There are 6 prime numbers from 1 to 15. They are 2, 3, 5, 7, 11 and 13.
Therefore, P(prime number) = $\frac{6}{15} = \frac{2}{5}$.
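The counting in Example 1 can be checked with a short program. This is only a sketch: the helper names `probability` and `is_prime` are invented for the illustration, and the approach assumes, as the formula does, that every number in the bag is equally likely to be drawn.

```python
from fractions import Fraction

def probability(sample_space, is_favourable):
    """Favourable outcomes divided by total outcomes (equally likely outcomes assumed)."""
    favourable = [x for x in sample_space if is_favourable(x)]
    return Fraction(len(favourable), len(sample_space))

def is_prime(n):
    """A prime number has exactly two factors: itself and 1."""
    return n > 1 and all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))

sample_space = list(range(1, 16))  # the numbers 1 to 15

p_13 = probability(sample_space, lambda n: n == 13)
p_odd = probability(sample_space, lambda n: n % 2 == 1)
p_less_than_12 = probability(sample_space, lambda n: n < 12)
p_prime = probability(sample_space, is_prime)

print(p_13, p_odd, p_less_than_12, p_prime)  # 1/15 8/15 11/15 2/5
```

`Fraction` simplifies automatically, so the prime case is reported as 2/5 rather than 6/15.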
Example 2
A regular pack of 52 playing cards is shuffled and one card is drawn at random. Find the
probability that the card is:
a the 5 of hearts b black c a club
d a picture card e a red ace f a blue jack
Solutions
a There is only one 5 of hearts.
∴ P(5 of hearts) = $\frac{1}{52}$
b There are 26 black cards.
∴ P(black card) = $\frac{26}{52} = \frac{1}{2}$
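All six parts of Example 2 can be explored by building the pack in a program and counting. This is a sketch under stated assumptions: a 'picture card' is taken to be a jack, queen or king, and the names `RANKS`, `SUITS`, `deck` and `probability` are invented for the illustration. Worked solutions for parts c to f are not shown above, so the values for those parts come from the counting in the code, not from the text.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = {"hearts": "red", "diamonds": "red", "clubs": "black", "spades": "black"}

# Every (rank, suit) pair is one card, giving 13 x 4 = 52 cards.
deck = list(product(RANKS, SUITS))

def probability(is_favourable):
    """Favourable cards divided by total cards (each card equally likely)."""
    return Fraction(sum(1 for card in deck if is_favourable(card)), len(deck))

p_five_of_hearts = probability(lambda c: c == ("5", "hearts"))
p_black = probability(lambda c: SUITS[c[1]] == "black")
p_club = probability(lambda c: c[1] == "clubs")
p_picture = probability(lambda c: c[0] in {"J", "Q", "K"})  # assumption: J, Q, K only
p_red_ace = probability(lambda c: c[0] == "A" and SUITS[c[1]] == "red")
p_blue_jack = probability(lambda c: c[0] == "J" and SUITS[c[1]] == "blue")

print(p_five_of_hearts, p_black, p_club, p_picture, p_red_ace, p_blue_jack)
# 1/52 1/2 1/4 3/13 1/26 0
```

The last value is 0 because a regular pack has no blue cards, so drawing a blue jack is an impossible event.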
Exercise 12.12
3 A die is rolled. Find the probability that the number showing is:
a a 6 b odd c a 2 or a 5
d less than 5 e more than 4 f greater than 6
4 A spinner for a child’s game has the numbers 1 to 9 written in sections of equal area.
Find the probability of spinning:
a the number 4 b an even number c an odd number
d a number less than 7 e a number not less than 3 f a two-digit number
5 A bag contains 5 red discs, 6 yellow discs and 4 green discs. One disc is drawn at random
from the bag. Find the probability that it is:
a red b green c yellow
d red or yellow e yellow or green f green or red
■ Consolidation
6 The letters of the word EQUILATERAL are placed in a bag and one letter is drawn out at
random. Find the probability of drawing:
a the letter T b the letter E c the letters A, Q or L
d a vowel e a consonant f a letter before R in the alphabet
7 Find, as a decimal, the probability that I will win first prize in a lottery in which 500 tickets
have been sold and I bought:
a 10 tickets b 25 tickets c 100 tickets d 120 tickets
8 In the last few days, a set of traffic lights at a particular intersection showed red on 130
occasions, green on 90 occasions and amber on 30 occasions. Using this data, find as a
percentage, the probability that the next car to reach this intersection will have:
a a red light b an amber light c a green light
10 The 30 students in 8 Red were surveyed to find out which languages they could speak other
than English. The results were:
• Italian, 9 • Mandarin, 5 • Arabic, 12 • Korean, 4
No student was able to speak more than one of these languages. Find the probability that a
student chosen at random is able to speak:
a Arabic b Korean c Mandarin d Italian
e Italian or Mandarin f Korean or Arabic g Italian or Korean or Mandarin
11 Discs labelled with the numbers 1 to 40 are placed in a bag and one disc is drawn out at
random. Find the probability that the disc drawn shows:
a the number 23 b a number less than 20
c a number that ends in 9 d a number with two equal digits
e a square number f a number with a 2 in the tens place
g a single-digit number h a prime number
i a number that is divisible by 7 j a number between but not including 10 and 20
k a number that contains the digit 1 l a number that ends in a 6 and is a multiple of 3
12 A regular pack of 52 playing cards is shuffled and one card is then drawn at random. What
is the probability that the card is:
a red? b a diamond? c an ace?
d the 4 of clubs? e a picture card? f a number card (2–10)?
g a black queen? h a red picture card? i a red 5 or a black 9?
j either red or black? k both red and black? l a black number card?
13 In a box there are three times as many gold counters as silver counters. If one counter is
drawn from the box, what is the probability that it is:
a gold? b silver?
14 The probability that a computer chip is faulty is found to be $\frac{3}{40}$. How many faulty chips
would you expect to find in a batch of 600 chips?
15 Based on experience, a dentist finds that the probability of a child needing a filling is $\frac{2}{7}$.
Last month 42 children visited the surgery. How many fillings would the dentist have
expected to do?
16 A die is rolled and a coin is tossed at the same time.
a List the sample space.
b What is the probability of:
i rolling a 4 and tossing a head?
ii rolling a prime number and tossing a tail?
iii rolling a 1 or 2 and tossing a tail?
iv rolling a number less than 5 and tossing a head?
17 Two-digit numbers are to be formed using the digits 3, 5, 6 and 9. Each digit can only be
used once.
a List the sample space.
b If one number is selected at random, find the probability that the number:
i is 56 ii begins with a 9 iii contains the digit 3
iv is less than 69 v is greater than 53 vi is odd
vii is prime viii is a multiple of 9 ix is not divisible by 5
18 Two-digit numbers are to be formed using the digits 0, 1, 2, 3, 4 and 5. Each digit can be
used more than once.
a List the sample space.
b If one number is selected at random, find the probability that the number:
i is 41 ii is even iii ends in a 1
iv has two equal digits v is a multiple of 5 vi is a square number
vii contains the digit 2 viii has a digit sum of 7 ix is not a prime
19 Two students from the group Richard, Harrison, Joel, Merhan and Greig are to be chosen
to represent their year group on the Student Representative Council.
a List the sample space.
b Find the probability that:
i Joel is chosen
ii Merhan is not chosen
iii Richard and Harrison are both chosen
iv Merhan is chosen but Greig is not
v at least one of Harrison and Joel is chosen
vi neither Joel nor Greig is chosen
21 A game is considered to be fair if all participants have an equal chance of winning. Decide
whether each of the following games is fair.
a Lily and Marlene toss two coins. Lily wins if the coins show the same on both faces.
Marlene wins if the coins show one of each face.
b A bag contains 99 counters individually numbered from 1 to 99. Gary and Richard take
turns to draw out counters, one at a time. Gary collects odd-numbered counters and
Richard collects even-numbered counters. The winner is the first player to collect
10 counters.
c On a roulette wheel, the numbers 0 to 36 are spaced equally around the edge. The
number 0 is coloured green. Of the remaining numbers, half are coloured red and the
others are coloured black. If the wheel is spun and the ball lands on 0, the house
wins (unless a player has placed a bet on 0). If a player correctly bets that the ball will
come to rest on a red or a black number, then they will receive double their money back.
Maurice places a $20 bet that the ball will land on a red number.
d Lloyd and Ben are playing a game with two dice. Lloyd wins if a double is rolled by
either player. Ben wins if the sum of the numbers showing on the dice is 7.
e A bag contains only red, white and blue marbles. Half the marbles in the bag are red,
one-third are white and one-sixth are blue. Monique and Rochelle take turns to draw
one marble at a time from the bag. Monique wins if the marble drawn is red. Rochelle
wins if the marble is either white or blue.
f Craig and Melissa are playing a game with a regular pack of 52 playing cards. The pack
is divided into 4 suits with an equal number of cards: hearts, diamonds, spades and
clubs. There are 12 face cards in a pack: 4 kings, 4 queens and 4 jacks. The cards are
shuffled and placed face down on a table. One card at a time is then turned over and
placed face up. If this card is a heart, Melissa wins. If the card is a face card, Craig wins.
■ Further applications
22 A 5 cent, a 10 cent and a 20 cent coin are tossed in the air at the same time.
a List the sample space.
b What is the probability that the coins show:
i 3 heads? ii 2 heads and a tail? iii 1 head and 2 tails? iv 3 tails?
23 A card is chosen at random from a regular pack of 52 playing cards. Find the probability
that the card is:
a black b a jack c black or a jack
d a 7 e a heart f a 7 or a heart
g a diamond h a picture card i a diamond or a picture card
event.
If the probability that an event E occurs is P(E), then the probability that the event
does not occur is P(Ẽ), where:
P(E) + P(Ẽ) = 1 or P(Ẽ) = 1 − P(E).
Example 1
State the complement of each event.
a tossing a head with a coin
b spinning a number that is greater than 10 on a wheel
c choosing a heart or a club from a pack of cards
d not winning a game of hockey
Solutions
a tossing a tail with a coin
b spinning a number that is less than or equal to 10 on a wheel
c choosing a diamond or a spade from a pack of cards
d winning a game of hockey
Example 2
A bag contains 9 marbles, 5 of which are red. Let A represent the event ‘drawing a red marble’.
If one marble is drawn at random from the bag, find:
a P(A) b P(Ã) c P(A) + P(Ã)
Solutions
a P(A) = P(red marble) = $\frac{5}{9}$
b P(Ã) = P(not a red marble) = 1 − P(A) = $1 - \frac{5}{9} = \frac{4}{9}$
c P(A) + P(Ã) = $\frac{5}{9} + \frac{4}{9} = 1$
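The complement rule used in Example 2 is a single subtraction, which the following sketch makes explicit. The function name `complement` is invented for this illustration; the numbers are those of the marble example.

```python
from fractions import Fraction

def complement(p_event):
    """Probability that an event does NOT occur, given the probability that it does."""
    return 1 - p_event

p_red = Fraction(5, 9)           # P(A): 5 red marbles out of 9
p_not_red = complement(p_red)    # P(not A) = 1 - P(A)

print(p_not_red)           # 4/9
print(p_red + p_not_red)   # 1
```

An event and its complement between them cover every possible outcome, which is why the two probabilities always add to 1.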
Exercise 12.13
4 A bag contains 10 counters, of which 7 are black. Let E represent the event ‘drawing a black
counter’. If one counter is drawn at random from the bag, find:
a P(E) b P(Ẽ) c P(E) + P(Ẽ)
5 The events E and F are complementary events. What is the value of P(E) + P(F)?
6 What can you say about the events A and B if P(A) + P(B) = 1?
■ Consolidation
7 A barrel contains 7 blue discs, 4 orange discs and 9 purple discs. Find, as a decimal, the
probability that a disc drawn at random from the barrel is:
a purple b blue c orange
d not purple e not blue f not orange
8 If a bag contains only 5 red marbles, show that the probability of choosing:
a a red marble is 1 b a marble that is not red is 0
9 The letters in the word TRIANGLE are placed in a bag. A letter is then drawn at random.
Find the probability of:
a drawing the letter G b drawing the letter T or E
c drawing a vowel d not drawing the letter G
e not drawing the letter T or E f not drawing a vowel
10 The letters of the word PROBABILITY are placed in a bag. A letter is then drawn at
random. Find the probability of:
a i drawing the letter R, I or B ii not drawing the letter R, I or B
b i drawing a consonant ii not drawing a consonant
c i drawing a letter before S in the ii not drawing a letter before S in the
alphabet alphabet
11 The numbers from 1 to 25 are written on individual pieces of paper and placed in a box.
One piece of paper is then drawn out at random. Find the probability that the number is:
a 23 b not 23 c even
d not even e prime f not prime
g less than 12 h not less than 12 i divisible by 4
j not divisible by 4 k a 2-digit number l not a 2-digit number
12 The traffic lights at a certain intersection show red 45% of the time, amber 15% of the time
and green the rest of the time. If I drive through this intersection, what is the probability that
the lights will be:
a red? b not green? c green or amber? d neither red nor green?
13 In a box of chocolates, there are twice as many chocolates with soft centres as there are
with hard centres. One chocolate is selected at random. What is the probability that the
chocolate is:
a soft-centred? b hard-centred? c not soft-centred? d not hard-centred?
14 The ratio of pink to brown to orange balls in a barrel is 6 : 9 : 5. If one ball is drawn at
random, find the probability that it will be:
a brown b orange c pink
d orange or brown e not pink f neither orange nor pink
15 A card is drawn at random from a regular pack of playing cards. Find the probability that
this card is:
a the 5 of spades b not the 5 of spades c black
d not black e a diamond f not a diamond
g an ace h not an ace i a face card
j not a face card k a number card l not a number card
■ Further applications
A Venn diagram consists of a number of intersecting or non-intersecting circles. It may be
used to help answer probability problems such as the one below. This Venn diagram shows
the number of students in Year 8 who travel to school by bus or train or both. In this case,
25 students travel by bus, 30 students travel by train, 10 students travel by both bus and
train and 45 students use neither form of transport.
[Venn diagram: Bus only 15, both Bus and Train 10, Train only 20, neither 45]
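The numbers written in each region of a two-circle Venn diagram can be worked out from the totals. The sketch below uses the bus and train figures above; the function name `venn_regions` and the dictionary keys are invented for the illustration, and the total of 90 students is calculated from the four regions, not stated in the text.

```python
from fractions import Fraction

def venn_regions(total_a, total_b, both, neither):
    """Split the totals for two overlapping groups into the four Venn diagram regions."""
    only_a = total_a - both   # in the first circle but not in the overlap
    only_b = total_b - both   # in the second circle but not in the overlap
    population = only_a + both + only_b + neither
    return {"only_a": only_a, "both": both, "only_b": only_b,
            "neither": neither, "population": population}

regions = venn_regions(total_a=25, total_b=30, both=10, neither=45)
print(regions)

# Probability that a student chosen at random travels by bus (with or without the train)
p_bus = Fraction(regions["only_a"] + regions["both"], regions["population"])
print(p_bus)  # 5/18
```

Subtracting the overlap before adding the regions stops the 10 students who use both forms of transport from being counted twice.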
16 In a class of 30 Year 8 students, 18 students study French, 15 students study Japanese and
4 students study neither language.
a Draw a Venn diagram to illustrate this data.
b How many students study both languages?
c A student from this class is selected at random. Find the probability that he/she:
i speaks French ii speaks Japanese
iii speaks both languages iv speaks neither language
v speaks French but not Japanese vi speaks Japanese but not French
vii does not speak French viii does not speak Japanese
ix does not speak both languages x speaks either French or Japanese or both
Introduction
Each year 18 000 Australians—about 50 people per day—die prematurely from smoking.
Because it has such a devastating effect on health and consequently on economic and social
well-being, the advertising of cigarettes is forbidden and manufacturers must place a health
warning on the packet. Smoking has been banned by law in many public places and on public
transport, for example. About 140 people die each year from inhaling other people’s smoke.
Tobacco is an addictive substance. Smokers who also use marijuana, heroin, amphetamines or
other drugs rate tobacco as more addictive. Surveys have shown that up to 80% of smokers
would like to quit. In Australia more than 2.9 million people have succeeded in doing so, but
many others never do. In fact a quarter of smokers believe that smoking is not harmful. The
tobacco industry spends more than $70 million on cigarette advertising and promotion each
year. Much of this is directed at school children. The following data will help you to see that
many adults begin smoking at secondary school.
Focus question
What proportion of NSW secondary-school students aged 12–17 years smoke? Is there a
difference in smoking rates between boys and girls?
LEARNING ACTIVITIES
Teacher note: These activities are most suitable for collaborative group work. The focus
question and the data will stimulate a lot of discussion about the statistical evidence used by
governments to justify the prohibition of smoking in public places, and support the national
‘Quit for Life’ campaign. This can be followed up by exploring smoking and health data on the
web. The groups might discuss carrying out an anonymous class survey to get data of their own.
This survey data shows the percentage of NSW secondary school children who reported smoking
in the week prior to the survey in 1999. The sample size was 7475 students.

Percentage of NSW school children who smoked in the last week: 1999 Survey Data
Age (years)   Boys (%)   Girls (%)
12            5          5
13            11         9
14            16         19
15            21         23
16            28         30
17            27         30

1 Choose a suitable way to represent the data on a graph to show the trends as age increases.
Plot the data for boys and girls on the same graph to enable comparison.
2 Compare the data for boys and girls. What do you notice?
3 Between what ages does the biggest increase in smoking occur for boys? for girls? Can you
give any reason for this?
4 Which of the following statistics do you feel are useful for summarising the data:
mode, mean, median, range? Can you say why?
5 Make an estimate of the answer to the focus question above. Write down any further
information you would have liked to make this estimate.
EXTENSION ACTIVITIES
The Statistical Bulletin from the NSW Health Department from which the above data was
taken, concludes:
1 ‘Overall, 18% of males and 19% of females reported smoking in the past week.’
Does this confirm your estimate of the answer to the focus question?
2 ‘Based on the survey results, it is estimated that in 1999 approximately 85 000 NSW
secondary-school students aged 12–17 years had smoked in the last week (41 800 males
and 43 200 females) of whom approximately 30 100 were aged 12–14 years.’
How were these estimates of the total number of students in NSW aged 12–17 years
obtained when the sample size was only 7475 students? Discuss this with your teacher.
Can you make any other conclusions?
3 In Australia, $6763 million or 47% of the total economic cost of drug abuse is attributable
to tobacco. This includes $609.6 million in direct healthcare costs and $6028.3 million in
indirect mortality costs. See if you can find out more details of the economic burden that
smoking places on the nation.
4 Australia has about 5.3 million smokers. They smoke an average of 18 cigarettes per day.
If a packet contains 25 cigarettes and costs $9.40, estimate the money spent on cigarettes in
Australia each day. Each year? How much is recouped by the government in tax? See if you
can find out from the Internet. This revenue is used to offset the enormous health and death
costs involved.
LET’S COMMUNICATE
Discuss in class what you have learned about the number of students who begin to smoke at
secondary school in NSW. Check out the Internet for further information about this serious
health risk for students.
REFLECTING
You have probably heard of the term ‘the information age’. Think over the importance of
gathering statistics to provide reliable information for the welfare of the community. Other than
the health effects of smoking, can you think of important areas in which governments have
sought information to make important policy decisions?
PROBLEM SOLVING
1 How many small cubes are in this solid?
[Figure: a solid built from small cubes]
2 Complete this magic square. What is the sum of each row and diagonal?
16   3   2   _
 _  10  11   8
 9   _   7   _
 _   _   _   1
3 Can you find the next two elements in this sequence?
$a, \frac{a+b}{2}, \frac{a+b+c}{4}, \frac{a+b+c+d}{8}, \ldots, \ldots$
4 If two dogs ate 6 cans of pet food in two days, how many cans of pet food would
3 dogs eat in 4 days, at this rate?
5 What number is less than 100, has two factors, its digits add up to 11, and the
difference between the digits is 3?
6 How far can a wombat travel into a forest?
7 In this problem you will need 4 black checkers
(draughtsmen) and five white ones. (Or four 5c coins
and five 10c coins.) The aim is to swap the positions of
black and white checkers in the least number of moves.
You may move a piece into an empty space next to it.
10 The magic square in Q2 has other hidden secrets. Total the corners! Total some of the
2 × 2 squares contained in it.
1 Numerical information is also called d________.
2 Explain the difference between experimental and theoretical probability.
3 A survey of a population is called a c________.
4 Define mean for a new Maths Dictionary.
5 Read the Macquarie Learners’ Dictionary entry for frequency:
frequency noun (plural frequencies) 1. (uncount) the fact of happening often: He was
annoyed by the frequency of her visits. 2. the rate at which something happens: the
frequency of a pulse. 3. (uncount) Specialised the rate of cycles or vibrations of a wave
movement: His radio only picks up stations on a high frequency.
How is the mathematical use of this word unique?
CHAPTER REVIEW
1 State whether a census or sample should be used to gather information about the number of:
a minties in the average packet
b patients that have been cured by a new drug
c empty fire extinguishers in one particular building
d colour-blind people in Australia
2 Explain why each of these survey methods may give a biased result.
a a talk-back radio telephone poll
b a door-to-door interview conducted between 9:00 am and 2:00 pm on Monday
c a postal questionnaire
3 Who would you survey if you wanted to gather support for:
a a late-night curfew on flights from an airport?
b an end to the logging of trees in old-growth forests?
4 Comment on the bias in each of these statements.
a Dentists recommend ‘Bright and White’ toothpaste.
b 5 out of 6 parents recommend ‘Dry Baby’ disposable nappies.
c ‘Supa White’ gets whites 30% whiter and brighter.
5 Explain why the following are poor survey questions.
a ‘Do you own a car?’ in a survey about favourite pizza toppings.
b ‘Should stupid taxi drivers who don’t know their way around the city lose their taxi licence?’
c ‘How enjoyable was the movie?’ 1 2 3 4 5
6 How could you use the random number function on a calculator to choose 8 people from a
list of 100 names so that each person has an equal chance of being chosen?
7 How could 5 people be chosen at random from a group of 50 people by using systematic
random sampling?
8 A group of 18 students is to be chosen from a group of 40 boys and 50 girls by using
stratified random sampling. How many boys and girls should be chosen?
9 a In a batch of 1500 calculators, every 20th calculator is to be checked for faults. How many
calculators need to be checked?
b If 1800 pens are produced each day by a manufacturer, find at what intervals the pens
should be checked in a systematic random sample of 40 pens.
10 For each set of scores, find the:
i mean ii median iii mode iv range
a 21, 9, 30, 25, 9, 22, 10
b 56, 53, 85, 67, 72, 56, 61, 72
11 In which position is the median in a set of:
a 17 scores? b 83 scores?
12 Find the positions of the two middle scores in a set of:
a 28 scores b 112 scores
13 How many scores are there in a distribution if the median lies:
a in the 43rd position?
b between the 29th and 30th scores?
14 Find the mean, median, mode and range of each set of scores. Answer correct to
1 decimal place where necessary.
a Score       1   2   3   4   5
  Frequency  11   9   7   4   6
b Score      23  24  25  26  27
  Frequency   4   6   7  15  22
15 The range of a set of scores is 23. Find:
a the highest score if the lowest score is 18
b the lowest score if the highest score is 67
16 Find the mean, median, mode and range of the scores shown in this histogram.
[Histogram: Frequency (0 to 6) against Score (9 to 14)]
17 a Draw a plot to represent the scores below.
5 7 6 5 4
6 5 7 4 1
5 6 5 7 7
8 4 7 5 6
b Which score(s), if any, are outliers?
c Which score is the mode?
d Find the mean, correct to 1 decimal place.
18 A group of 40 history students obtained the following results on a 12-question quiz.
7 12 8 10 11 9 8 9
10 12 12 11 10 9 9 8
8 7 11 8 8 11 7 8
12 8 10 9 10 8 9 7
9 9 8 10 9 6 7 10
a Organise the data into a frequency distribution table with score (x), tally,
frequency (f) and fx columns.
b Find Σf and Σfx and hence calculate the mean, correct to 1 decimal place.
c Which score is the mode?
d Find the range.
e Which score occurred 5 times?
f What percentage of the scores are 7s?
g Draw a frequency histogram and polygon for this data on the same set of axes.
19 a The sum of 15 scores is 120. Find the mean of the scores.
b A set of 22 scores has a mean of 7.5. Find the sum of the scores.
c A set of scores has a mean of 14 and a sum of 238. Find the number of scores.
20 Brent scored 14, 13, 16 and 13 out of 20 for his first four English essays. What mark does
he need to get on the next essay to raise his average to 15 out of 20?
21 This stem-and-leaf plot shows the results obtained by a group of Year 8 students on their
statistics test, which was marked out of 40.
Stem   Leaf
1(5)   7 9 9
2(0)   0 1 1 2 3
2(5)   5 6 8 8 8 9 9
3(0)   1 2 2 2 3 4
3(5)   6 6 7 9
a How many students sat for the test?
b Calculate the mean, correct to 2 decimal places.
c Find the median.
d Write down the mode.
e Find the range.
f How many students scored less than 50%?
22 The side-by-side column graph shows the number of goals scored each season by two
hockey teams from 2000 to 2004.
[Column graph: Hockey goals scored. Vertical axis: Goals scored (0 to 40). Horizontal axis: Year (2000 to 2004). Teams: Robins, Wrens]
a Which team scored more goals in 2003 and by what margin?
b In which year did the teams score the same number of goals?
c In which year did one team score the greatest number of goals? How many?
d Calculate the goal average for each team over this 5-year period. Which team
performed better?
23 The data below shows the number of patients treated in the emergency wards of two major
hospitals over a period of 20 days.
St George
32 40 34 41 56
42 33 37 38 49
85 29 43 49 51
53 45 46 45 42
Nepean
37 35 28 24 43
51 46 52 45 44
33 35 29 36 38
49 62 26 58 47
a Draw a back-to-back stem-and-leaf plot for this data.
b Find the range of the number of patients treated at each hospital.
c A serious bus accident occurred near one of these hospitals. Which hospital?
d Which hospital treated the greater number of patients each day on average?