8 Quality of tests
Objectives
After completing this chapter you should be able to:
● Know about Type I and Type II errors → pages 147–153
Example 1
One rainy day during the summer holidays, a family of four were playing a simple game of cards.
The game was one of chance so the probability of any particular person winning should have
been $\tfrac{1}{4}$. After playing a number of games, Robert complained that his younger sister Sarah must
have been cheating as she kept winning. Their parents quickly intervened and decided to carry out
a proper investigation and carefully watched the next 20 games.
Find the critical region for a one-tailed test using a 5% level of significance.
$H_0: p = \tfrac{1}{4}$  $H_1: p > \tfrac{1}{4}$
(If Sarah is cheating then you would expect the proportion of games she wins to be more than $\tfrac{1}{4}$.)
Let X = the number of games Sarah wins out of the next 20.
So $X \sim B(20, \tfrac{1}{4})$. (State the distribution of the statistic assuming H0 is true.)
Reject H0 if $X \ge c$, where $P(X \ge c) < 0.05$. (Use tables to find the smallest value of c with $P(X \ge c) < 0.05$.)
From tables: $P(X \le 8) = 0.9591$, so $P(X \ge 9) = 0.0409$.
The critical region is $X \ge 9$.
In the example above, if Sarah wins 9 or more games, her parents will reject the null hypothesis and
conclude that $p > \tfrac{1}{4}$ (in other words, that Sarah was cheating). It is possible that this conclusion
will be incorrect. If $p = \tfrac{1}{4}$, Sarah might still win 9 or more games by chance. The probability of this
occurring is 0.0409, which is the actual significance level of the test. Rejecting H0 in this situation is called a Type I error.
■ A Type I error is when you reject H0, but H0 is in fact true. The probability of a Type I error is
the same as the actual significance level of the hypothesis test.
It is also possible that Sarah was cheating, but that she still only wins 8 or fewer games. In this case
her parents would accept the null hypothesis, and conclude incorrectly that $p = \tfrac{1}{4}$. This is called a
Type II error.
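The calculation behind Example 1 can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not part of the original text. The function names are hypothetical, and the alternative value p = 0.5 used for the Type II error is an assumption chosen purely for illustration, since the example does not state how often Sarah would win if she were cheating.

```python
from math import comb


def binom_upper_tail(c, n, p):
    """Return P(X >= c) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c, n + 1))


def upper_critical_value(n, p0, alpha):
    """Return the smallest c with P(X >= c | p = p0) < alpha."""
    c = 0
    while binom_upper_tail(c, n, p0) >= alpha:
        c += 1
    return c


n, p0, alpha = 20, 0.25, 0.05
c = upper_critical_value(n, p0, alpha)         # critical region is X >= c
p_type_1 = binom_upper_tail(c, n, p0)          # H0 rejected although p = 1/4

# Hypothetical alternative (an assumption, not from the example): p = 0.5.
p_true = 0.5
p_type_2 = 1 - binom_upper_tail(c, n, p_true)  # H0 accepted although p = 0.5

print(c, round(p_type_1, 4), round(p_type_2, 4))
```

The critical value and P(Type I error) should agree with the values 9 and 0.0409 found from tables above. P(Type II error) depends entirely on the assumed true value of p.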
This table summarises the types of error that can occur in a hypothesis test:

                                  Truth
Conclusion of test     H0 is true       H0 is false
Accept H0              OK               Type II error
Reject H0              Type I error     OK
Example 2
Example 3
Accidents occurred on a stretch of motorway at an average rate of 6 per month. Many of the
accidents that occurred involved vehicles skidding into the back of other vehicles. By way of a trial,
a new type of road surface that is said to reduce the risk of vehicles skidding is laid on this stretch
of road, and during the first month of operation 4 accidents occurred.
a Test this result to see if it gives evidence that there has been an improvement at the 5% level of
significance.
b Calculate P(Type I error) for this test.
c If the true average rate of accidents occurring with the new type of road surface was 3.5,
calculate the probability of a Type II error.
a You are dealing with a Poisson distribution. (Part a is a hypothesis test for the mean of a Poisson distribution. ← Section 5.1)
Let λ = the average number of accidents in a month, and
X = the number of accidents in any given month, then the
hypotheses are
H0 : λ = 6 (i.e. no change)
H1 : λ < 6 (i.e. fewer accidents)
From tables $P(X \le 4 \mid \lambda = 6) = 0.2851$.
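The three parts of Example 3 can be checked numerically. The sketch below is an illustration only, using the Python standard library; the helper names are hypothetical. It assumes the usual convention that the critical region for a lower-tail test is the largest set X ≤ c whose probability under H0 is below the significance level.

```python
from math import exp, factorial


def poisson_cdf(k, lam):
    """Return P(X <= k) for X ~ Po(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))


def lower_critical_value(lam0, alpha):
    """Return the largest c with P(X <= c | lam0) < alpha, or None if there is none."""
    c = None
    k = 0
    while poisson_cdf(k, lam0) < alpha:
        c = k
        k += 1
    return c


lam0, alpha = 6, 0.05
p_value = poisson_cdf(4, lam0)          # part a: P(X <= 4 | lambda = 6)
c = lower_critical_value(lam0, alpha)   # critical region is X <= c
p_type_1 = poisson_cdf(c, lam0)         # part b: P(X <= c | lambda = 6)
p_type_2 = 1 - poisson_cdf(c, 3.5)      # part c: P(X > c | lambda = 3.5)

print(round(p_value, 4), c, round(p_type_1, 4), round(p_type_2, 4))
```

Because P(X ≤ 4 | λ = 6) is greater than 0.05, the observed value of 4 does not lie in the critical region. The large Type II probability reflects how little information a single month of data provides.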
You can also calculate the probabilities of errors from a two-tailed hypothesis test.
Example 4
A coin is spun 20 times and a head is obtained on 7 occasions.
a Test to see whether or not the coin is biased.
b Calculate the probability of a Type I error for this test.
c Given that the coin is biased and that this bias causes the tail to appear 3 times for each head
that appears, calculate the probability of a Type II error for the test.
a The hypotheses are
H0 : p = 0.5  H1 : p ≠ 0.5
(This is a test for the proportion of a binomial distribution, and since you are testing to see if the coin is biased in either direction, a two-tailed test has to be used.)
Let X = the number of heads in 20 spins of the coin.
Assuming H0 is true then X ∼ B(20, 0.5).
For a two-tailed test, at the 5% significance level, you require values $c_1$ and $c_2$ so that
$P(X \le c_1) < 0.025$ and $P(X \ge c_2) < 0.025$ (or $P(X \le c_2 - 1) > 0.975$).
(The critical region will be in two parts.)
From tables: $P(X \le 6) = 0.0577$ and $P(X \le 5) = 0.0207$, so the value of $c_1 = 5$.
Also: $P(X \ge 14) = 1 - P(X \le 13) = 1 - 0.9423 = 0.0577$
$P(X \ge 15) = 1 - P(X \le 14) = 1 - 0.9793 = 0.0207$
so the value of $c_2 = 15$.
(Alternatively, $P(X \le 13) = 0.9423$ and $P(X \le 14) = 0.9793$, so $c_2 - 1 = 14$ and $c_2 = 15$.)
Thus the critical region for X is $X \le 5$ or $X \ge 15$.
Problem-solving: Notice that since p = 0.5 the two tails are symmetrical about the mean of 10 and the value of $c_2$ could have been inferred from that of $c_1$ in this case.
As 7 lies between 5 and 15, it is not in the critical region, so there is insufficient evidence to reject H0. There is no evidence that the coin is biased.
b A Type I error occurs when you reject H0 but H0 is true, and this occurs when $X \le 5$ or $X \ge 15$.
(In this case there are two probabilities to be found and added.)
$P(\text{Type I error}) = P(X \le 5 \mid p = 0.5) + P(X \ge 15 \mid p = 0.5)$
$= 0.0207 + 0.0207$
$= 0.0414$
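For a two-tailed test the probability of rejecting H0 is the sum of two tail probabilities, and the same function gives both error probabilities. The sketch below is illustrative only (hypothetical function names, standard library only). For part c it assumes that "the tail appears 3 times for each head" means P(head) = 1/4.

```python
from math import comb


def binom_cdf(c, n, p):
    """Return P(X <= c) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))


def prob_reject(p, n=20, c1=5, c2=15):
    """Return P(X <= c1 or X >= c2) for X ~ B(n, p)."""
    return binom_cdf(c1, n, p) + (1 - binom_cdf(c2 - 1, n, p))


p_type_1 = prob_reject(0.5)        # part b: H0 true, X in the critical region
p_type_2 = 1 - prob_reject(0.25)   # part c: P(head) = 1/4, X outside the region

print(round(p_type_1, 4), round(p_type_2, 4))
```

The first value should reproduce the 0.0414 found in part b. The second is P(6 ≤ X ≤ 14) when X ∼ B(20, 0.25).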
Example 5
Jane knows from experience that 10% of the emails she receives are spam. After her email service
upgraded the spam filters, she recorded the number of emails sent up to and including the first spam
email. She wants to test, at the 5% significance level, whether this upgrade improved the spam filter.
a Find the critical region for her test.
b Calculate the probability of a Type I error for this test.
c Given that after the upgrade the probability of an email she receives being spam is now 1 in
100, calculate the probability of a Type II error for the test.
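One way to approach Example 5 is sketched below. It is an illustration under stated assumptions, not the book's worked solution: X, the number of emails up to and including the first spam email, is modelled as Geo(p); an improved filter means p < 0.1, so H0 is rejected for large values of X; and the function names are hypothetical.

```python
def geo_upper_tail(c, p):
    """Return P(X >= c) for X ~ Geo(p), where X counts trials up to and
    including the first success. X >= c means the first c - 1 trials fail."""
    return (1 - p) ** (c - 1)


def geo_upper_critical_value(p0, alpha):
    """Return the smallest c with P(X >= c | p = p0) < alpha."""
    c = 1
    while geo_upper_tail(c, p0) >= alpha:
        c += 1
    return c


p0, alpha = 0.1, 0.05
c = geo_upper_critical_value(p0, alpha)   # part a: critical region is X >= c
p_type_1 = geo_upper_tail(c, p0)          # part b
p_type_2 = 1 - geo_upper_tail(c, 0.01)    # part c: P(X < c | p = 0.01)

print(c, round(p_type_1, 4), round(p_type_2, 4))
```

Under these assumptions the critical region is the set of values of X at or above the printed critical value, and the Type II probability is the chance that the first spam email still arrives before that point when p = 0.01.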
Exercise 8A
1 The random variable X is binomially distributed. A sample of 10 is taken, and it is desired to
test H0 : p = 0.25 against H1 : p > 0.25, using a 5% level of significance.
a Calculate the critical region for this test.
b State the probability of a Type I error for this test and, given that the true value of p was
later found to be 0.30, calculate the probability of a Type II error.
4 The random variable X has a Poisson distribution. A sample is taken, and it is desired to test
H0 : λ = 6 against H1 : λ > 6, using a 5% level of significance.
a Find the critical region for this test.
b Calculate the probability of a Type I error and, given that the true value of λ was later found
to be 7, calculate the probability of a Type II error.
5 The random variable X has a Poisson distribution. A sample is taken, and it is desired to test
H0 : λ = 4.5 against H1 : λ < 4.5, using a 5% level of significance.
a Find the critical region for this test.
b Calculate the probability of a Type I error and, given that the true value of λ was later found
to be 3.5, calculate the probability of a Type II error.
6 The random variable X has a Poisson distribution. A sample is taken, and it is desired to test
H0 : λ = 9 against H1 : λ ≠ 9, using a 5% level of significance.
a Find the critical region for this test.
b Calculate the probability of a Type I error and, given that the true value of λ was later found
to be 8, calculate the probability of a Type II error.
7 The random variable X is geometrically distributed, and it is desired to test H0 : p = 0.2 against
H1 : p < 0.2, using a 5% level of significance.
a Calculate the critical region for this test.
b State the probability of a Type I error for this test and, given that the true probability was
found to be p = 0.05, calculate the probability of a Type II error.
8 The random variable X is geometrically distributed, and it is desired to test H0 : p = 0.02 against
H1 : p < 0.02, using a 1% level of significance.
a Calculate the critical region for this test.
b State the probability of a Type I error for this test and, given that the true probability was
found to be p = 0.01, calculate the probability of a Type II error.
9 The random variable X is geometrically distributed, and it is desired to test H0 : p = 0.01 against
H1 : p ≠ 0.01, using a 5% level of significance.
a Calculate the critical region for this test.
b State the probability of a Type I error for this test and, given that the true probability was
found to be p = 0.1, calculate the probability of a Type II error.
E/P 10 a Define:
i a Type I error (1 mark)
ii a Type II error. (1 mark)
The discrete random variable X ∼ Geo( p). You wish to test H0 : p = 0.004 against H1 : p ≠ 0.004,
using a 10% significance level. The probability in each tail should be as close to 0.05 as possible.
b Find the critical region for this test. (7 marks)
c State the probability of a Type I error occurring for this test. (1 mark)
E/P 11 Michael has bought a dice with 20 sides, and his friend David suspects that it is landing on 17
more often than it is landing on the other values. They both decide to test this in two different
ways, using a 5% significance level. Michael throws the dice 40 times and records the number of
times the dice lands on the 17.
a Find the critical region for Michael’s test. (4 marks)
b State the probability of a Type I error occurring for Michael’s test. (1 mark)
David decides to throw the dice until the first time it lands on 17.
c Find the critical region for David’s test. (4 marks)
d State the probability of a Type I error occurring. (1 mark)
The actual probability of the dice landing on 17 is 0.0588.
e Calculate the probability of a Type II error occurring in David’s test. (2 marks)
f Calculate the probability of a Type II error occurring in Michael’s test. (2 marks)
8.2 Finding Type I and Type II errors using the normal distribution
You need to be able to find Type I and Type II errors using the normal distribution.
In the examples in the previous section P(Type I error), which gives the actual significance level, was
not equal to the target significance level. This was due to the discrete nature of the distributions used.
■ When a continuous distribution such as the normal distribution is used then P(Type I error)
is equal to the significance level of the test.
Example 6
Bags of sugar having a nominal weight of 1 kg are filled by a machine. From past experience it is
known that the weight, X kg, of sugar in the bags is normally distributed with a standard deviation
of 0.04 kg. At the beginning of each week a random sample of 10 bags is taken in order to see if
the machine needs to be reset. A test is then done at the 5% significance level with
H0 : μ = 1.00 kg and H1 : μ ≠ 1.00 kg. Find:
a the critical region for this test
b P(Type I error) for this test.
Assuming that the mean weight has in fact changed to 1.02 kg,
c find P(Type II error) for this test.
Online: Explore probabilities of Type I and Type II errors in a normal distribution using GeoGebra.
a The distribution of $\bar{X}$ is modelled by $N\left(1.0, \frac{0.04^2}{10}\right)$. (Since this is a two-tailed test you allow 2.5% at each tail.)
From the tables the critical region for Z is $Z < -1.96$ or $Z > 1.96$.
The critical region is found by rearranging:
$\bar{x} = 1 \pm 1.96 \times \sqrt{\frac{0.04^2}{10}}$
so the critical region is $\bar{X} < 0.9752$ or $\bar{X} > 1.0248$. (Notice once again that the critical region is in two parts.)
b P(Type I error) for this test will be the same as the significance level = 0.05.
… on your calculator, with $\sigma = \sqrt{\frac{0.04^2}{10}} = 0.012649\ldots$
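The quantities in Example 6 can be reproduced with `statistics.NormalDist` from the Python standard library. This is a minimal sketch rather than the book's method, and the variable names are hypothetical. It uses the exact 97.5% point of the standard normal distribution instead of the tabulated 1.96, so later decimal places can differ slightly from table-based answers.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 1.00, 0.04, 10, 0.05
se = sigma / sqrt(n)                     # standard deviation of the sample mean

# Part a: two-tailed critical values, 2.5% in each tail.
z = NormalDist().inv_cdf(1 - alpha / 2)
lower, upper = mu0 - z * se, mu0 + z * se

# Part b: for a continuous distribution P(Type I error) equals alpha.
h0_dist = NormalDist(mu0, se)
p_type_1 = h0_dist.cdf(lower) + (1 - h0_dist.cdf(upper))

# Part c: H0 is accepted when the sample mean falls between the critical
# values, evaluated with the true mean 1.02.
true_dist = NormalDist(1.02, se)
p_type_2 = true_dist.cdf(upper) - true_dist.cdf(lower)

print(round(lower, 4), round(upper, 4), round(p_type_1, 4), round(p_type_2, 4))
```

The critical values should match 0.9752 and 1.0248, and the Type I probability should come out as 0.05 up to floating-point rounding.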
When carrying out hypothesis tests, you want to keep P(Type I error) and P(Type II error) as low as
possible. The following example illustrates the relationship between Type I and Type II errors.
Example 7
The weight of jam in a jar, measured in grams, is distributed normally with a mean of 150 g and
a standard deviation of 6 g. The production process occasionally leads to a change in the mean
weight of jam per jar but the standard deviation remains unaltered.
The manager monitors the production process and for every new batch takes a random sample of 25
jars and weighs their contents to see if there has been any reduction in the mean weight of jam per jar.
Find the critical values for the test statistic $\bar{X}$, the mean weight of jam in a sample of 25 jars, using:
a a 5% level of significance
b a 1% level of significance.
Given that the true value of μ for the new batch is in fact 147,
c find the probability of a Type II error for each of the above critical regions.
$= 0.1963$ (4 d.p.)
(Use your calculator with $\sigma = \sqrt{\frac{6^2}{25}} = 1.2$.)
1% test
$P(\text{Type II error}) = P(\bar{X} > 147.2084 \mid \mu = 147)$
$= 0.4311$ (4 d.p.)
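Both significance levels in Example 7 can be handled by one small function. The sketch below is illustrative (the function name is hypothetical) and uses exact normal quantiles from the Python standard library, so the results may differ from calculator or table answers in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist


def lower_tail_test(mu0, sigma, n, alpha, mu_true):
    """Return (critical value, P(Type II error)) for a test of
    H0: mu = mu0 against H1: mu < mu0 based on the sample mean."""
    se = sigma / sqrt(n)
    critical = NormalDist(mu0, se).inv_cdf(alpha)        # reject if mean < critical
    p_type_2 = 1 - NormalDist(mu_true, se).cdf(critical)  # accept although mu = mu_true
    return critical, p_type_2


crit_5, type2_5 = lower_tail_test(150, 6, 25, 0.05, 147)
crit_1, type2_1 = lower_tail_test(150, 6, 25, 0.01, 147)

print(round(crit_5, 4), round(type2_5, 4))
print(round(crit_1, 4), round(type2_1, 4))
```

Lowering the significance level moves the critical value further from 150, which makes H0 harder to reject and so raises the Type II probability.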
Notice how in this example if we try to reduce P(Type I error) from 5% to 1% then P(Type II error)
increases from 0.1963 to 0.4311. A more detailed study of the interplay between these two
probabilities follows later in this chapter. However, you should be aware of this phenomenon and
appreciate one of the reasons why we do not always use a significance level that is very small.
The value of 5% is a commonly used level and, in a situation where a particular significance level is
not given, this value is recommended.
This does not mean that other significance levels are never used. When, for example, the results of
the research are highly important and making a Type I error could be very serious, a 1% significance
level might be used. In other cases a significance level of 10% might be used. An alternative method
of reducing the probability of a Type II error is to increase the sample size but this can increase the
cost or duration of a survey or experiment.
The relationship between the probabilities of Type I and Type II errors can be illustrated by imagining
pushing down on one side of a balloon.
The only way to push down on both sides at once (and reduce the overall thickness) is to allow the
air to move sideways. Using a larger balloon would allow you to reduce the overall thickness (this is
equivalent to increasing the size of the sample n).
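The effect of sample size can be seen numerically by reusing the jam-jar setting of Example 7 (H0: μ = 150, true mean 147, σ = 6). The sketch below is an illustration only: the function name is hypothetical, and the sample sizes 50 and 100 are assumptions added for comparison, not values from the text.

```python
from math import sqrt
from statistics import NormalDist


def type_2_probability(n, alpha, mu0=150.0, mu_true=147.0, sigma=6.0):
    """Return P(Type II error) for a lower-tail test on the sample mean."""
    se = sigma / sqrt(n)
    critical = NormalDist(mu0, se).inv_cdf(alpha)
    return 1 - NormalDist(mu_true, se).cdf(critical)


# Rows: sample size. Columns: 5% and 1% significance levels.
for n in (25, 50, 100):
    row = [round(type_2_probability(n, a), 4) for a in (0.05, 0.01)]
    print(n, row)
```

Reading across a row shows the trade-off between the two error types at a fixed n. Reading down a column shows that a larger sample reduces P(Type II error) without changing the significance level.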
Exercise 8B
1 The random variable $X \sim N(\mu, 3^2)$. A random sample of 20 observations of X is taken, and the
sample mean $\bar{x}$ is taken to be the test statistic. It is desired to test H0 : μ = 50 against
H1 : μ > 50, using a 1% level of significance.
a Find the critical region for this test.
b State the probability of a Type I error for this test.
Given that the true mean was later found to be 53,
c find the probability of a Type II error.
2 The random variable $X \sim N(\mu, 2^2)$. A random sample of 16 observations of X is taken, and the
sample mean $\bar{x}$ is taken to be the test statistic. It is desired to test H0 : μ = 30 against
H1 : μ < 30, using a 5% level of significance.
a Find the critical region for this test.
b State the probability of a Type I error for this test.
Given that the true mean was later found to be 28.5,
c find the probability of a Type II error.
3 The random variable $X \sim N(\mu, 4^2)$. A random sample of 25 observations of X is taken, and the
E 4 A manufacturer claims that the average outside diameter of a particular washer produced by his
factory is 15 mm. The diameter is assumed to be normally distributed with a standard deviation
of 1 mm. The manufacturer decides to take a random sample of 25 washers from each day’s
production in order to monitor any changes in the mean diameter.
a Using a significance level of 5%, find the critical region to be used for this test. (4 marks)
Given that the average diameter had in fact increased to 15.6 mm,
b find the probability that the day’s production would be wrongly accepted. (2 marks)
E/P 5 The number of patients that a medic can inoculate with a vaccine in one day can be modelled by
a normal distribution with mean 40 and standard deviation 8. The manufacturer of the vaccine
claims that a new method of inoculation will speed up the rate at which the medic works.
A random sample of 30 medics tried out the new method of inoculation and the average number
of patients they dealt with per day, $\bar{X}$, was recorded.
a Using a 5% significance level, find the critical value of $\bar{X}$. (4 marks)
The average number of patients dealt with per day using the new method of inoculation was in
fact 42.
b Find the probability of making a Type II error. (2 marks)
The manufacturer of the vaccine would like to lessen the probability of a Type II error being
made and recommends that the significance level be changed.
c State, giving a reason, what recommendation you would make. (1 mark)
■ The power of a test is the probability of rejecting the null hypothesis when it is not true.
Power = 1 − P(Type II error)
= P(being in the critical region when H0 is false)
The greater the power of a test, the greater the probability of rejecting H0 when H0 is false. It follows
that the higher the power, the better the test.
The table on page 148 can now be rewritten to show the probabilities for the different situations.
                                  Truth
Conclusion of test     H0 is true                  H0 is false
Accept H0              OK                          P(Type II error)
Reject H0              Size = P(Type I error)      Power = 1 − P(Type II error)
Example 8
The random variable X has a binomial distribution. A random sample of size 25 was taken to test
H0 : p = 0.30 against H1 : p < 0.30 using a 10% level of significance.
a Find the critical region for this test.
b Find the size of this test.
Given that p = 0.20,
c calculate the power of this test.
a X ∼ B(25, p)
H0: p = 0.30  H1: p < 0.30
Assume H0 so that X ∼ B(25, 0.30).
H0 is rejected when $X \le c$ where $P(X \le c) < 0.10$.
From tables: (Use tables of B(25, 0.30).)
$P(X \le 4) = 0.0905$
$P(X \le 5) = 0.1935$
So the critical region is $X \le 4$.
b Size = P(Type I error)
$= P(X \le 4 \mid p = 0.30)$
$= 0.0905$
(The size is the actual significance level of the test. Use your calculator to find $P(X \le 4 \mid p = 0.30)$.)
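Size and power are the same calculation carried out with different values of p: the probability that X falls in the critical region X ≤ 4. The sketch below is illustrative (hypothetical function name, standard library only) and evaluates that probability at p = 0.30 for the size and at p = 0.20 for part c.

```python
from math import comb


def binom_cdf(c, n, p):
    """Return P(X <= c) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))


n, c = 25, 4                      # critical region X <= 4 from part a
size = binom_cdf(c, n, 0.30)      # P(reject H0 | H0 true)
power = binom_cdf(c, n, 0.20)     # P(reject H0 | p = 0.20)
p_type_2 = 1 - power

print(round(size, 4), round(power, 4), round(p_type_2, 4))
```

The size should agree with the 0.0905 found in part b. The power at p = 0.20 is less than one half, so under this alternative the test fails to reject H0 more often than not.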
Example 9
Jam is sold in jars. The amount of jam, in grams, in a jar is normally distributed with mean μ and
standard deviation 5. The manufacturer claims that μ is 106 and quality control officers will take
action against the manufacturer if μ < 106. A random sample of 30 jars is examined and a 5% level
of significance is used.
a Find the critical region for the sample mean using this test.
Given that in fact μ = 102,
b find the power of this test.
a $n = 30$ so $\bar{X} \sim N\left(106, \frac{5^2}{30}\right)$ (Assuming H0 is true, state the distribution of the statistic.)
Reject H0 when $\bar{X} < c$.
Critical region for Z is $Z < -1.6449$ (Use tables to find the critical region for Z.)
So $\dfrac{\bar{X} - 106}{5/\sqrt{30}} < -1.6449$
i.e. $\bar{X} < 104.498\ldots$
b If $\mu = 102$ then $\bar{X} \sim N\left(102, \frac{5^2}{30}\right)$.
Power $= P(\bar{X} < 104.498\ldots \mid \mu = 102)$
$= 0.9968$ (4 d.p.)
Example 10
A particular mobile-phone provider fails to deliver text messages with probability p.
Brooke wants to investigate whether p > 0.02.
Using H0: p = 0.02 and H1: p > 0.02, Brooke notes the number of text messages she is able to send
successfully up until the first failure. If this value is less than or equal to 5 she rejects H0. If it is
more than 100 she accepts H0. If it is more than 5 but less than or equal to 100 she notes the
number of additional text messages she is able to send successfully up until the next failure.
She rejects H0 if this is less than or equal to 5 and accepts it otherwise.
a Find the size of this test.
b Calculate the power of this test when p = 0.015.
a Let X = number of messages sent up to and including first failure.
Then X ∼ Geo(p).
Assume H0 is true, so that X ∼ Geo(0.02).
(You need to calculate the probability that H0 is rejected, assuming that it is true. You are given the critical region, so find P(H0 rejected | p = 0.02).)
$P(X \le 5) = 1 - (1 - 0.02)^5 = 0.09607\ldots$
$P(5 < X \le 100) = P(X \le 100) - P(X \le 5)$
$= 1 - (1 - 0.02)^{100} - 0.09607\ldots$
$= 0.77130\ldots$
Problem-solving: You can use Geo(0.02) to model the number of text messages up to and including the first failure. After the first failure, the number of text messages up to and including the next failure also has distribution Geo(0.02).
$P(H_0 \text{ rejected} \mid p = 0.02) = P(X \le 5) + P(5 < X \le 100) \times P(X \le 5)$
$= 0.09607\ldots + 0.77130\ldots \times 0.09607\ldots$
$= 0.17018\ldots$
The size of the test is 0.1702 (4 d.p.).
b Assume p = 0.015 so that Y ∼ Geo(0.015).
(The power of the test is P(H0 is rejected | p = 0.015). Repeat your calculation using a different assumed value of p.)
$P(Y \le 5) = 1 - (1 - 0.015)^5 = 0.07278\ldots$
$P(5 < Y \le 100) = P(Y \le 100) - P(Y \le 5)$
$= 1 - (1 - 0.015)^{100} - 0.07278\ldots$
$= 0.70660\ldots$
$P(H_0 \text{ rejected} \mid p = 0.015) = P(Y \le 5) + P(5 < Y \le 100) \times P(Y \le 5)$
$= 0.12421\ldots$
The power of the test when p = 0.015 is 0.1242 (4 d.p.).
(This is quite a small value for the power of the test. This suggests that the test is not very useful when p = 0.015.)
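Because the size and the power use the same rejection rule with different values of p, the two-stage calculation in Example 10 can be written once as a function of p. This is a minimal sketch using only built-in arithmetic; the function and parameter names are hypothetical.

```python
def geo_cdf(k, p):
    """Return P(X <= k) for X ~ Geo(p), counting trials up to and
    including the first failure to deliver."""
    return 1 - (1 - p) ** k


def prob_reject(p, reject_at=5, continue_until=100):
    """Return P(H0 rejected) for the two-stage test.

    Stage 1 rejects if X <= reject_at and accepts if X > continue_until.
    Otherwise a second, independent count is taken and H0 is rejected
    if that count is <= reject_at.
    """
    first_reject = geo_cdf(reject_at, p)
    go_to_stage_2 = geo_cdf(continue_until, p) - first_reject
    return first_reject + go_to_stage_2 * first_reject


size = prob_reject(0.02)     # part a
power = prob_reject(0.015)   # part b

print(round(size, 4), round(power, 4))
```

Evaluating `prob_reject` over a range of values of p gives the probability of rejecting H0 at each one, which is a convenient way to see where the test is and is not sensitive.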
Exercise 8C
1 The random variable $X \sim N(\mu, 3^2)$. A random sample of 25 observations of X is taken and the
sample mean x¯ is taken as the test statistic. It is desired to test H0 : μ = 20 against H1 : μ > 20
using a 5% significance level.
a Find the critical region for this test.
b Given that μ = 20.8, find the power of this test.
2 The random variable X has a binomial distribution. A sample of 20 is taken from it. It is
desired to test H0 : p = 0.35 against H1 : p > 0.35 using a 5% significance level.
a Calculate the size of this test.
b Given that p = 0.36, calculate the power of this test.
3 The random variable X has a Poisson distribution. A sample is taken and it is desired to test
H0 : λ = 4.5 against H1 : λ < 4.5 using a 5% significance level.
a Find the size of this test.
b Given that λ = 4.1, find the power of this test.
E 4 A manufacturer claims that a particular rivet produced in his factory has a diameter of 2 mm,
and that the diameter is normally distributed with a variance of 0.004 mm².
A random sample of 25 rivets is taken from a day’s production to test whether the mean
diameter had altered, up or down, from the stated figure. A 5% significance level is to be used
for this test.
If the mean diameter had in fact altered to 2.02 mm, calculate the power of this test. (5 marks)
E/P 5 In a binomial experiment consisting of 10 trials the random variable X represents the number
of successes, and p is the probability of a success.
In a test of H0 : p = 0.3 against H1 : p > 0.3, a critical region of x > 7 is used.
Find the power of this test when
a p = 0.4 (3 marks)
b p = 0.8. (3 marks)
c Comment on your results. (1 mark)
7 The random variable X has a geometric distribution. It is desired to test H0: p = 0.01 against
H1: p > 0.01 using a 5% significance level.
a Find the critical region for this test.
b Given that p = 0.2, calculate the power of this test.
P 8 The random variable X has a geometric distribution. It is desired to test H0: p = 0.01 against
H1: p ≠ 0.01 using a 5% significance level.
a Find the critical region for this test.
b Given that p = 0.02, calculate the power of this test.
E/P 9 The wallpaper produced by a certain manufacturer has defects that occur randomly at a
constant rate of λ per roll. If λ is thought to be greater than 0.8 then action has to be taken.
Using H0: λ = 0.8 and H1: λ > 0.8, a quality control manager takes a sample of 10 rolls and
rejects H0 if there are 12 or more defects. If there are 9 or fewer defects then H0 is accepted.
If there are 10 or 11 defects, a second sample of 10 rolls is taken and H0 is rejected if there are
8 or more defects in this second sample, otherwise it is accepted.
a Find the size of this test. (4 marks)
b Find the power of this test when λ = 1. (3 marks)
E/P 10 A sweet manufacturer makes boxes of jelly beans. The number of jelly beans in each box is
assumed to be normally distributed with standard deviation 5.
A consumer group wants to test the manufacturer’s claim that the mean number of jelly beans
in each box is 80. The group takes repeated samples of 20 boxes and records the mean number
of jelly beans per box in each sample.
The random variable X represents the number of samples the group need to take before they
obtain a sample with a mean less than 79.
If X < 10 the group rejects the company’s claim.
a Find the size of this test. (5 marks)
b Given that the actual mean number of jelly beans in each box is 81, find the power
of this test. (5 marks)
Challenge
A jam factory has an automated system for sealing their jars, and the
expected probability of error when the machines are well calibrated is
8%. The jars are sealed and placed into boxes of 60. To see whether the
machine that sealed all the jars in a specific box needs recalibrating,
a series of tests is performed. The first box is inspected by taking a
sample of 20 jars and performing a test, with 5% significance level,
to see whether the probability of a defective seal is greater than 8%.
If the first box fails the test they conclude that the machine needs
recalibrating, but if it passes the test they move on to the second box,
and perform the same test. Once again, if the box fails the test they
conclude that the machine needs recalibrating, but if it passes the test
they move on to the next box, and so on until a box fails the test.
a What is the maximum number of boxes that can be inspected such
that the probability of a Type I error is smaller than 10%?
The factory decides to conclude that the machine does not need
recalibrating if the first four boxes all pass the test.
b Given that after the second box the machine is decalibrated, increasing
the probability of a defective seal to 20%, find the power of the test,
knowing that only 4 boxes were inspected. You may assume that the
probability of a defective seal for each jar in the first two boxes is 8%.
■ The power function of a test is the function of the parameter θ which gives the probability that
the test statistic will fall in the critical region of the test if θ is the true value of the parameter.
A power function enables you to calculate the power of the test for any given value of θ, and thus to
plot a graph of power against θ.
Example 11
Past experience has shown that the number of accidents that take place at a road junction has a
Poisson distribution with an average of 3.5 accidents per month. A trading estate is built along
one of the roads leading away from the junction and the local council is anxious that this may have
increased the accident rate. To see if the number of accidents had increased, a test was set up with
the null hypothesis H0 : λ = 3.5 and with the alternative hypothesis being accepted if the number of
accidents X within the first month after the alteration was 7 or more ($X \ge 7$).
a Find the size of the test.
b Find the power function for the test and sketch the graph of the power function.
a Size of test = P(reject H0 when it is true)
$= P(X \ge 7 \mid X \sim \text{Po}(3.5))$
$= 1 - 0.9347 = 0.0653$
(You can use conditional probability notation to write your assumptions quickly.)
b Power function = P(reject H0 when it is false)
$= P(X \ge 7 \mid X \sim \text{Po}(\lambda))$
$= 1 - P(X \le 6 \mid X \sim \text{Po}(\lambda))$
$= 1 - e^{-\lambda}\left(1 + \lambda + \frac{\lambda^2}{2} + \frac{\lambda^3}{6} + \frac{\lambda^4}{24} + \frac{\lambda^5}{120} + \frac{\lambda^6}{720}\right)$
Problem-solving: You do not know the value of λ. Your power function will be given in terms of this unknown parameter.
(Use the finite sum of the probabilities for X = 0, 1, 2, 3, 4, 5, 6 to find the power function.)
This enables values of the power of the test to be calculated for different values of λ.
λ = 4 gives power = 0.1107
λ = 5 gives power = 0.2378
λ = 6 gives power = 0.3937
λ = 7 gives power = 0.5503
λ = 8 gives power = 0.6866
λ = 9 gives power = 0.7932
λ = 10 gives power = 0.8699
(Often in an examination a partially completed table will be given.)
The graph is as shown below.
[Graph: power of the test plotted against λ for λ from 3 to 10, with power on the vertical axis from 0 to 1.0.]
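The power function in Example 11 is straightforward to tabulate. The sketch below is an illustration using the Python standard library; the function names are hypothetical. It evaluates 1 − P(X ≤ 6) for X ∼ Po(λ) at the same values of λ as the worked solution.

```python
from math import exp, factorial


def poisson_cdf(k, lam):
    """Return P(X <= k) for X ~ Po(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))


def power(lam, critical=7):
    """Return P(X >= critical) for X ~ Po(lam), the power function of the test."""
    return 1 - poisson_cdf(critical - 1, lam)


size = power(3.5)                       # the power function evaluated at the H0 value
table = {lam: round(power(lam), 4) for lam in range(4, 11)}

print(round(size, 4))
for lam, value in table.items():
    print(lam, value)
```

Evaluating the power function at the null value λ = 3.5 gives the size of the test, and the tabulated values rise towards 1 as λ moves further from 3.5, which is the shape shown in the graph.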
Power functions are particularly useful when comparing two different tests.
■ When comparing two tests of comparable size, you should recommend the test with the
higher power within the likely range of the parameter.
Example 12
A manufacturer of sweets supplies a mixed assortment of chocolates in a jar. He claims that 40%
of the chocolates have a ‘hard centre’, the remainder being ‘soft centred’.
A shopkeeper does not believe the manufacturer’s claim and proposes to test it using the following
hypotheses.
H0 : p = 0.4 H1 : p < 0.4
where p is the proportion of ‘hard centres’ in the jar. Two tests are proposed.
In test A he takes a random sample of 10 chocolates from the jar and rejects H0 if the number of
‘hard centres’ is less than 2.
a Find the size of test A.
b Show that the power function of test A is given by
$(1 - p)^{10} + 10p(1 - p)^9$.
In test B he takes a random sample of 5 chocolates from the jar and if there are no ‘hard centres’
he rejects H0, otherwise he takes a second sample of 5 chocolates and H0 is rejected if there are no
further ‘hard centres’ on this second occasion.
c Find the size of test B.
d Find an expression for the power function of test B.
The powers for test A and test B for various values of p are given in the table.
p                  0.1    0.2    0.25   0.3    0.35
Power for test A   0.74   r      0.24   s      0.09
Power for test B   0.83   0.54   0.42   0.31   0.22
d Power of test B = P(0 hard centres in first 5)
+ P(0 hard centres in second 5 and > 0 hard centres in first 5)
$= P(X = 0 \mid p) + (1 - P(X = 0 \mid p)) \times P(X = 0 \mid p)$
$= (1 - p)^5 + (1 - (1 - p)^5)(1 - p)^5$
$= (1 - p)^5 (1 + 1 - (1 - p)^5)$
$= (1 - p)^5 (2 - (1 - p)^5)$
$= 2(1 - p)^5 - (1 - p)^{10}$
Problem-solving: Consider the conditions that are necessary for H0 to be rejected: either X = 0 in the first sample, or X > 0 in the first sample and X = 0 in the second sample.
e Test A: p = 0.2  Power $= (1 - 0.2)^{10} + 10(0.2)(1 - 0.2)^9$
$= 0.38$
so r = 0.38
p = 0.3  Power $= (1 - 0.3)^{10} + 10(0.3)(1 - 0.3)^9$
$= 0.15$
so s = 0.15
f Power for test B > Power for test A for all the given values of p, so he should use test B.
(The reason for the final comment should be based upon the calculations of the power.)
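The two power functions from Example 12 can be compared side by side. The sketch below is illustrative only and the function names are hypothetical. It evaluates the expressions derived in parts b and d at the tabulated values of p, and also at p = 0.4, where each power function gives the size of its test.

```python
def power_a(p):
    """Test A: reject H0 if fewer than 2 hard centres in a sample of 10."""
    return (1 - p) ** 10 + 10 * p * (1 - p) ** 9


def power_b(p):
    """Test B: reject H0 if no hard centres in the first sample of 5,
    or otherwise if none in the second sample of 5."""
    return 2 * (1 - p) ** 5 - (1 - p) ** 10


print("p     A     B")
for p in (0.1, 0.2, 0.25, 0.3, 0.35):
    print(p, round(power_a(p), 2), round(power_b(p), 2))

# Evaluating each power function at the H0 value p = 0.4 gives its size.
print(round(power_a(0.4), 4), round(power_b(0.4), 4))
```

The rows should reproduce the table, including r and s. The final line is worth inspecting alongside the recommendation in part f, since a comparison of powers is most meaningful when the two tests have comparable sizes.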
Example 13
A local park believes the fox population in the area has decreased. They want to test for the
probability, p, that a fox will be observed on any given day. They count the number of days, X,
that pass until the first observation of a fox. They test H0: p = 0.1 against H1: p < 0.1 and reject H0
if X > 30.
a Find the size of this test.
b Find the power function for the test.
Exercise 8D
E/P 1 A single observation x is taken from a Poisson distribution with parameter λ. This observation is
to be used to test H0: λ = 6.5 against H1: λ < 6.5. The critical region chosen was $x \le 2$.
a Find the size of the test. (4 marks)
b Show that the power function of this test is given by
$e^{-\lambda}\left(1 + \lambda + \tfrac{1}{2}\lambda^2\right)$ (3 marks)
The table gives the value of the power function to two decimal places.
λ       1      2      3      4      5      6
Power   0.92   s      0.42   0.24   t      0.06
E/P 2 In a binomial experiment consisting of 12 trials, X represents the number of successes and p the
probability of a success.
In a test of H0: p = 0.45 against H1: p < 0.45 the null hypothesis is rejected if the number of
successes is 2 or less.
a Find the size of this test. (4 marks)
b Show that the power function for this test is given by
$(1 - p)^{12} + 12p(1 - p)^{11} + 66p^2(1 - p)^{10}$ (3 marks)
c Find the power of this test when p is 0.3. (1 mark)
3 In a binomial experiment consisting of 10 trials, the random variable X represents the number of
successes and p the probability of a success.
In a test of H0: p = 0.4 against H1: p > 0.4, a critical region of x > 8 was used.
Find the power of this test when:
a p = 0.5
b p = 0.8.
c Comment on your results.
E/P 4 A certain gambler always calls heads when a coin is spun. Before he uses a coin he tests it to see
whether or not it is fair and uses the following hypotheses:
$H_0: p = \tfrac{1}{2}$  $H_1: p < \tfrac{1}{2}$
where p is the probability that the coin lands heads on a particular spin. Two tests are proposed.
In test A the coin is spun 10 times and H0 is rejected if the number of heads is 2 or fewer.
a Find the size of test A. (4 marks)
b Explain why the power of test A is given by
$(1 - p)^{10} + 10p(1 - p)^9 + 45p^2(1 - p)^8$ (3 marks)
In test B the coin is first spun 5 times. If no heads result, H0 is immediately rejected. Otherwise
the coin is spun a further 5 times and H0 is rejected if no heads appear on this second occasion.
c Find the size of test B. (4 marks)
d Find an expression for the power of test B in terms of p. (3 marks)
The power for test A and the power for test B are given in the table for various values of p.
p                  0.1      0.2      0.25     0.3      0.35     0.4
Power for test A   0.9298   0.6778   ____     0.3828   ____     0.1673
Power for test B   0.8323   0.5480   0.4183   0.3079   0.2186   0.1495
e Find the power for test A when p is 0.25 and 0.35. (2 marks)
f Giving a reason, advise the gambler about which test he should use. (1 mark)
A 5 In an experiment the probability of success in each trial is constant, and the random variable X
represents the number of trials needed to get one success. A test of H0: p = 0.15 against
H1: p < 0.15 with a 1% significance level is used.
a Find the size of the test.
b Find the power function.
E/P 6 A cyclist uses new tyres every time he does a time trial. He has found that on one specific route
he has a probability of 0.9 of not getting a flat tyre. After changing tyre brands he believes that
the new tyres are more resistant, and decides to perform a test, with 5% significance level, by
doing 10 trials on the route and seeing how many times he would complete it without a flat tyre.
a Find the size of the test. (4 marks)
b Find the power function of the test. (2 marks)
c Find the power function for the test if, instead of 10 trials, he had done 12 trials. (5 marks)
d Given that the probability of completing the trial without a flat tyre with the new brand is
0.95, calculate which number of trials gives a more accurate test result. (3 marks)
Mixed exercise 8
E 2 The random variable X has a Poisson distribution. A sample is taken and it is desired to test
H0: λ = 3.5 against H1: λ < 3.5 using a 5% significance level.
a Find the critical region for this test. (4 marks)
b State the probability of committing a Type I error for this test. (2 marks)
Given that the true value of λ is 3.0,
c find the power of this test. (2 marks)
E 3 The random variable X ∼ N(μ, 9). A random sample of 18 observations is taken, and it is desired
to test H0: μ = 8 against H1: μ ≠ 8, at the 5% significance level. The test statistic to be used is
$Z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
A E/P 4 A bird observatory wishes to test whether the migration rate of geese has changed from that of
10 per day. First they take note of how many geese are observed flying in a migratory pattern
on a specific day. If the number of geese migrating is greater than or equal to 4 and less than or
equal to 17, then they conclude that the rate has not changed. If they observe 3 or fewer geese,
then on the following day they conduct further observations, and if they observe 2 or fewer
geese they conclude that the rate has decreased, otherwise they conclude that it hasn’t changed.
If on the first day they observe 18 or more geese migrating, then on the following day they also
conduct further observations and if they observe 19 or more geese migrating they conclude that
the rate has increased, otherwise they conclude that it has not changed.
a Find the size of the test. (4 marks)
Given that the migration rate of the geese actually dropped to 5 per day,
b find the power of the test. (6 marks)
E 5 A single observation, x, is taken from a Poisson distribution with parameter λ. The observation
is used to test H0: λ = 4.5 against H1: λ > 4.5. The critical region chosen for this test was x ≥ 8.
a Find the size of this test. (4 marks)
b The table gives the power of the test for different values of λ.
λ 1 2 3 4 5 6 7 8 9 10
Power 0 0.0011 0.0119 r 0.1334 s 0.4013 0.5470 t 0.7798
E 6 In a binomial experiment consisting of 15 trials, X represents the number of successes and p the
probability of success.
In a test of H0: p = 0.45 against H1: p < 0.45 the critical region for the test was X ≤ 3.
a Find the size of the test. (4 marks)
b Use the binomial cumulative distribution function to complete the table given below.
(3 marks)
p 0.1 0.2 0.3 0.4 0.5
Power 0.944 s 0.2969 t 0.0176
c Draw the graph of the power function for this test. (1 mark)
E 7 A company buys rope from Bindings Ltd and it is known that the number of faults per 100 m
of their rope follows a Poisson distribution with mean 2. The company is offered 100 m of rope
by Tieup, a newly established rope manufacturer. The company is concerned that the rope from
Tieup might be of poor quality.
a Write down the null and alternative hypotheses appropriate for testing that rope from
Tieup is in fact as reliable as that from Bindings Ltd. (1 mark)
b Derive a critical region to test your null hypothesis with a size of approximately 0.05.
(4 marks)
c Calculate the power of this test if rope from Tieup contains an average of 4 faults
per 100 m. (3 marks)
A E/P 8 The number of faulty garments produced per day by machinists in a clothing factory has
a Poisson distribution with mean 2. A new machinist is trained and the number of faulty
garments made in one day by the new machinist is counted.
a Write down the appropriate null and alternative hypotheses involved in testing the theory
that the new machinist is less reliable than the other machinists. (1 mark)
b Derive a critical region, of size approximately 0.05, to test the null hypothesis. (4 marks)
c Calculate the power of this test if the new machinist produces an average of 3 faulty
garments per day. (3 marks)
The number of faulty garments produced by the new machinist over three randomly selected
days is counted.
d Derive a critical region, of approximately the same size as in part b, to test the null
hypothesis. (2 marks)
e Calculate the power of this test if the machinist produces an average of 3 faulty garments
per day. (3 marks)
f Comment briefly on the difference between the two tests. (1 mark)
E/P 10 A proportion p of the items produced by a laboratory are defective. A technician selects a
random sample of 10 items from each batch produced to check whether or not there is evidence
that p > 0.10. The criterion that the technician uses for rejecting the hypothesis that p is 0.10 is
that there are more than 4 defective items in the sample.
a Find the size of the test. (2 marks)
The table gives some values, to 2 decimal places, of the power function of this test.
p 0.15 0.20 0.25 0.30 0.35 0.40
Power 0.01 0.03 u 0.15 0.25 0.37
E/P 11 Accidents on a stretch of motorway occur at an average rate of λ per week. A road safety
officer takes a random sample of 10 weeks to test whether or not there is evidence that λ > 0.3.
The criterion that the officer uses for rejecting the hypothesis that λ = 0.3 is that there are more
than 5 accidents in the sample.
a Find the size of the test. (2 marks)
The table gives some values, to 2 decimal places, of the power function of this test.
λ 0.4 0.5 0.6 0.7 0.8 0.9 1.0
Power 0.21 a 0.55 0.70 0.81 0.88 0.93
Challenge
A Jane and Emma decide to test a pair of dice from a new board game.
They suspect that at least one of them has probability higher than $\tfrac{1}{6}$ of
showing the value one.
Jane decides to throw both dice 12 times. If a pair of ones appears 2 or
more times, she concludes that at least one of the dice is biased.
a Find the size of Jane’s test.
b Express the power of Jane’s test in terms of the parameter p, which
represents the probability of obtaining a pair of ones.
Emma decides to throw one dice 6 times. If the value one appears 4 or
more times she concludes that the dice is biased. If it appears fewer
than 4 times, then she throws the other dice 6 times, and concludes that
the second dice is biased if the value one appears 4 or more times.
c Find the size of Emma’s test.
Now assume that one of the dice is fair, and let q be the probability of
obtaining the value one on the other dice.
d Show that the power of Jane’s test is given by the expression
$1 - \left(1 - \dfrac{q}{6}\right)^{12} - 2q\left(1 - \dfrac{q}{6}\right)^{11}$
e Show that the power of Emma’s test is given by the expression
$0.0087 + 14.8695q^4 - 23.7912q^5 + 9.913q^6$
Below is a graph of the power function for Emma’s test.
[Graph: power function for Emma's test. Horizontal axis: q, from 0 to 0.45. Vertical axis: Power, from 0 to 0.2.]
f By using a table of values, draw the graph of the power function for
Jane’s test.
g Given that the parameter q lies between 0.1 and 0.4, explain, giving
your reasons, which test you would recommend.
3 When a continuous distribution such as the normal distribution is used then P(Type I error) is
equal to the significance level of the test.
4 The size of a test is the probability of rejecting the null hypothesis when it is in fact true and
this is equal to the probability of a Type I error.
5 The power of a test is the probability of rejecting the null hypothesis when it is not true.
Power = 1 − P(Type II error) = P(being in the critical region when H0 is false)
6 The power function of a test is the function of the parameter θ which gives the probability
that the test statistic will fall in the critical region of the test if θ is the true value of the
parameter.
7 When comparing two tests of comparable size, you should recommend the test with the higher
power within the likely range of the parameter.
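The definitions of size, power and power function above can be checked numerically. The following is a minimal sketch in Python, not part of the original text, using the test from Example 1 (X ∼ B(20, p), H0: p = 1/4, reject H0 if X ≥ 9). The function names `binomial_cdf` and `power`, and the true value p = 0.5, are chosen here purely for illustration.

```python
from math import comb


def binomial_cdf(k, n, p):
    """Return P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(k + 1))


def power(p, n=20, c=9):
    """Power function: P(X >= c) when the true probability is p.

    n = 20 and c = 9 are the values from Example 1 (reject H0 if X >= 9).
    """
    return 1 - binomial_cdf(c - 1, n, p)


# Size: the power function evaluated at the H0 value p = 1/4.
# Example 1 quotes this probability as 0.0409.
size = power(0.25)

# Power and P(Type II error) for an illustrative true value p = 0.5
# (this value is an assumption made for the sketch).
power_at_half = power(0.5)
type_ii_at_half = 1 - power_at_half

print(f"size = {size:.4f}")
print(f"power at p = 0.5: {power_at_half:.4f}")
print(f"P(Type II error) at p = 0.5: {type_ii_at_half:.4f}")
```

Evaluating `power` over a range of values of p gives a table of values for the power function. Doing this for two tests of comparable size lets you compare their power over the likely range of the parameter.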