
1.8*_Quality of tests

This document discusses the concepts of Type I and Type II errors in hypothesis testing, including their definitions and implications. It provides examples of how to calculate these errors using different statistical distributions and significance levels. The chapter also emphasizes the importance of understanding these errors when evaluating the reliability of hypothesis tests.

8 Quality of tests

Objectives
After completing this chapter you should be able to:
● Know about Type I and Type II errors
● Find Type I and Type II errors using the normal distribution
● Calculate the size and power of a test
● Draw a graph of the power function for a test

Prior knowledge check


1 Daily mean temperature in a UK town is modelled as X ∼ N(µ, 2.3²).
The mean of a random sample of 20 recorded mean daily temperatures taken in 2015 is 11.1 °C.
Test whether µ is greater than 10 °C at the 1% level of significance. State your hypotheses clearly.
← Statistics and Mechanics Year 2, Chapter 3

2 A single observation is taken from each distribution and used to test H0 against H1.
Find the critical region for each test.
a X ∼ Po(λ), H0: λ = 5 against H1: λ ≠ 5 using a 10% level of significance. ← Sections 5.1, 5.2
b Y ∼ Geo(p), H0: p = 0.15 against H1: p > 0.15 using a 5% level of significance. ← Sections 5.3, 5.4

Hypothesis tests can sometimes lead to incorrect conclusions. You can analyse hypothesis tests to
work out how reliable they are. This information is very important when using hypothesis testing to
determine the efficacy of new drugs and medical procedures. → Exercise 8B, Q5


8.1 Type I and Type II errors


When you carry out a hypothesis test, you make an assumption about the distribution of a test
statistic. You then compare the probability of the observed result occurring with the significance
level of the test, and decide whether to accept or reject this assumption. This example illustrates a
hypothesis test based on the parameter, p, of a binomial distribution.

Example 1

One rainy day during the summer holidays, a family of four were playing a simple game of cards.
The game was one of chance so the probability of any particular person winning should have
been 1/4. After playing a number of games, Robert complained that his younger sister Sarah must
have been cheating as she kept winning. Their parents quickly intervened and decided to carry out
a proper investigation and carefully watched the next 20 games.
Find the critical region for a one-tailed test using a 5% level of significance.

H0: p = 1/4    H1: p > 1/4
(If Sarah is cheating then you would expect the proportion of games she wins to be more than 1/4.)
Let X = the number of games Sarah wins out of the next 20.
So X ∼ B(20, 1/4)
(State the distribution of the statistic assuming H0 is true.)
Reject H0 if X ≥ c where P(X ≥ c) < 0.05.
(Use tables to find the smallest value of c with P(X ≥ c) < 0.05.)
From tables:
P(X ≤ 8) = 0.9591 so P(X ≥ 9) = 0.0409
P(X ≤ 7) = 0.8982 so P(X ≥ 8) = 0.1018
(The first of these is the case with the smallest value of c.)
So the critical region is X ≥ 9.
(Sometimes 0.0409 is called the actual significance level.)
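
The table lookup in Example 1 can be checked numerically. What follows is a minimal Python sketch rather than part of the original text: the helper names binom_upper_tail and upper_critical_value are illustrative assumptions, and the probabilities are computed directly from the binomial formula instead of being read from tables.

```python
from math import comb


def binom_upper_tail(n, p, c):
    """Return P(X >= c) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c, n + 1))


def upper_critical_value(n, p, alpha):
    """Return the smallest c with P(X >= c) < alpha when H0 is true."""
    for c in range(n + 2):  # c = n + 1 gives an empty tail (probability 0)
        if binom_upper_tail(n, p, c) < alpha:
            return c


# Example 1: X ~ B(20, 1/4), one-tailed test at the 5% level.
c = upper_critical_value(20, 0.25, 0.05)
actual_level = binom_upper_tail(20, 0.25, c)
print(f"critical region: X >= {c}, actual significance level = {actual_level:.4f}")
```

The search starts at c = 0 and stops at the first value whose upper tail falls below the target level, which mirrors the "smallest value of c" rule used with the tables.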

In the example above, if Sarah wins 9 or more games, her parents will reject the null hypothesis, and
conclude that p > 1/4 (or in other words, that Sarah was cheating). It is possible that this conclusion
will be incorrect. If p = 1/4, Sarah might still win 9 or more games by chance. The probability of this
occurring is 0.0409, the actual significance level of the test. Rejecting H0 in this situation is called a Type I error.

■ A Type I error is when you reject H0, but H0 is in fact true. The probability of a Type I error is
the same as the actual significance level of the hypothesis test.

It is also possible that Sarah was cheating, but that she still only wins 8 or fewer games. In this case
her parents would accept the null hypothesis, and conclude incorrectly that p = 1/4. This is called a
Type II error.

■ A Type II error is when you accept H0, but H0 is in fact false.


Watch out In order to calculate the probability of a Type II error
you would need to know the actual value of the parameter p.
Because H0 is false, you usually don’t have this information.


This table summarises the types of error that can occur in a hypothesis test:

Conclusion of test | Truth: H0 is true | Truth: H0 is false
Accept H0          | OK                | Type II error
Reject H0          | Type I error      | OK

Example 2

Use the situation in Example 1.


a Find the probability of a Type I error.
b If in fact Sarah was cheating and p = 0.35, find the probability of a Type II error.

a H0: p = 1/4    H1: p > 1/4
Critical region X ≥ 9
(From Example 1 state the hypotheses and critical region.)
P(Type I error) = P(rejecting H0 when H0 is true)
= P(X ≥ 9 | X ∼ B(20, 0.25))
= 0.0409
(If H0 is true p = 0.25. Use the binomial cumulative distribution function on your calculator:
P(X ≥ 9) = 1 − P(X ≤ 8) = 1 − 0.95907…
Notice that this is different from the nominal significance level of 5%.)
b P(Type II error) = P(accepting H0 when H0 is false)
= P(X ≤ 8 | H0 is false)
(To accept H0 you need X ≤ 8.)
Given that p = 0.35,
P(Type II error) = P(X ≤ 8 | X ∼ B(20, 0.35))
= 0.7624
(The statement ‘H0 is false’ does not provide a value for p so in examples of this sort a value of p
is usually given.)
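
The two error probabilities in Example 2 use the same critical region but different assumed values of p. The short Python sketch below illustrates this; it is not from the original text, and binom_cdf is an illustrative helper name.

```python
from math import comb


def binom_cdf(n, p, k):
    """Return P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Critical region from Example 1: reject H0 when X >= 9, accept when X <= 8.
type_i = 1 - binom_cdf(20, 0.25, 8)   # H0 true, so p = 0.25
type_ii = binom_cdf(20, 0.35, 8)      # H0 false, with the given value p = 0.35

print(f"P(Type I error)  = {type_i:.4f}")
print(f"P(Type II error) = {type_ii:.4f}")
```

Only the value of p passed to binom_cdf changes between the two lines, which is why a Type II error cannot be evaluated until a specific alternative value of p is supplied.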

Example 3

Accidents occurred on a stretch of motorway at an average rate of 6 per month. Many of the
accidents that occurred involved vehicles skidding into the back of other vehicles. By way of a trial,
a new type of road surface that is said to reduce the risk of vehicles skidding is laid on this stretch
of road, and during the first month of operation 4 accidents occurred.
a Test this result to see if it gives evidence that there has been an improvement at the 5% level of
significance.
b Calculate P(Type I error) for this test.
c If the true average rate of accidents occurring with the new type of road surface was 3.5,
calculate the probability of a Type II error.

a You are dealing with a Poisson distribution.
Let λ = the average number of accidents in a month, and
X = the number of accidents in any given month, then the
hypotheses are
H0: λ = 6 (i.e. no change)
H1: λ < 6 (i.e. fewer accidents)
(Part a is a hypothesis test for the mean of a Poisson distribution. ← Section 5.1)
From tables P(X ≤ 4 | λ = 6) = 0.2851.
This is more than 5% so you do not have enough evidence to reject H0.
The average number of accidents per month has not decreased.
(Since it is a one-tailed test the conclusion should be clearly one-tailed.)
b In order to reject H0 you require a value c such that
P(X ≤ c | λ = 6) < 0.05
(You could have specified as close as possible to 5%.)
From the table on page 191, with λ = 6:
P(X ≤ 2) = 0.0620
and P(X ≤ 1) = 0.0174.
So the critical value c is 1, and the critical region for this test is X ≤ 1.
A Type I error occurs when you reject H0 when it is true, and the probability of this happening is
P(X ≤ 1) = 0.0174.
(This is again smaller than the 5% you were aiming for.)
c A Type II error occurs when you do not have sufficient evidence to reject H0 when H1 is true.
If λ = 3.5 then H0 is not true. You do not have sufficient evidence to reject H0 if X ≥ 2 so
P(Type II error | λ = 3.5) = P(X ≥ 2 | λ = 3.5)
= 1 − P(X ≤ 1 | λ = 3.5)
= 1 − 0.1359
= 0.8641
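
Parts b and c of Example 3 can be reproduced without tables. The following Python sketch is an illustration only, with hypothetical helper names poisson_cdf and lower_critical_value, and it evaluates the Poisson probabilities directly from the formula.

```python
from math import exp, factorial


def poisson_cdf(lam, k):
    """Return P(X <= k) for X ~ Po(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))


def lower_critical_value(lam, alpha):
    """Return the largest c with P(X <= c) < alpha, or None if even c = 0 fails."""
    c = None
    k = 0
    while poisson_cdf(lam, k) < alpha:
        c = k
        k += 1
    return c


c = lower_critical_value(6, 0.05)      # H0: lambda = 6, lower-tailed test
type_i = poisson_cdf(6, c)             # P(X <= c | lambda = 6)
type_ii = 1 - poisson_cdf(3.5, c)      # P(X >= c + 1 | lambda = 3.5)

print(f"critical region: X <= {c}")
print(f"P(Type I error) = {type_i:.4f}, P(Type II error) = {type_ii:.4f}")
```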

You can also calculate the probabilities of errors from a two-tailed hypothesis test.

Example 4
A coin is spun 20 times and a head is obtained on 7 occasions.
a Test to see whether or not the coin is biased.
b Calculate the probability of a Type I error for this test.
c Given that the coin is biased and that this bias causes the tail to appear 3 times for each head
that appears, calculate the probability of a Type II error for the test.

a The hypotheses are
H0: p = 0.5    H1: p ≠ 0.5
(This is a test for the proportion of a binomial distribution, and since you are testing to see if
the coin is biased in either direction, a two-tailed test has to be used.)
Let X = the number of heads in 20 spins of the coin.
Assuming H0 is true then X ∼ B(20, 0.5).
For a two-tailed test, at the 5% significance level, you require values c1 and c2 so that
P(X ≤ c1) < 0.025 and P(X ≥ c2) < 0.025
(or P(X ≤ c2 − 1) > 0.975).
(The critical region will be in two parts.)
From tables: P(X ≤ 6) = 0.0577
and P(X ≤ 5) = 0.0207
so the value of c1 = 5.
Also: P(X ≥ 14) = 1 − P(X ≤ 13)
= 1 − 0.9423
= 0.0577
P(X ≥ 15) = 1 − P(X ≤ 14)
= 1 − 0.9793
= 0.0207
so the value of c2 = 15.
(Alternatively P(X ≤ 13) = 0.9423 and P(X ≤ 14) = 0.9793 so c2 − 1 = 14 and c2 = 15.)
Thus the critical region for X is X ≤ 5 or X ≥ 15.
Problem-solving Notice that since p = 0.5 the two tails are symmetrical about the mean of 10 and the
value of c2 could have been inferred from that of c1 in this case.
As 7 falls between 5 and 15 there is insufficient evidence to reject H0.
The coin is not biased.
b A Type I error occurs when you reject H0 but H0 is true, and this occurs when X ≤ 5 or X ≥ 15.
(In this case there are two probabilities to be found and added.)
P(Type I error) = P(X ≤ 5 | p = 0.5) + P(X ≥ 15 | p = 0.5)
= 0.0207 + 0.0207
= 0.0414
c A Type II error occurs when you do not have sufficient evidence to reject H0 when H1 is true.
You do not have evidence to reject H0 if X ≥ 6 and X ≤ 14, i.e. 6 ≤ X ≤ 14.
(Remember that X = the number of heads and p = the probability of getting a head.
In this case p = 0.25.)
P(Type II error) = P(6 ≤ X ≤ 14 | p = 0.25)
= P(X ≤ 14 | p = 0.25) − P(X ≤ 5 | p = 0.25)
= 1.000 − 0.6172
= 0.3828
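
For a two-tailed test the same idea is applied to each tail separately. The Python sketch below, which is an illustration and not part of the original text, finds both critical values for Example 4 and then the two error probabilities; binom_cdf is a hypothetical helper name.

```python
from math import comb


def binom_cdf(n, p, k):
    """Return P(X <= k) for X ~ B(n, p); an empty sum (k < 0) gives 0."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


n, p0, tail = 20, 0.5, 0.025   # 5% two-tailed test, 2.5% in each tail

# Largest c1 with P(X <= c1) < 0.025 and smallest c2 with P(X >= c2) < 0.025.
c1 = max(k for k in range(n + 1) if binom_cdf(n, p0, k) < tail)
c2 = min(k for k in range(n + 1) if 1 - binom_cdf(n, p0, k - 1) < tail)

# Type I error: both tails under H0. Type II error: the middle region under p = 0.25.
type_i = binom_cdf(n, p0, c1) + (1 - binom_cdf(n, p0, c2 - 1))
p_true = 0.25
type_ii = binom_cdf(n, p_true, c2 - 1) - binom_cdf(n, p_true, c1)

print(f"critical region: X <= {c1} or X >= {c2}")
print(f"P(Type I error) = {type_i:.4f}, P(Type II error) = {type_ii:.4f}")
```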


Example 5
Jane knows from experience that 10% of the emails she receives are spam. After her email service
upgraded the spam filters, she recorded the number of emails sent up to and including the first spam
email. She wants to test, at the 5% significance level, whether this upgrade improved the spam filter.
a Find the critical region for her test.
b Calculate the probability of a Type I error for this test.
c Given that after the upgrade the probability of an email she receives being spam is now 1 in 100,
calculate the probability of a Type II error for the test.

a Let X = number of emails sent up to and including first spam email
X ∼ Geo(p)
H0: p = 0.1    H1: p < 0.1
(This is a hypothesis test for the parameter of a geometric distribution. Start by defining your
test statistic and stating your null and alternative hypotheses. ← Section 5.4)
Assume H0, so that X ∼ Geo(0.1).
For a one-tailed test you need to find a value c so that P(X ≥ c) < 0.05.
(For a geometric distribution X ∼ Geo(p), P(X ≥ n) = (1 − p)^(n − 1) ← Section 3.1)
Since P(X ≥ c) = (1 − 0.1)^(c − 1) = 0.9^(c − 1), you need an integer c such that
0.9^(c − 1) < 0.05
log 0.9^(c − 1) < log 0.05    (Take logs of both sides.)
(c − 1) log 0.9 < log 0.05
c log 0.9 < log 0.05 + log 0.9
c > (log 0.05 + log 0.9) / log 0.9
(Watch out log 0.9 is negative, so change the direction of the inequality when you divide.)
c > 29.4 (3 s.f.)
So the critical value is c = 30, and the critical region is X ≥ 30.
(Choose the next integer value above 29.4.)
b A Type I error occurs when you reject H0 but H0 is true, and this happens when X ≥ 30.
P(Type I error) = P(X ≥ 30 | p = 0.1)
= (1 − 0.1)^(30 − 1)
= 0.9^29
= 0.0471 (4 d.p.)
c A Type II error occurs when you don’t have enough evidence to reject H0 and H1 is true.
You do not have enough evidence to reject H0 when X ≤ 29.
P(Type II error) = P(X ≤ 29 | p = 0.01)
= 1 − (1 − 0.01)^29
(P(X ≤ n) = 1 − (1 − p)^n ← Section 3.1)
= 1 − 0.99^29
= 0.2528 (4 d.p.)
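
The logarithm step in Example 5 can be cross-checked by searching for the critical value directly. This Python sketch is illustrative only; it relies on the tail formula P(X ≥ c) = (1 − p)^(c − 1) quoted in the example, and the variable names are assumptions.

```python
p0, alpha = 0.1, 0.05   # H0: p = 0.1, one-tailed test at the 5% level

# Smallest integer c with P(X >= c) = (1 - p0)**(c - 1) < alpha.
c = 1
while (1 - p0) ** (c - 1) >= alpha:
    c += 1

type_i = (1 - p0) ** (c - 1)          # P(X >= c | p = 0.1)

p_true = 0.01
type_ii = 1 - (1 - p_true) ** (c - 1)  # P(X <= c - 1 | p = 0.01)

print(f"critical region: X >= {c}")
print(f"P(Type I error) = {type_i:.4f}, P(Type II error) = {type_ii:.4f}")
```

A loop is used instead of rounding the logarithmic bound so that the strict inequality is respected even if the bound happened to be a whole number.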


Exercise 8A
1 The random variable X is binomially distributed. A sample of 10 is taken, and it is desired to
test H0 : p = 0.25 against H1 : p > 0.25, using a 5% level of significance.
a Calculate the critical region for this test.
b State the probability of a Type I error for this test and, given that the true value of p was
later found to be 0.30, calculate the probability of a Type II error.

2 The random variable X is binomially distributed. A sample of 20 is taken, and it is desired to


test H0 : p = 0.30 against H1 : p < 0.30, using a 1% level of significance.
a Calculate the critical region for this test.
b State the probability of a Type I error for this test and, given that the true probability was
later found to be 0.25, calculate the probability of a Type II error.

3 The random variable X is binomially distributed. A sample of 10 is taken, and it is desired to


test H0 : p = 0.45 against H1 : p ≠ 0.45, using a 5% level of significance.
a Calculate the critical region for this test.
b State the probability of a Type I error for this test and, given that the true probability was
later found to be 0.40, calculate the probability of a Type II error.

4 The random variable X has a Poisson distribution. A sample is taken, and it is desired to test
H0 : λ = 6 against H1 : λ > 6, using a 5% level of significance.
a Find the critical region for this test.
b Calculate the probability of a Type I error and, given that the true value of λ was later found
to be 7, calculate the probability of a Type II error.

5 The random variable X has a Poisson distribution. A sample is taken, and it is desired to test
H0 : λ = 4.5 against H1 : λ < 4.5, using a 5% level of significance.
a Find the critical region for this test.
b Calculate the probability of a Type I error and, given that the true value of λ was later found
to be 3.5, calculate the probability of a Type II error.

6 The random variable X has a Poisson distribution. A sample is taken, and it is desired to test
H0 : λ = 9 against H1 : λ ≠ 9, using a 5% level of significance.
a Find the critical region for this test.
b Calculate the probability of a Type I error and, given that the true value of λ was later found
to be 8, calculate the probability of a Type II error.

7 The random variable X is geometrically distributed, and it is desired to test H0 : p = 0.2 against
H1 : p < 0.2, using a 5% level of significance.
a Calculate the critical region for this test.
b State the probability of a Type I error for this test and, given that the true probability was
found to be p = 0.05, calculate the probability of a Type II error.


8 The random variable X is geometrically distributed, and it is desired to test H0 : p = 0.02 against
H1 : p < 0.02, using a 1% level of significance.
a Calculate the critical region for this test.
b State the probability of a Type I error for this test and, given that the true probability was
found to be p = 0.01, calculate the probability of a Type II error.

9 The random variable X is geometrically distributed, and it is desired to test H0 : p = 0.01 against
H1 : p ≠ 0.01, using a 5% level of significance.
a Calculate the critical region for this test.
b State the probability of a Type I error for this test and, given that the true probability was
found to be p = 0.1, calculate the probability of a Type II error.

E/P 10 a Define:
i a Type I error (1 mark)
ii a Type II error. (1 mark)
The discrete random variable X ∼ Geo( p). You wish to test H0 : p = 0.004 against H1 : p ≠ 0.004,
using a 10% significance level. The probability in each tail should be as close to 0.05 as possible.
b Find the critical region for this test. (7 marks)
c State the probability of a Type I error occurring for this test. (1 mark)

E/P 11 Michael has bought a dice with 20 sides, and his friend David suspects that it is landing on 17
more often than it is landing on the other values. They both decide to test this in two different
ways, using a 5% significance level. Michael throws the dice 40 times and records the number of
times the dice lands on the 17.
a Find the critical region for Michael’s test. (4 marks)
b State the probability of a Type I error occurring for Michael’s test. (1 mark)
David decides to throw the dice until the first time it lands on 17.
c Find the critical region for David’s test. (4 marks)
d State the probability of a Type I error occurring. (1 mark)
The actual probability of the dice landing on 17 is 0.0588.
e Calculate the probability of a Type II error occurring in David’s test. (2 marks)
f Calculate the probability of a Type II error occurring in Michael’s test. (2 marks)

8.2 Finding Type I and Type II errors using the normal distribution
You need to be able to find Type I and Type II errors using the normal distribution.

Links If you are carrying out a hypothesis test for the mean of a normal distribution, you will be
given the value for the population standard deviation, σ, or variance, σ². The variance of the sample
mean for a sample of size n will be σ²/n.
← Statistics and Mechanics Year 2, Section 3.7


In the examples in the previous section P(Type I error), which gives the actual significance level, was
not equal to the target significance level. This was due to the discrete nature of the distributions used.
■ When a continuous distribution such as the normal distribution is used then P(Type I error)
is equal to the significance level of the test.

Example 6
Bags of sugar having a nominal weight of 1 kg are filled by a machine. From past experience it is
known that the weight, X kg, of sugar in the bags is normally distributed with a standard deviation
of 0.04 kg. At the beginning of each week a random sample of 10 bags is taken in order to see if
the machine needs to be reset. A test is then done at the 5% significance level with
H0 : μ = 1.00 kg and H1 : μ ≠ 1.00 kg. Find:
a the critical region for this test
b P(Type I error) for this test.
Assuming that the mean weight has in fact changed to 1.02 kg,
c find P(Type II error) for this test.
Online Explore probabilities of Type I and Type II errors in a normal distribution using GeoGebra.
a The distribution of X̄ is modelled by N(1.0, 0.04²/10).
(Since this is a two-tailed test you allow 2.5% at each tail.)
From the tables the critical region for Z is Z > 1.96 or Z < −1.96.
(The critical region is found by rearranging |x̄ − µ| / (σ/√n) > 1.96 for µ = 1.0, σ = 0.04 and n = 10.)
The critical values for X̄ are given by
x̄ = 1 ± 1.96 × √(0.04²/10)
= 0.9752 and 1.0248
(Notice once again that the critical region is in two parts.)
The critical region is X̄ < 0.9752 or X̄ > 1.0248.
[Figure: normal curve for X̄ centred at 1.0, with the Type I error tails below 0.9752 and above 1.0248.]
b P(Type I error) for this test will be the same as the significance level = 0.05.
c The area required for a Type II error lies between X̄ = 0.9752 and X̄ = 1.0248
given that X̄ is modelled by N(1.02, 0.04²/10).
[Figure: normal curve for X̄ centred at 1.02, with the Type II error region between 0.9752 and 1.0248.]
The probability of a Type II error is given by
P(0.9752 < X̄ < 1.0248) = 0.6476
(Use the normal cumulative distribution function on your calculator, with σ = √(0.04²/10) = 0.01264…)
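
The calculator steps in Example 6 can be mirrored with Python's standard library. This is a minimal sketch and not part of the original text; it uses statistics.NormalDist, and because it works with the unrounded critical values the Type II probability may differ from the quoted 0.6476 in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 1.00, 0.04, 10, 0.05
se = sigma / sqrt(n)                      # standard deviation of the sample mean

# Two-tailed test: 2.5% in each tail, so z is about 1.96.
z = NormalDist().inv_cdf(1 - alpha / 2)
lower, upper = mu0 - z * se, mu0 + z * se

# Type II error: sample mean falls between the critical values when mu = 1.02.
alternative = NormalDist(mu=1.02, sigma=se)
type_ii = alternative.cdf(upper) - alternative.cdf(lower)

print(f"critical region: mean < {lower:.4f} or mean > {upper:.4f}")
print(f"P(Type I error) = {alpha}, P(Type II error) = {type_ii:.4f}")
```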
10


When carrying out hypothesis tests, you want to keep P(Type I error) and P(Type II error) as low as
possible. The following example illustrates the relationship between Type I and Type II errors.

Example 7
The weight of jam in a jar, measured in grams, is distributed normally with a mean of 150 g and
a standard deviation of 6 g. The production process occasionally leads to a change in the mean
weight of jam per jar but the standard deviation remains unaltered.
The manager monitors the production process and for every new batch takes a random sample of 25
jars and weighs their contents to see if there has been any reduction in the mean weight of jam per jar.
Find the critical values for the test statistic X̄, the mean weight of jam in a sample of 25 jars, using:
a a 5% level of significance
b a 1% level of significance.
Given that the true value of μ for the new batch is in fact 147,
c find the probability of a Type II error for each of the above critical regions.

a H0: µ = 150
H1: µ < 150 (i.e. a one-tailed test)
(State the hypotheses to define the test. You are looking for a ‘reduction’ in the mean so a
one-tailed test is needed.)
X̄ ∼ N(150, 6²/25), n = 25 and σ = 6
The 5% critical region for Z is Z < −1.6449 so reject H0 if
(X̄ − 150) / (6/√25) < −1.6449
(The critical value for Z is found from tables.)
That is, the critical region for X̄ is
X̄ < (6/√25) × (−1.6449) + 150
so X̄ < 148.02612.
b The 1% critical region for Z is Z < −2.3263 so reject H0 if
(X̄ − 150) / (6/√25) < −2.3263
That is, the critical region for X̄ is
X̄ < (6/√25) × (−2.3263) + 150
so X̄ < 147.20844.
(Note that P(Type I error) in each case is the same as the significance level for the test.)
c 5% test
P(Type II error) = P(X̄ > 148.026… | µ = 147)
= 0.1963 (4 d.p.)
(Use your calculator with σ = √(6²/25) = 1.2.)
1% test
P(Type II error) = P(X̄ > 147.2084 | µ = 147)
= 0.4311 (4 d.p.)


Notice how in this example if we try to reduce P(Type I error) from 5% to 1% then P(Type II error)
increases from 0.1963 to 0.4311. A more detailed study of the interplay between these two
probabilities follows later in this chapter. However, you should be aware of this phenomenon and
appreciate one of the reasons why we do not always use a significance level that is very small.
The value of 5% is a commonly used level and, in a situation where a particular significance level is
not given, this value is recommended.
This does not mean that other significance levels are never used. When, for example, the results of
the research are highly important and making a Type I error could be very serious, a 1% significance
level might be used. In other cases a significance level of 10% might be used. An alternative method
of reducing the probability of a Type II error is to increase the sample size but this can increase the
cost or duration of a survey or experiment.
The relationship between the probabilities of Type I and Type II errors can be illustrated by imagining
pushing down on one side of a balloon.

[Figure: a balloon labelled P(Type I error) on one side and P(Type II error) on the other; pushing one side down makes the other rise.]

The only way to push down on both sides at once (and reduce the overall thickness) is to allow the
air to move sideways. Using a larger balloon would allow you to reduce the overall thickness (this is
equivalent to increasing the size of the sample n).
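
The trade-off described above can be tabulated for the jam-jar test of Example 7. The Python sketch below is an illustration under stated assumptions: the function name type_ii_lower_tail is hypothetical, and the sample sizes 50 and 100 are not in the original text but are included to show the effect of a larger sample.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_lower_tail(mu0, mu_true, sigma, n, alpha):
    """P(Type II error) for a lower-tailed test of H0: mu = mu0 on the sample mean."""
    se = sigma / sqrt(n)
    critical = mu0 + NormalDist().inv_cdf(alpha) * se   # reject H0 below this value
    return 1 - NormalDist(mu_true, se).cdf(critical)    # accept H0 although mu = mu_true


# Example 7 uses n = 25; n = 50 and n = 100 are hypothetical extra cases.
for n in (25, 50, 100):
    for alpha in (0.05, 0.01):
        beta = type_ii_lower_tail(150, 147, 6, n, alpha)
        print(f"n = {n:3d}, P(Type I) = {alpha:.2f}, P(Type II) = {beta:.4f}")
```

For a fixed sample size, lowering the significance level raises P(Type II error); increasing n lowers P(Type II error) at every significance level.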

Exercise 8B

1 The random variable X ∼ N(μ, 3²). A random sample of 20 observations of X is taken, and the
sample mean x̄ is taken to be the test statistic. It is desired to test H0 : μ = 50 against
H1 : μ > 50, using a 1% level of significance.
a Find the critical region for this test.
b State the probability of a Type I error for this test.
Given that the true mean was later found to be 53,
c find the probability of a Type II error.

2 The random variable X ∼ N(μ, 2²). A random sample of 16 observations of X is taken, and the
sample mean x̄ is taken to be the test statistic. It is desired to test H0 : μ = 30 against
H1 : μ < 30, using a 5% level of significance.
a Find the critical region for this test.
b State the probability of a Type I error for this test.
Given that the true mean was later found to be 28.5,
c find the probability of a Type II error.


3 The random variable X ∼ N(μ, 4²). A random sample of 25 observations of X is taken, and the
sample mean x̄ is taken to be the test statistic. It is desired to test H0 : μ = 40 against
H1 : μ ≠ 40, using a 1% level of significance.
a Find the critical region for this test.
b State the probability of a Type I error.
Given that the true mean was later found to be 42,
c find the probability of a Type II error.

E 4 A manufacturer claims that the average outside diameter of a particular washer produced by his
factory is 15 mm. The diameter is assumed to be normally distributed with a standard deviation
of 1 mm. The manufacturer decides to take a random sample of 25 washers from each day’s
production in order to monitor any changes in the mean diameter.
a Using a significance level of 5%, find the critical region to be used for this test. (4 marks)
Given that the average diameter had in fact increased to 15.6 mm,
b find the probability that the day’s production would be wrongly accepted. (2 marks)

E/P 5 The number of patients that a medic can inoculate with a vaccine in one day can be modelled by
a normal distribution with mean 40 and standard deviation 8. The manufacturer of the vaccine
claims that a new method of inoculation will speed up the rate at which the medic works.
A random sample of 30 medics tried out the new method of inoculation and the average number
of patients they dealt with per day, X̄, was recorded.
a Using a 5% significance level, find the critical value of X̄. (4 marks)
The average number of patients dealt with per day using the new method of inoculation was in
fact 42.
b Find the probability of making a Type II error. (2 marks)
The manufacturer of the vaccine would like to lessen the probability of a Type II error being
made and recommends that the significance level be changed.
c State, giving a reason, what recommendation you would make. (1 mark)

8.3 Calculate the size and power of a test


You need to be able to calculate the size and power of a test.
You have already seen that a Type I error occurs when the null hypothesis is rejected when it is in fact
true. The probability of a Type I error will be written as α and is often known as the size of the test.
■ The size of a test is the probability of rejecting the null hypothesis when it is in fact true and
this is equal to the probability of a Type I error.
The size of a test, as you have seen, is the actual significance level of the test and this is usually
chosen before the test is carried out.
When conducting a hypothesis test you should also be interested in the probability of rejecting the
null hypothesis when it is in fact untrue, as this is clearly a desirable feature of a test. The probability
of rejecting the null hypothesis H0 when it is untrue, is known as the power of the test.


■ The power of a test is the probability of rejecting the null hypothesis when it is not true.
Power = 1 − P(Type II error)
= P(being in the critical region when H0 is false)
The greater the power of a test, the greater the probability of rejecting H0 when H0 is false. It follows
that the higher the power, the better the test.
The table of error types in Section 8.1 can now be rewritten to show the probabilities for the different situations.

Conclusion of test | Truth: H0 is true      | Truth: H0 is false
Accept H0          | OK                     | P(Type II error)
Reject H0          | Size = P(Type I error) | Power = 1 − P(Type II error)

The size and power both relate to rejecting H0.


The size relates to when H0 is true and a Type I error has been made.
The power relates to when H0 is false and a correct decision was made.
If the power is greater than 0.5, the probability of coming to the correct conclusion (rejecting H0 when
H0 is false) is greater than the probability of coming to the wrong conclusion (accepting H0 when H0 is
false).
At the end of Section 8.2 you were told that, generally, if you increase the sample size the probability of a Type II
error decreases. It follows that the larger the sample size, the greater the power of the test. Increasing
the sample size is preferable to increasing the significance level as a way of increasing the power of a
test.

Example 8
The random variable X has a binomial distribution. A random sample of size 25 was taken to test
H0 : p = 0.30 against H1 : p < 0.30 using a 10% level of significance.
a Find the critical region for this test.
b Find the size of this test.
Given that p = 0.20,
c calculate the power of this test.

a X ∼ B(25, p)
H0: p = 0.30    H1: p < 0.30
Assume H0 so that X ∼ B(25, 0.30).
H0 is rejected when X ≤ c where P(X ≤ c) < 0.10.
From tables:    (Use tables of B(25, 0.30).)
P(X ≤ 4) = 0.0905
P(X ≤ 5) = 0.1935
So the critical region is X ≤ 4.

b Size = P(Type I error)
= P(X ≤ 4 | p = 0.30)
= 0.0905
(The size is the actual significance level of the test. Use your calculator to find P(X ≤ 4 | p = 0.30).)
c If p = 0.20 then H0 is false.
Power = P(rejecting H0 | H0 is false, i.e. p = 0.2)
= P(X ≤ 4 | p = 0.20)
= 0.4207
(Watch out Calculate the power directly. There is no need to calculate P(Type II error) first.
Remember to change the value of p in your calculator from 0.30 to 0.20.)
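
Size and power in Example 8 are the same tail probability evaluated at two different values of p. The Python sketch below is illustrative, not part of the original text, and binom_cdf is a hypothetical helper name.

```python
from math import comb


def binom_cdf(n, p, k):
    """Return P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


n, p0, alpha = 25, 0.30, 0.10

# Largest c with P(X <= c) < 0.10 under H0 (assumes at least one such c exists).
c = max(k for k in range(n + 1) if binom_cdf(n, p0, k) < alpha)

size = binom_cdf(n, p0, c)      # P(reject H0 | p = 0.30)
power = binom_cdf(n, 0.20, c)   # P(reject H0 | p = 0.20)

print(f"critical region: X <= {c}, size = {size:.4f}, power = {power:.4f}")
```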

Example 9
Jam is sold in jars. The amount of jam, in grams, in a jar is normally distributed with mean μ and
standard deviation 5. The manufacturer claims that μ is 106 and quality control officers will take
action against the manufacturer if μ < 106. A random sample of 30 jars is examined and a 5% level
of significance is used.
a Find the critical region for the sample mean using this test.
Given that in fact μ = 102,
b find the power of this test.

a H0: µ = 106    H1: µ < 106
(State the hypotheses clearly.)
n = 30 so X̄ ∼ N(106, 5²/30)
(Assuming H0 is true, state the distribution of the statistic.)
Reject H0 when X̄ < c
Critical region for Z is Z < −1.6449
(Use tables to find the critical region for Z.)
So (X̄ − 106) / (5/√30) < −1.6449
i.e. X̄ < 104.498…
b Power = P(X̄ < 104.498… | µ = 102)
(If µ = 102 then X̄ ∼ N(102, 5²/30).)
= 0.9968 (4 d.p.)
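
The power calculation in Example 9 follows the same two steps in code: find the critical value under H0, then evaluate the rejection probability under the true mean. This Python sketch is an illustration using statistics.NormalDist; the result should be close to the value quoted in the example, although the last decimal place can differ slightly depending on how the critical value of Z is rounded.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 106, 5, 30, 0.05
se = sigma / sqrt(n)                                # standard deviation of the sample mean

# Lower-tailed test: reject H0 when the sample mean is below the critical value.
critical = mu0 + NormalDist().inv_cdf(alpha) * se

# Power: probability of landing in the critical region when mu = 102.
power = NormalDist(mu=102, sigma=se).cdf(critical)

print(f"critical region: mean < {critical:.3f}, power = {power:.4f}")
```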

Example 10
A particular mobile-phone provider fails to deliver text messages with probability p.
Brooke wants to investigate whether p > 0.02.
Using H0: p = 0.02 and H1: p > 0.02, Brooke notes the number of text messages she is able to send
successfully up until the first failure. If this value is less than or equal to 5 she rejects H0. If it is
more than 100 she accepts H0. If it is more than 5 but less than or equal to 100 she notes the
number of additional text messages she is able to send successfully up until the next failure.
She rejects H0 if this is less than or equal to 5 and accepts it otherwise.
a Find the size of this test.
b Calculate the power of this test when p = 0.015.

a Let X = number of messages sent up to
and including first failure
Then X ∼ Geo( p) You need to calculate the probability that H0 is
Assume H0 is true, so that X ∼ Geo(0.02). rejected, assuming that it is true. You are given
P(X < 5) = 1 − (1 − 0.02)5 the critical region, so find P(H0 rejected | p = 0.02).
= 0.09607…
P(5 < X < 100) Problem-solving
= P(X < 100) − P(X < 5) You can use Geo(0.02) to model the number
= 1 − (1 − 0.02)100 − 0.09607… of text messages up to and including the first
= 0.77130… failure. A!er the first failure, the number of text
messages up to and including the next failure
P(H0 rejected| p = 0.02)
also has distribution Geo(0.02).
= P(X < 5) + P(5 < X < 100) × P(X < 5)
= 0.09607… + 0.77130… × 0.09607…
= 0.17018…
The size of the test is 0.1702 (4 d.p.).
The power of the test is
b Assume p = 0.015 so that Y ∼ Geo(0.015). P(H0 is rejected | p = 0.015). Repeat your
P(Y < 5) = 1 − (1 − 0.015)5 = 0.07278… calculation using a different assumed value of p.
P(5 < Y < 100)
= P(Y < 100) − P(Y < 5)
= 1 − (1 − 0.015)100 − 0.07278…
= 0.70660…
P(H0 rejected| p = 0.015)
= P(Y < 5) + P(5 < Y < 100) × P(Y < 5)
= 0.12421…
The power of the test when p = 0.015 is This is quite a small value for the power of the
0.1242 (4 d.p.). test. This suggests that the test is not very useful
when p = 0.15.
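The two-stage calculation above can be checked numerically. The following is a minimal Python sketch, assuming the same Geo(p) model as the worked solution; the function names `geo_cdf` and `reject_probability` are illustrative and not part of the original text.

```python
def geo_cdf(k: int, p: float) -> float:
    """P(X <= k) for X ~ Geo(p), counting trials up to and including the first failure."""
    return 1 - (1 - p) ** k


def reject_probability(p: float, low: int = 5, high: int = 100) -> float:
    """P(H0 rejected) for the two-stage rule: reject if X <= low,
    accept if X > high, otherwise observe a second count and reject if it is <= low."""
    first_stage = geo_cdf(low, p)                        # reject straight away
    continue_prob = geo_cdf(high, p) - geo_cdf(low, p)   # low < X <= high
    second_stage = geo_cdf(low, p)                       # second count is again Geo(p)
    return first_stage + continue_prob * second_stage


print(f"size  (p = 0.02)  = {reject_probability(0.02):.4f}")
print(f"power (p = 0.015) = {reject_probability(0.015):.4f}")
```

Because the same function gives the rejection probability for any p, it can also be evaluated at other values of p to see how the power changes.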

Exercise 8C

1 The random variable X ∼ N(μ, 3²). A random sample of 25 observations of X is taken and the
sample mean x̄ is taken as the test statistic. It is desired to test H0 : μ = 20 against H1 : μ > 20
using a 5% significance level.
a Find the critical region for this test.
b Given that μ = 20.8, find the power of this test.

2 The random variable X has a binomial distribution. A sample of 20 is taken from it. It is
desired to test H0 : p = 0.35 against H1 : p > 0.35 using a 5% significance level.
a Calculate the size of this test.
b Given that p = 0.36, calculate the power of this test.

3 The random variable X has a Poisson distribution. A sample is taken and it is desired to test
H0 : λ = 4.5 against H1 : λ < 4.5 using a 5% significance level.
a Find the size of this test.
b Given that λ = 4.1, find the power of this test.


E 4 A manufacturer claims that a particular rivet produced in his factory has a diameter of 2 mm,
and that the diameter is normally distributed with a variance of 0.004 mm².
A random sample of 25 rivets is taken from a day’s production to test whether the mean
diameter had altered, up or down, from the stated figure. A 5% significance level is to be used
for this test.
If the mean diameter had in fact altered to 2.02 mm, calculate the power of this test. (5 marks)

E/P 5 In a binomial experiment consisting of 10 trials the random variable X represents the number
of successes, and p is the probability of a success.
In a test of H0 : p = 0.3 against H1 : p > 0.3, a critical region of x ≥ 7 is used.
Find the power of this test when
a p = 0.4 (3 marks)
b p = 0.8. (3 marks)
c Comment on your results. (1 mark)

E 6 Explain briefly what you understand by


a a Type I error (1 mark)
b the size of a significance test. (1 mark)
A single observation is made on a random variable X, where X ∼ N(μ, 10).
The observation, x, is to be used to test H0 : μ = 20 against H1 : μ > 20. The critical region is
chosen to be x ≥ 25.
c Find the size of the test. (2 marks)

7 The random variable X has a geometric distribution. It is desired to test H0: p = 0.01 against
H1: p > 0.01 using a 5% significance level.
a Find the critical region for this test.
b Given that p = 0.2, calculate the power of this test.

P 8 The random variable X has a geometric distribution. It is desired to test H0: p = 0.01 against
H1: p ≠ 0.01 using a 5% significance level.
a Find the critical region for this test.
b Given that p = 0.02, calculate the power of this test.

E/P 9 The wallpaper produced by a certain manufacturer has defects that occur randomly at a
constant rate of λ per roll. If λ is thought to be greater than 0.8 then action has to be taken.
Using H0: λ = 0.8 and H1: λ > 0.8, a quality control manager takes a sample of 10 rolls and
rejects H0 if there are 12 or more defects. If there are 9 or fewer defects then H0 is accepted.
If there are 10 or 11 defects, a second sample of 10 rolls is taken and H0 is rejected if there are
8 or more defects in this second sample, otherwise it is accepted.
a Find the size of this test. (4 marks)
b Find the power of this test when λ = 1. (3 marks)


E/P 10 A sweet manufacturer makes boxes of jelly beans. The number of jelly beans in each box is
assumed to be normally distributed with standard deviation 5.
A consumer group wants to test the manufacturer’s claim that the mean number of jelly beans
in each box is 80. The group takes repeated samples of 20 boxes and records the mean number
of jelly beans per box in each sample.
The random variable X represents the number of samples the group need to take before they
obtain a sample with a mean less than 79.
If X < 10 the group rejects the company’s claim.
a Find the size of this test. (5 marks)
b Given that the actual mean number of jelly beans in each box is 81, find the power
of this test. (5 marks)

Challenge
A jam factory has an automated system for sealing their jars, and the
expected probability of error when the machines are well calibrated is
8%. The jars are sealed and placed into boxes of 60. To see whether the
machine that sealed all the jars in a specific box needs recalibrating,
a series of tests is performed. The first box is inspected by taking a
sample of 20 jars and performing a test, with 5% significance level,
to see whether the probability of a defective seal is greater than 8%.
If the first box fails the test they conclude that the machine needs
recalibrating, but if it passes the test they move on to the second box,
and perform the same test. Once again, if the box fails the test they
conclude that the machine needs recalibrating, but if it passes the test
they move on to the next box, and so on until a box fails the test.
a What is the maximum number of boxes that can be inspected such
that the probability of a Type I error is smaller than 10%?
The factory decides to conclude that the machine does not need
recalibrating if the first four boxes all pass the test.
b Given that after the second box the machine is decalibrated, increasing
the probability of a defective seal to 20%, find the power of the test,
knowing that only 4 boxes were inspected. You may assume that the
probability of a defective seal for each jar in the first two boxes is 8%.

8.4 The power function


So far you have calculated the probability of a Type II error or the power only when you have been
given a particular value of the population parameter of interest. Population parameters are seldom
known, and if they were known there would be little point in doing the test anyway. Sometimes past
experience can give you some idea of likely values of the parameters but, in general, since you do not
know the value of the parameter, you cannot decide the power of the test concerned. It is, however,
possible in these cases to calculate the power as a function of the relevant parameter (which we shall
generalise as θ). Such a function is known as a power function.


■ The power function of a test is the function of the parameter θ which gives the probability that
the test statistic will fall in the critical region of the test if θ is the true value of the parameter.
A power function enables you to calculate the power of the test for any given value of θ, and thus to
plot a graph of power against θ.
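As a rough illustration of tabulating a power function, the Python sketch below evaluates the power of an upper-tailed test on a normal mean for several values of the parameter. All the numbers used (μ0 = 50, σ = 4, n = 16, a 5% significance level) and the function name `power_upper_tail` are hypothetical choices made for illustration; they do not come from this chapter.

```python
from statistics import NormalDist


def power_upper_tail(mu: float, mu0: float = 50.0, sigma: float = 4.0,
                     n: int = 16, alpha: float = 0.05) -> float:
    """Power of the test H0: mu = mu0 against H1: mu > mu0 using the sample mean.

    All default values are hypothetical and chosen only for illustration.
    """
    standard_error = sigma / n ** 0.5
    # Critical value c: reject H0 when the sample mean exceeds c.
    critical_value = NormalDist(mu0, standard_error).inv_cdf(1 - alpha)
    # Probability of landing in the critical region if the true mean is mu.
    return 1 - NormalDist(mu, standard_error).cdf(critical_value)


for mu in (50, 51, 52, 53):
    print(f"mu = {mu}: power = {power_upper_tail(mu):.4f}")
```

At μ = μ0 the function returns the size of the test (here 0.05), and the power rises towards 1 as μ moves further into the region described by H1.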

Example 11
Past experience has shown that the number of accidents that take place at a road junction has a
Poisson distribution with an average of 3.5 accidents per month. A trading estate is built along
one of the roads leading away from the junction and the local council is anxious that this may have
increased the accident rate. To see if the number of accidents had increased, a test was set up with
the null hypothesis H0 : λ = 3.5 and with the alternative hypothesis being accepted if the number of
accidents X within the first month after the alteration was ≥ 7.
a Find the size of the test.
b Find the power function for the test and sketch the graph of the power function.

a Size of test = P(reject H0 when it is true)
$= P(X \ge 7 \mid X \sim \mathrm{Po}(3.5))$
= 1 − 0.9347 = 0.0653
You can use conditional probability notation to write your assumptions quickly.
b Power function = P(reject H0 when it is false)
$= P(X \ge 7 \mid X \sim \mathrm{Po}(\lambda))$
$= 1 - P(X \le 6 \mid X \sim \mathrm{Po}(\lambda))$
$= 1 - e^{-\lambda}\left(1 + \lambda + \frac{\lambda^2}{2} + \frac{\lambda^3}{6} + \frac{\lambda^4}{24} + \frac{\lambda^5}{120} + \frac{\lambda^6}{720}\right)$
Problem-solving
You do not know the value of λ. Your power function will be given in terms of this unknown parameter. Use the finite sum of the probabilities for X = 0, 1, 2, 3, 4, 5, 6 to find the power function.
This enables values of the power of the test to be calculated for different values of λ.
λ = 4 gives power = 0.1107
λ = 5 gives power = 0.2378
λ = 6 gives power = 0.3937
λ = 7 gives power = 0.5503
λ = 8 gives power = 0.6866
λ = 9 gives power = 0.7932
λ = 10 gives power = 0.8699
Often in an examination a partially completed table will be given.
The graph is as shown below.
[Graph: power (vertical axis, 0 to 1.0) plotted against λ (horizontal axis, 3 to 10)]
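The tabulated powers can be reproduced with a few lines of code. This is a minimal Python sketch using only the standard library; the helper names `poisson_cdf` and `power` are illustrative, and the critical region X ≥ 7 is taken from the size calculation in part a.

```python
from math import exp, factorial


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Po(lam), summed term by term."""
    return exp(-lam) * sum(lam ** r / factorial(r) for r in range(k + 1))


def power(lam: float, critical_value: int = 7) -> float:
    """P(X >= critical_value) for X ~ Po(lam), i.e. the power function of the test."""
    return 1 - poisson_cdf(critical_value - 1, lam)


print(f"size (lambda = 3.5) = {power(3.5):.4f}")
for lam in range(4, 11):
    print(f"lambda = {lam:2d}  power = {power(lam):.4f}")
```

Evaluating `power` on a finer grid of λ values gives the points needed to sketch the curve by hand or with a plotting tool.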


Power functions are particularly useful when comparing two different tests.
■ When comparing two tests of comparable size, you should recommend the test with the
higher power within the likely range of the parameter.

Example 12
A manufacturer of sweets supplies a mixed assortment of chocolates in a jar. He claims that 40%
of the chocolates have a ‘hard centre’, the remainder being ‘soft centred’.
A shopkeeper does not believe the manufacturer’s claim and proposes to test it using the following
hypotheses.
H0 : p = 0.4 H1 : p < 0.4
where p is the proportion of ‘hard centres’ in the jar. Two tests are proposed.
In test A he takes a random sample of 10 chocolates from the jar and rejects H0 if the number of
‘hard centres’ is less than 2.
a Find the size of test A.
b Show that the power function of test A is given by
$(1 - p)^{10} + 10p(1 - p)^9$.
In test B he takes a random sample of 5 chocolates from the jar and if there are no ‘hard centres’
he rejects H0, otherwise he takes a second sample of 5 chocolates and H0 is rejected if there are no
further ‘hard centres’ on this second occasion.
c Find the size of test B.
d Find an expression for the power function of test B.
The powers for test A and test B for various values of p are given in the table.
p 0.1 0.2 0.25 0.3 0.35
Power for test A 0.74 r 0.24 s 0.09
Power for test B 0.83 0.54 0.42 0.31 0.22

e Calculate values for r and s.


f State, giving a reason, which of the two tests the shopkeeper should use.

a Size of test A $= P(X < 2 \mid X \sim B(10, 0.4))$
= 0.0464 (4 d.p.)
b Power of test A $= P(X < 2 \mid X \sim B(10, p))$
= P(X = 0) + P(X = 1)
$= (1 - p)^{10} + 10p(1 - p)^9$
Write out the probabilities in terms of p. This is already in the desired form, so you don’t need to factorise, but you could also write this power function as $(1 - p)^9(1 + 9p)$.
c Size of test B = P(reject H0 | p = 0.4)
= P(X = 0) + (1 − P(X = 0)) × P(X = 0)
$= 0.6^5 + (1 - 0.6^5) \times 0.6^5$
= 0.1495


d Power of test B = P(0 hard centres in first 5)
+ P(0 hard centres in second 5 and > 0 hard centres in first 5)
$= P(X = 0 \mid p) + (1 - P(X = 0 \mid p)) \times P(X = 0 \mid p)$
$= (1 - p)^5 + (1 - (1 - p)^5)(1 - p)^5$
$= (1 - p)^5 (1 + 1 - (1 - p)^5)$
$= (1 - p)^5 (2 - (1 - p)^5)$
$= 2(1 - p)^5 - (1 - p)^{10}$
Problem-solving
Consider the conditions that are necessary for H0 to be rejected: either X = 0 in the first sample, or X > 0 in the first sample and X = 0 in the second sample.
e Test A: p = 0.2 Power $= (1 - 0.2)^{10} + 10(0.2)(1 - 0.2)^9$
= 0.38
so r = 0.38
p = 0.3 Power $= (1 - 0.3)^{10} + 10(0.3)(1 - 0.3)^9$
= 0.15
so s = 0.15
f Power for test B > Power for test A for all the given values of p, so he should use test B.
The reason for the final comment should be based upon the calculations of the power.
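The comparison in part f can be explored by evaluating both power functions side by side. The sketch below is a minimal Python illustration using the expressions derived in parts b and d; the function names `power_a` and `power_b` are illustrative only.

```python
def power_a(p: float) -> float:
    """Test A: reject H0 if fewer than 2 hard centres appear in a sample of 10."""
    return (1 - p) ** 10 + 10 * p * (1 - p) ** 9


def power_b(p: float) -> float:
    """Test B: reject H0 if there are no hard centres in the first 5, or,
    failing that, no hard centres in the second 5."""
    none_in_five = (1 - p) ** 5
    return none_in_five + (1 - none_in_five) * none_in_five


for p in (0.1, 0.2, 0.25, 0.3, 0.35):
    a, b = power_a(p), power_b(p)
    higher = "B" if b > a else "A"
    print(f"p = {p:<4}  test A = {a:.4f}  test B = {b:.4f}  higher power: {higher}")
```

Working to four decimal places rather than two makes it easier to see how large the gap between the two tests is at each value of p.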

Example 13
A local park believes the fox population in the area has decreased. They want to test for the
probability, p, that a fox will be observed on any given day. They count the number of days, X,
that pass until the first observation of a fox. They test H0: p = 0.1 against H1: p < 0.1 and reject H0
if X > 30.
a Find the size of this test.
b Find the power function for the test.

a Size of test $= P(X > 30 \mid X \sim \mathrm{Geo}(0.1))$
$= (1 - 0.1)^{30}$
$= 0.9^{30} = 0.0424$ (4 d.p.)
If X ∼ Geo(p), then $P(X > x) = (1 - p)^x$. ← Section 3.1
b Power function $= P(X > 30 \mid X \sim \mathrm{Geo}(p))$
$= (1 - p)^{30}$
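The power function here is simple enough to evaluate directly. The Python sketch below does so, and cross-checks the tail formula against a term-by-term sum of the geometric probabilities. The function names and the values of p under H1 (0.08, 0.05, 0.02) are illustrative assumptions, not taken from the example.

```python
def power(p: float, threshold: int = 30) -> float:
    """P(X > threshold) for X ~ Geo(p): no fox is seen on the first `threshold` days."""
    return (1 - p) ** threshold


def power_by_summation(p: float, threshold: int = 30) -> float:
    """The same probability computed as 1 - sum of P(X = x) for x = 1..threshold."""
    return 1 - sum(p * (1 - p) ** (x - 1) for x in range(1, threshold + 1))


print(f"size (p = 0.1) = {power(0.1):.4f}")
for p in (0.08, 0.05, 0.02):  # illustrative values under H1, not from the example
    print(f"p = {p}: power = {power(p):.4f}  (check: {power_by_summation(p):.4f})")
```

Since H1 is p < 0.1, the power increases as p decreases: the rarer fox sightings really are, the more likely it is that 30 days pass without one.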

Exercise 8D
E/P 1 A single observation x is taken from a Poisson distribution with parameter λ. This observation is
to be used to test H0: λ = 6.5 against H1: λ < 6.5. The critical region chosen was x ≤ 2.
a Find the size of the test. (4 marks)
b Show that the power function of this test is given by
$e^{-\lambda}\left(1 + \lambda + \frac{1}{2}\lambda^2\right)$ (3 marks)
The table gives the value of the power function to two decimal places.
λ 1 2 3 4 5 6
Power 0.92 s 0.42 0.24 t 0.06


c Calculate values for s and t. (1 mark)


d Draw a graph of the power function. (1 mark)
e Find the values of λ for which the test is more likely than not to come to the correct
conclusion. (1 mark)

E/P 2 In a binomial experiment consisting of 12 trials, X represents the number of successes and p the
probability of a success.
In a test of H0: p = 0.45 against H1: p < 0.45 the null hypothesis is rejected if the number of
successes is 2 or less.
a Find the size of this test. (4 marks)
b Show that the power function for this test is given by
$(1 - p)^{12} + 12p(1 - p)^{11} + 66p^2(1 - p)^{10}$ (3 marks)
c Find the power of this test when p is 0.3. (1 mark)

3 In a binomial experiment consisting of 10 trials, the random variable X represents the number of
successes and p the probability of a success.
In a test of H0: p = 0.4 against H1: p > 0.4, a critical region of x ≥ 8 was used.
Find the power of this test when:
a p = 0.5
b p = 0.8.
c Comment on your results.

E/P 4 A certain gambler always calls heads when a coin is spun. Before he uses a coin he tests it to see
whether or not it is fair and uses the following hypotheses:
H0: p = $\frac{1}{2}$ H1: p < $\frac{1}{2}$

where p is the probability that the coin lands heads on a particular spin. Two tests are proposed.
In test A the coin is spun 10 times and H0 is rejected if the number of heads is 2 or fewer.
a Find the size of test A. (4 marks)
b Explain why the power of test A is given by
$(1 - p)^{10} + 10p(1 - p)^9 + 45p^2(1 - p)^8$ (3 marks)
In test B the coin is first spun 5 times. If no heads result, H0 is immediately rejected. Otherwise
the coin is spun a further 5 times and H0 is rejected if no heads appear on this second occasion.
c Find the size of test B. (4 marks)
d Find an expression for the power of test B in terms of p. (3 marks)
The power for test A and the power for test B are given in the table for various values of p.
p                  0.1      0.2      0.25     0.3      0.35     0.4
Power for test A   0.9298   0.6778            0.3828            0.1673
Power for test B   0.8323   0.5480   0.4183   0.3079   0.2186   0.1495

e Find the power for test A when p is 0.25 and 0.35. (2 marks)
f Giving a reason, advise the gambler about which test he should use. (1 mark)


5 In an experiment the probability of success in each trial is constant, and the random variable X
represents the number of trials needed to get one success. A test of H0: p = 0.15 against
H1: p < 0.15 with a 1% significance level is used.
a Find the size of the test.
b Find the power function.

E/P 6 A cyclist uses new tyres every time he does a time trial. He has found that on one specific route
he has a probability of 0.9 of not getting a flat tyre. After changing tyre brands he believes that
the new tyres are more resistant, and decides to perform a test, with 5% significance level, by
doing 10 trials on the route and seeing how many times he would complete it without a flat tyre.
a Find the size of the test. (4 marks)
b Find the power function of the test. (2 marks)
c Find the power function for the test if, instead of 10 trials, he had done 12 trials. (5 marks)
d Given that the probability of completing the trial without a flat tyre with the new brand is
0.95, calculate which number of trials gives a more accurate test result. (3 marks)

Mixed exercise 8

E 1 The random variable X is binomially distributed. A sample of 15 observations is taken and it is
desired to test H0: p = 0.35 against H1: p > 0.35 using a 5% significance level.
a Find the critical region for this test. (4 marks)
b State the probability of making a Type I error for this test. (2 marks)
The true value of p was found later to be 0.5.
c Calculate the power of this test. (2 marks)

E 2 The random variable X has a Poisson distribution. A sample is taken and it is desired to test
H0: λ = 3.5 against H1: λ < 3.5 using a 5% significance level.
a Find the critical region for this test. (4 marks)
b State the probability of committing a Type I error for this test. (2 marks)
Given that the true value of λ is 3.0,
c find the power of this test. (2 marks)

E 3 The random variable X ∼ N(μ, 9). A random sample of 18 observations is taken, and it is desired
to test H0: μ = 8 against H1: μ ≠ 8, at the 5% significance level. The test statistic to be used is
$Z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$

a Find the critical region for this test. (4 marks)


b State the probability of a Type I error for this test. (2 marks)
Given that μ was later found to be 7,
c find the probability of making a Type II error. (2 marks)
d State the power of this test. (1 mark)


E/P 4 A bird observatory wishes to test whether the migration rate of geese has changed from that of
10 per day. First they take note of how many geese are observed flying in a migratory pattern
on a specific day. If the number of geese migrating is greater than or equal to 4 and less than or
equal to 17, then they conclude that the rate has not changed. If they observe 3 or fewer geese,
then on the following day they conduct further observations, and if they observe 2 or fewer
geese they conclude that the rate has decreased, otherwise they conclude that it hasn’t changed.
If on the first day they observe 18 or more geese migrating, then on the following day they also
conduct further observations and if they observe 19 or more geese migrating they conclude that
the rate has increased, otherwise they conclude that it has not changed.
a Find the size of the test. (4 marks)
Given that the migration rate of the geese actually dropped to 5 per day,
b find the power of the test. (6 marks)

E 5 A single observation, x, is taken from a Poisson distribution with parameter λ. The observation
is used to test H0: λ = 4.5 against H1: λ > 4.5. The critical region chosen for this test was x ≥ 8.
a Find the size of this test. (4 marks)
b The table gives the power of the test for different values of λ.
λ 1 2 3 4 5 6 7 8 9 10
Power 0 0.0011 0.0119 r 0.1334 s 0.4013 0.5470 t 0.7798

i Find values for r, s and t. (2 marks)


ii Using graph paper, plot the power function against λ. (2 marks)

E 6 In a binomial experiment consisting of 15 trials, X represents the number of successes and p the
probability of success.
In a test of H0: p = 0.45 against H1: p < 0.45 the critical region for the test was X ≤ 3.
a Find the size of the test. (4 marks)
b Use the binomial cumulative distribution function to complete the table given below.
(3 marks)
p 0.1 0.2 0.3 0.4 0.5
Power 0.944 s 0.2969 t 0.0176

c Draw the graph of the power function for this test. (1 mark)

E 7 A company buys rope from Bindings Ltd and it is known that the number of faults per 100 m
of their rope follows a Poisson distribution with mean 2. The company is offered 100 m of rope
by Tieup, a newly established rope manufacturer. The company is concerned that the rope from
Tieup might be of poor quality.
a Write down the null and alternative hypotheses appropriate for testing that rope from
Tieup is in fact as reliable as that from Bindings Ltd. (1 mark)
b Derive a critical region to test your null hypothesis with a size of approximately 0.05.
(4 marks)
c Calculate the power of this test if rope from Tieup contains an average of 4 faults
per 100 m. (3 marks)


E/P 8 The number of faulty garments produced per day by machinists in a clothing factory has
a Poisson distribution with mean 2. A new machinist is trained and the number of faulty
garments made in one day by the new machinist is counted.
a Write down the appropriate null and alternative hypotheses involved in testing the theory
that the new machinist is less reliable than the other machinists. (1 mark)
b Derive a critical region, of size approximately 0.05, to test the null hypothesis. (4 marks)
c Calculate the power of this test if the new machinist produces an average of 3 faulty
garments per day. (3 marks)
The number of faulty garments produced by the new machinist over three randomly selected
days is counted.
d Derive a critical region, of approximately the same size as in part b, to test the null
hypothesis. (2 marks)
e Calculate the power of this test if the machinist produces an average of 3 faulty garments
per day. (3 marks)
f Comment briefly on the difference between the two tests. (1 mark)

E/P 9 A single observation, x, is to be taken from a Poisson distribution with parameter μ.
This observation is to be used to test H0: μ = 6 against H1: μ < 6. The critical region is chosen
to be x ≤ 2.
a Find the size of the critical region. (1 mark)
b Show that the power function for this test is given by
$\frac{1}{2}e^{-\mu}(2 + 2\mu + \mu^2)$ (4 marks)
The table gives the values of the power function to 2 decimal places.
µ 1.0 1.5 2.0 4.0 5.0 6.0 7.0
Power 0.92 0.81 s 0.24 t 0.06 0.03

c Calculate the values of s and t. (3 marks)


d Draw a graph of the power function. (2 marks)
e Estimate the range of values of μ for which the power of this test is greater than 0.8. (3 marks)

E/P 10 A proportion p of the items produced by a laboratory are defective. A technician selects a
random sample of 10 items from each batch produced to check whether or not there is evidence
that p > 0.10. The criterion that the technician uses for rejecting the hypothesis that p is 0.10 is
that there are more than 4 defective items in the sample.
a Find the size of the test. (2 marks)
The table gives some values, to 2 decimal places, of the power function of this test.
p 0.15 0.20 0.25 0.30 0.35 0.40
Power 0.01 0.03 u 0.15 0.25 0.37

b Find the value of u. (2 marks)


A supervisor checks the production by taking a random sample of 5 items from each batch
produced. The hypothesis that p = 0.10 is rejected if more than 2 defective items are found in
the sample.


c Find P(Type I error) using the supervisor’s test. (2 marks)


The table gives some values, to 2 decimal places, of the power function for the test in part c.
p 0.15 0.20 0.25 0.30 0.35 0.40
Power 0.03 0.06 0.10 0.16 v 0.32

d Find the value of v. (2 marks)


e Using the same axes, on graph paper draw the graphs of the power functions of these
two tests. (4 marks)
f i State the value of p where the graphs cross.
ii Explain the significance of p being greater than this value. (2 marks)
g Suggest two advantages of using the test with sample size 5. (2 marks)

E/P 11 Accidents on a stretch of motorway occur at an average rate of λ per week. A road safety
officer takes a random sample of 10 weeks to test whether or not there is evidence that λ > 0.3.
The criterion that the officer uses for rejecting the hypothesis that λ = 0.3 is that there are more
than 5 accidents in the sample.
a Find the size of the test. (2 marks)
The table gives some values, to 2 decimal places, of the power function of this test.
λ 0.4 0.5 0.6 0.7 0.8 0.9 1.0
Power 0.21 a 0.55 0.70 0.81 0.88 0.93

b Find the value of a. (2 marks)


The road safety manager would like to design a test of whether or not λ > 0.3, using a larger
sample. The manager chooses a random sample of 15 weeks and requires the probability of a
Type I error to be less than 5%.
c Find the criterion to reject the hypothesis that λ = 0.3 which makes the test as powerful
as possible. (2 marks)
d Hence state the size of this second test. (1 mark)
The table gives some values, to 2 decimal places, of the power function for the test in part c.
λ 0.4 0.5 0.6 0.7 0.8 0.9 1.0
Power 0.15 0.34 0.54 0.72 0.85 b 0.96

e Find the value of b. (2 marks)


f Using the same axes, on graph paper draw the graphs of the power functions of these
two tests. (4 marks)
g i State the value of λ where the graphs cross.
ii Explain the significance of λ being greater than this value. (2 marks)


Challenge
Jane and Emma decide to test a pair of dice from a new board game.
They suspect that at least one of them has probability higher than $\frac{1}{6}$ of
showing the value one.
Jane decides to throw both dice 12 times. If a pair of ones appears 2 or
more times, she concludes that at least one of the dice is biased.
a Find the size of Jane’s test.
b Express the power of Jane’s test in terms of the parameter p, which
represents the probability of obtaining a pair of ones.
Emma decides to throw one dice 6 times. If the value one appears 4 or
more times she concludes that the dice is biased. If it appears fewer
than 4 times, then she throws the other dice 6 times, and concludes that
the second dice is biased if the value one appears 4 or more times.
c Find the size of Emma’s test.
Now assume that one of the dice is fair, and let q be the probability of
obtaining the value one on the other dice.
d Show that the power of Jane’s test is given by the expression

$1 - \left(1 - \frac{q}{6}\right)^{12} - 2q\left(1 - \frac{q}{6}\right)^{11}$
e Show that the power of Emma’s test is given by the expression
$0.0087 + 14.8695q^4 - 23.7912q^5 + 9.913q^6$
Below is a graph of the power function for Emma’s test.

[Graph: power (vertical axis, 0 to 0.2) plotted against q (horizontal axis, 0 to 0.45) for Emma’s test]

f By using a table of values, draw the graph of the power function for
Jane’s test.
g Given that the parameter q lies between 0.1 and 0.4, explain, giving
your reasons, which test you would recommend.


Summary of key points


1 A Type I error is when you reject H0, but H0 is in fact true. The probability of a Type I error is
the same as the actual significance level of the hypothesis test.

2 A Type II error is when you accept H0, but H0 is in fact false.

3 When a continuous distribution such as the normal distribution is used then P(Type I error) is
equal to the significance level of the test.

4 The size of a test is the probability of rejecting the null hypothesis when it is in fact true and
this is equal to the probability of a Type I error.

5 The power of a test is the probability of rejecting the null hypothesis when it is not true.
Power = 1 − P(Type II error) = P(being in the critical region when H0 is false)

6 The power function of a test is the function of the parameter θ which gives the probability
that the test statistic will fall in the critical region of the test if θ is the true value of the
parameter.

7 When comparing two tests of comparable size, you should recommend the test with the higher
power within the likely range of the parameter.
