Wilcoxon_rank-sum_test_LESSON
Couple A B C D E F G H I
Use these data to test the hypothesis that most men are older than their wives using:
(a) a sign test
(b) a Wilcoxon signed-rank test
Carry out your tests at the 5 % significance level.
(c) What assumptions, if any, are required for each of the tests?
(d) Explain how it is possible for the two tests to give different results.
(b) The position of each letter is given a value, so that if a letter appears in the first
position, it has a value of 1. If it appears in the second position, it has a value of 2,
etc. For example, the total for the F’s in the arrangement FSSFSFSS is
1 + 4 + 6 = 11.
Calculate the probability that the total of the three F’s will be:
(i) 6
(ii) 8
(iii) less than or equal to 9
Give your answers as fractions and percentages to 4 s.f.
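The totals can be checked by listing every possible set of positions for the three F’s. This is a minimal sketch, assuming the arrangement has eight places (three F’s and five S’s, as in FSSFSFSS) and that every arrangement is equally likely; the names `totals` and `probability` are illustrative only.

```python
from fractions import Fraction
from itertools import combinations

# Assumption: 8 places, 3 of them taken by F's, every arrangement equally likely.
positions = range(1, 9)
totals = [sum(places) for places in combinations(positions, 3)]

def probability(condition):
    """Fraction of the equally likely arrangements whose F-total meets the condition."""
    favourable = sum(1 for total in totals if condition(total))
    return Fraction(favourable, len(totals))

p_six = probability(lambda total: total == 6)
p_eight = probability(lambda total: total == 8)
p_at_most_nine = probability(lambda total: total <= 9)

for label, p in [("6", p_six), ("8", p_eight), ("<= 9", p_at_most_nine)]:
    # Print each answer as a fraction and as a percentage to 4 significant figures.
    print(label, p, f"{float(p) * 100:.4g} %")
```

There are 56 arrangements in total, so each probability is a count divided by 56.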
Note
What happens when the data is not paired, for example when there are more values in one sample than the
other? We can still compare the medians of the samples using the Wilcoxon rank-sum test.
The Wilcoxon rank-sum test tests whether two samples come from populations with identical
distributions. It does not require any assumptions about the two populations and, importantly,
their medians do not even need to be known.
www.mathspanda.com
E.g. 1 Over eight weeks, Jack visits his local supermarket on a Friday or Saturday and times how
long, to the nearest minute, it takes to do his shopping. The data is below.
Friday 38 56 60
Saturday 74 58 61 50 64
Rank 1 2 3 4 5 6 7 8
Day
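The Rank and Day rows can be filled in by merging the two samples and sorting them. A minimal sketch using Jack’s times from the table; the variable names are illustrative, and it assumes there are no tied times (true for this data).

```python
friday = [38, 56, 60]
saturday = [74, 58, 61, 50, 64]

# Merge the samples, remembering which day each time came from, then sort by time.
merged = sorted(
    [(time, "Friday") for time in friday] + [(time, "Saturday") for time in saturday]
)

# The smallest time gets rank 1, the next rank 2, and so on.
ranked = [(rank, day, time) for rank, (time, day) in enumerate(merged, start=1)]
for rank, day, time in ranked:
    print(rank, day, time)

# Sum of the ranks belonging to the smaller sample (Friday).
friday_rank_sum = sum(rank for rank, day, _ in ranked if day == "Friday")
print("Friday rank sum:", friday_rank_sum)
```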
It would be a pain to have to do all the arrangement calculations each time for a Wilcoxon rank-sum
test, so it is no surprise to hear there are tables of critical values (see p206 of textbook).
The tables contain the largest value which would lead to the rejection of the null hypothesis.
From tables, the critical value for two samples whose sizes are 3 and 5 is 7, i.e. the value found in
question 2(c) of the starter.
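For small samples the tabulated values can be reproduced by doing the arrangement calculations by computer. This is a sketch only, under two assumptions that the lesson does not state: every set of ranks for the smaller sample is equally likely under H0, and the table entry is the largest total w with P(Rm ≤ w) no greater than the one-tailed significance level. The function name `critical_value` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def critical_value(m, n, alpha):
    """Largest w with P(Rm <= w) <= alpha, or None if no total is that unlikely.

    alpha is the one-tailed level, passed as a string such as "0.05"
    so that it is converted to an exact fraction.
    """
    alpha = Fraction(alpha)
    # Count how many of the equally likely rank sets give each total.
    counts = Counter(sum(ranks) for ranks in combinations(range(1, m + n + 1), m))
    arrangements = sum(counts.values())
    cumulative = 0
    largest = None
    for w in sorted(counts):
        cumulative += counts[w]
        if Fraction(cumulative, arrangements) > alpha:
            break
        largest = w
    return largest

# Sizes 3 and 5, assuming the quoted value of 7 is for a one-tailed 5 % test.
print(critical_value(3, 5, "0.05"))
```

A two-tailed test at a given level would use half that level in each tail, so a two-tailed 5 % test corresponds to `alpha="0.025"` here.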
Working: (a) Rm = 2 + 3 = 5
The original ranks are −, r1, r2, −, −, −
The reversed ranks would be −, −, −, r2, r1, −
r′1 = 5 and r′2 = 4 ⇒ R′m = 5 + 4 = 9
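The pattern in this working holds in general. As a sketch of the reasoning: when m + n values are ranked in the opposite direction, the value with rank r receives rank (m + n + 1) − r, and summing over the m values of the smaller sample gives

```latex
r'_i = (m + n + 1) - r_i
\quad\Longrightarrow\quad
R'_m = \sum_{i=1}^{m} r'_i = m(m + n + 1) - \sum_{i=1}^{m} r_i = m(m + n + 1) - R_m
```

Here m = 2 and m + n = 6, so R′m = 2 × 7 − 5 = 9, which agrees with the working.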
Success Criteria for carrying out a Wilcoxon rank-sum test
Let the sample sizes of the two distributions be m and n where m ≤ n.
1. Merge the data sets and rank all the data, smallest to largest (smallest value is rank 1).
2. Add the ranks of the smaller sample to give Rm.
3. Choose the test statistic W as the smaller value of Rm and m(m + n + 1) − Rm.
4. If W is less than or equal to the critical value given by tables, reject H0.
m(m + n + 1) − Rm would be the value of Rm if we ranked the data values from largest to
smallest. Taking the smaller of the two values solves the problem of opposite ranking and of whether
H1 is “greater than” or “less than”.
E.g. 3 State the critical values for the following Wilcoxon rank-sum tests using the table on p206:
(a) a one-tailed test at the 5 % significance level where m = 6, n = 10
(b) a two-tailed test at the 2 % significance level where m = 3, n = 9
(c) a one-tailed test at the 2.5 % significance level where m = 4, n = 8
(d) a two-tailed test at the 10 % significance level where m = 5, n = 7
Working: (a) 35
E.g. 4 The lengths of the femur, in mm, in samples of mice from Britain and North Africa are
given below.
Conduct a non-parametric test at the 5 % level to test whether the data are consistent with
the assumption that the mice in Britain and North Africa are the same breed.
Working: H0 : mice in Britain and North Africa are the same breed
H1 : mice in Britain and North Africa are not the same breed
Here are the lengths ranked in order from smallest to largest:
Value 9.8 10.0 10.6 10.8 11.1 11.3 11.5 11.8 12.3 12.4 12.7 13.1 13.2
Rank 1 2 3 4 5 6 7 8 9 10 11 12 13
Area NA NA NA B NA B NA B B B B B B
RNA = 1 + 2 + 3 + 5 + 7 = 18
m = 5, n = 8, so m(m + n + 1) − Rm = 5(5 + 8 + 1) − 18 = 52
∴ W = 18
The 5 % critical value for a two-tailed test when m = 5, n = 8 is 21.
Since W = 18 ≤ 21, we reject H0.
There is evidence to suggest the mice in Britain and North Africa are not the
same breed.
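The working for E.g. 4 can be checked by following the success criteria step by step. A minimal sketch, assuming no tied values (true for this data) and taking the critical value 21 from the tables as quoted above rather than computing it; `rank_sum_statistic` is a hypothetical helper name.

```python
def rank_sum_statistic(smaller, larger):
    """Return (Rm, W) for two samples with no tied values, where len(smaller) <= len(larger)."""
    m, n = len(smaller), len(larger)
    # Step 1: merge the samples and rank from smallest (rank 1) to largest.
    merged = sorted([(value, "smaller") for value in smaller]
                    + [(value, "larger") for value in larger])
    # Step 2: add the ranks of the smaller sample.
    r_m = sum(rank for rank, (_, group) in enumerate(merged, start=1)
              if group == "smaller")
    # Step 3: W is the smaller of Rm and m(m + n + 1) - Rm.
    return r_m, min(r_m, m * (m + n + 1) - r_m)

north_africa = [9.8, 10.0, 10.6, 11.1, 11.5]
britain = [10.8, 11.3, 11.8, 12.3, 12.4, 12.7, 13.1, 13.2]

r_m, w = rank_sum_statistic(north_africa, britain)
critical = 21  # from the tables: two-tailed 5 %, m = 5, n = 8

# Step 4: reject H0 when W is no larger than the critical value.
reject_h0 = w <= critical
print(r_m, w, reject_h0)
```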
E.g. 5 An estate manager is wondering which trees to plant in a forest. She collects data from
nearby forests on the heights of Stardust and Blue Gown trees after 10 years.
Test at the 2.5 % level whether the average height of Blue Gown trees after ten years is greater
than that of the Stardust variety.
E.g. 6 A study was undertaken to investigate the effect of vitamin C on the common cold. Fifteen
students, each of whom had developed the symptoms of a common cold, were randomly
assigned to two groups. Group A acted as the control group and unknowingly received only
a daily sugar tablet, whereas Group B received one gram of vitamin C per day. The table
shows the duration, in days, of cold symptoms for each student.
Group A 13 11 12 9 18 7 12
Group B 8 14 7 10 9 12
Test at the 1 % level the suggestion that consumption of one gram of vitamin C each day
improves the time to recover from a common cold.
Exercise
p56 4D Qu 1-4, (5 red)
Summary
H0 : The two distributions are the same
H1 : The two distributions are different
Success Criteria for carrying out a Wilcoxon rank-sum test
Let the sample sizes of the two distributions be m and n where m ≤ n.
1. Merge the data sets and rank all the data, smallest to largest (smallest value is rank 1).
2. Add the ranks of the smaller sample to give Rm.
3. Choose the test statistic W as the smaller value of Rm and m(m + n + 1) − Rm.
4. If W is less than or equal to the critical value given by tables, reject H0.
m(m + n + 1) − Rm would be the value of Rm if we ranked the data values from largest to
smallest. Taking the smaller of the two values solves the problem of opposite ranking and of whether
H1 is “greater than” or “less than”.