
Session 8 DEN1015H 2012 Lecture Notes & Review Problems With Solutions

The document discusses non-parametric tests which make fewer assumptions than other tests. It describes several non-parametric tests for paired data including the Sign test and Wilcoxon signed-rank test. An example is provided to illustrate how to perform and interpret the Sign test.


DEN 1015H LECTURE NOTES Session 8

NON-PARAMETRIC (DISTRIBUTION-FREE) TESTS

What are non-parametric tests, and when are they useful?


The advantage of non-parametric tests is that they require fewer assumptions about the
populations from which the samples are selected than other tests. For example, to use analysis
of variance, you have to assume that each group is an independent random sample from a
normal population, and that their variances are equal. You know that many of the procedures
work reasonably well even when the assumptions are not completely met. That is, the tests are
robust. However, when we analyze data from small samples, we will encounter situations in
which there are serious departures from the necessary assumptions. In such situations, we
need procedures that require less stringent assumptions about the data, for example the
normality of the underlying distribution. Collectively, these procedures are called
distribution-free or non-parametric tests.
The disadvantage is that non-parametric tests are less powerful than other tests, meaning they
are not as good at finding true differences when they exist in the population as the tests based
on the assumption of normality.
Most non-parametric tests are based on ranking observations in order of magnitude and testing
these rankings rather than the actual values of the observations. Non-parametric tests can be
used when:
i) It is obvious that the data are not normally distributed. This applies particularly to
small samples of quantitative data, which cannot be corrected by a suitable
transformation.
ii) The sample sizes are too small to test the data for normality.
iii) The data have outliers, since the outlying cases will be ranked and therefore they
would not influence the results as much as they would if we had used a test based on
means.
iv) The measurements consist of ordered categories such as ranks.
v) The data are ordinal: if it is not meaningful to compare means, then non-parametric
methods can be used to compare medians instead.
vi) A quick analysis involving few calculations is warranted.
Strictly speaking, the χ² methods (e.g., the Fisher’s Exact test, the McNemar test) described in
a previous class are also non-parametric, but as they are the standard methods for the analysis
of proportions and contingency tables they are generally included among the traditional
methods.
The Sign test is a less powerful alternative to the Paired t test.
The Wilcoxon matched-pairs signed-rank test is used to test the same null hypothesis as the
Sign test. It is usually more powerful than the Sign test.
The Mann-Whitney U test (and the Wilcoxon rank-sum test) is used to test the hypothesis that
two independent groups come from populations with the same distribution. It is an alternative
to the two independent-samples t test.
The Kruskal-Wallis test is a non-parametric alternative to one-way analysis of variance.
There is a large variety of these tests, some of which are summarized in Table 1.

Prepared by Dr. Herenia P. Lawrence
Table 1. Summary of the main non-parametric methods.

Non-parametric method                         Use                                                                       Parametric equivalent

Sign test                                     Simplified form of Wilcoxon signed rank test
Wilcoxon signed rank test                     Test of difference between paired observations                            Paired t test
Wilcoxon rank sum test                        Comparison of two groups                                                  Two-sample t test
Mann-Whitney U test                           Alternative to the Wilcoxon rank sum test, giving identical results       Two-sample t test
Kendall’s S test                              Alternative to the Wilcoxon rank sum test, giving identical results       Two-sample t test
Kruskal-Wallis one-way analysis of variance   Comparison of several groups                                              One-way analysis of variance
Friedman two-way analysis of variance         Comparison of groups defined by their values on two variables             Two-way analysis of variance
Spearman’s rank correlation                   Measure of association between two scale variables                        Pearson’s correlation
Kendall’s rank correlation                    Alternative to Spearman’s rank correlation                                Pearson’s correlation
χ² goodness of fit test                       Comparison of an observed frequency distribution with a theoretical one
Kolmogorov-Smirnov tests
  One-sample                                  Alternative to χ² goodness of fit test
  Two-sample                                  Comparison of two frequency distributions

NON-PARAMETRIC TESTS FOR PAIRED DATA

The Sign Test


This test is used to analyze paired data. The null hypothesis for the sign test is that the median
difference between the two members of a pair is 0 (zero). We do not have to make assumptions
about the shapes of the distributions from which the data are obtained. The only requirement is
that the different pairs of observations are selected independently and the values can be ordered
from smallest to largest. That is because the test is based on seeing which of a pair of values is
larger.

Example: A study was designed to evaluate a new pain relieving medication used in endodontic
therapy. The study was based on 9 patients’ reports regarding the degree of pain or discomfort
(scale ranging from 0 to 3, with 0 = no pain and 3 = severe pain) during treatment using the new
medication compared with the discomfort during a separate, similar occasion using the regular
control medication. The results are as follows:

Patient ID    Medication            Direction of Diff.    Sign
              Control    Test
1             2          0          >                     +
2             3          0          >                     +
3             2          1          >                     +
4             3          3          =                     0
5             2          0          >                     +
6             1          0          >                     +
7             0          1          <                     -
8             3          2          >                     +
9             0          1          <                     -
Source: Weintraub JA, Douglass CW, Gillings DB. Biostats. Data analysis for dental health care
professionals (2nd ed.). Research Triangle Park, NC: CAVCO Inc., 1994.

N=8, because the one tied pair is ignored. Six patients felt greater pain with the control
medication as compared with the test medication (T=6). The null hypothesis is that control and
test medications are equally effective. If there is no effect of medication, you would expect that,
on average, half the signs would be + and half -. The statistical question is, “What is the
likelihood that a 6/2 split of pluses and minuses could occur by chance?”

To determine whether or not the observed number of positive differences is statistically
significant, the number T=6 is compared to the following table of critical values, which is derived
from a binomial distribution. This table presents the number which would be considered
significant (critical value) for each given sample size (N) and α value.

CRITICAL VALUES OF T FOR THE EXACT SIGN TEST

N     α (two-sided) = 0.10     α (two-sided) = 0.05
      α (one-sided) = 0.05     α (one-sided) = 0.025
1
2
3
4
5     0, 5
6     0, 6                     0, 6
7     0, 7                     0, 7
8     1, 7                     0, 8
9     1, 8                     1, 8
10    1, 9                     1, 9
11    2, 9                     1, 10
12    2, 10                    2, 10
13    3, 10                    2, 11
14    3, 11                    2, 12
15    3, 12                    3, 12
16    4, 12                    3, 13
17    4, 13                    4, 13
18    5, 13                    4, 14
19    5, 14                    4, 15
20    5, 15                    5, 15
Source: Weintraub JA, Douglass CW, Gillings DB. Biostats. Data analysis for dental health care
professionals (2nd ed.). Research Triangle Park, NC: CAVCO Inc., 1994.

For N=8 there must be 7 positive differences (or conversely 1 negative difference) to obtain one-
sided significance at the 0.05 level. To obtain two-sided significance at the 0.05 level all eight
differences would have to be positive. Thus the conclusion is not to reject the null hypothesis that
the test medication and the control medication have equal probability of alleviating discomfort.
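
The table lookup above can also be checked by computing the exact binomial probability directly. The following is a minimal sketch, assuming only that under the null hypothesis each non-tied pair is equally likely to be + or -; the function name `sign_test_p` is a hypothetical helper, not part of the course material.

```python
from math import comb

def sign_test_p(n_plus, n_minus):
    """Exact sign-test P values; tied pairs must already be excluded."""
    n = n_plus + n_minus
    k = max(n_plus, n_minus)
    # P(X >= k) for X ~ Binomial(n, 0.5): probability of a split at least this uneven
    one_sided = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    two_sided = min(1.0, 2 * one_sided)
    return one_sided, two_sided

# Pain example: 6 plus signs, 2 minus signs, 1 tie ignored (N = 8)
one_sided, two_sided = sign_test_p(6, 2)
print(one_sided, two_sided)  # 0.14453125 0.2890625
```

Both probabilities are well above 0.05, which agrees with the decision reached from the table of critical values.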

Wilcoxon Signed-rank Test


This is the non-parametric equivalent to the paired t test. It uses the signs and relative magnitudes
(ranks) of the differences but not their actual values. Even so, it is more powerful than the Sign
test, because the Sign test just looks at which of the two numbers of a pair is larger, ignoring the
magnitude of the difference.

The Wilcoxon Signed-rank test consists of four basic steps:

1. Exclude any differences that are zero. Put the remaining differences in ascending order of
magnitude, ignoring their signs and give them ranks 1, 2, 3, etc. If any differences are equal,
average their ranks.

2. Add up the ranks of the positive differences and of the negative differences and denote these
sums by T+ and T-, respectively.

3. If there were no differences in the two groups then the sums T+ and T- would be similar. If
there were a difference then one sum would be much smaller and one sum would be much
larger than expected. Denote the smaller sum by T.

T = smaller of T+ and T-

4. The Wilcoxon Signed-rank test is based on assessing whether T, the smaller of the observed
sums, T+ and T-, is smaller than could be expected by chance. Compare the value obtained with
the critical values for the 5%, 2% and 1% significance levels given in Table A7 (see attached).
Note that the appropriate sample size, N, is the number of differences that were ranked rather
than the total number of differences, and does not therefore include the zero differences.
In contrast to the usual situation, a result is significant if T is smaller than the critical value
shown. Thus, if T is less than the critical value, then reject the null hypothesis.
In the example below, the sample size is 9 and the 5%, 2%, and 1% two-sided percentage
points are 6, 3, and 2 respectively. The result is therefore significant at the 5% level, since 5.5
is smaller than 6, indicating that the fluoride toothpaste was more effective than the non-
fluoride toothpaste.

Example: In a trial to compare a fluoride and a non-fluoride toothpaste, pairs of children were
matched for sex, age, social class and numbers of surfaces exposed. One child in each pair was
given each type of paste. Five years later the children were examined and the numbers of decayed
surfaces were counted:

        Number of decayed surfaces
Pair    Non-fluoride    Fluoride    Difference    Rank
1       9               6           3             4
2       10              7           3             4
3       4               3           1             1.5
4       19              19          0
5       13              4           9             9
6       12              12          0
7       8               2           6             7
8       0               0           0
9       13              16          -3            4
10      6               7           -1            1.5
11      12              5           7             8
12      5               0           5             6
13      7               7           0

N=9

Sum of ranks of positive differences = 4 + 4 + 1.5 + 9 + 7 + 8 + 6 = 39.5

Sum of ranks of negative differences = 4 + 1.5 = 5.5

The smaller value is the sum of the ranks of the negative differences, 5.5. Looking in the table of
critical values for the Wilcoxon Signed-rank test (Table A7 attached), 5.5 is less than the value 6
for 9 pairs (N), so p<0.05 (5%) for a two-tailed test. We therefore reject the null hypothesis and
conclude that the non-fluoride toothpaste group has significantly more decay than the fluoride
matched-pair group. How does this result compare with that obtained using a paired t test?
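
The ranking and summing steps of the Wilcoxon Signed-rank test can be sketched in a few lines. This is an illustrative sketch using the toothpaste data above; `average_ranks` and `signed_rank_sums` are hypothetical helper names, and the critical value still has to be read from Table A7.

```python
def average_ranks(values):
    """Rank in ascending order; tied values share the mean of the ranks they occupy."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def signed_rank_sums(first, second):
    """Return N (non-zero differences), T+ and T- for paired samples."""
    diffs = [a - b for a, b in zip(first, second) if a != b]  # step 1: drop zeros
    ranks = average_ranks([abs(d) for d in diffs])            # rank ignoring sign
    t_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)    # step 2
    t_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return len(diffs), t_plus, t_minus

non_fluoride = [9, 10, 4, 19, 13, 12, 8, 0, 13, 6, 12, 5, 7]
fluoride = [6, 7, 3, 19, 4, 12, 2, 0, 16, 7, 5, 0, 7]

n, t_plus, t_minus = signed_rank_sums(non_fluoride, fluoride)
t = min(t_plus, t_minus)  # step 3: T is the smaller sum
print(n, t_plus, t_minus, t)  # 9 39.5 5.5 5.5
```

A useful check is that T+ and T- always add up to N(N+1)/2, here 9 x 10 / 2 = 45.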

NON-PARAMETRIC TESTS FOR UNPAIRED DATA

The Wilcoxon Rank-sum Test


This test is used to compare two unmatched samples of data. This is one of the non-parametric
equivalents of the two independent-samples t test.

The Wilcoxon Rank-sum test consists of three basic steps:

1. Rank the observations from both groups together in ascending order of magnitude, as shown in
the example below. If any of the values are equal, average their ranks.

2. Add up the ranks in the group with the smaller sample size. In this case it is the ‘patients’, and
their ranks add up to 116.5. If the two groups are of the same size either one may be picked.

T = sum of ranks in group with smaller sample size

3. Compare this sum with the critical ranges given in Table A8 (see attached), which is arranged
somewhat differently to the tables for other significance tests. Look up the row corresponding to
the sample sizes of the two groups, in this case row 9, 11. The range shown for the 5%
significance level is 68 to 121 and corresponds to non-significant values. In other words, sums
of 68 and below or 121 and above are significant at the 5% level.

Example: Systolic blood pressures were taken from 9 ‘patients’ and 11 ‘normal’ people in order to
compare the two groups. The null hypothesis is that there is no difference in systolic blood
pressure of the ‘patients’ and ‘normal’ people.

‘Patients’    ‘Normal’
132           139
160           107
145           98
114           140
125           115
128           136
154           123
134           129
123           126
              110
              105

To do the test, list all the observations in ascending order:

Value    Rank      Value    Rank      Value    Rank
98       1         123      7.5       136      15
105      2         125      9         139      16
107      3         126      10        140      17
110      4         128      11        145      18
114      5         129      12        154      19
115      6         132      13        160      20
123      7.5       134      14

The 7th and 8th values are tied, so their ranks are averaged. That is, 123 appears twice (ranks 7
and 8) and each is ranked (7 + 8)/2 = 7.5. If 123 had appeared three times, each would be ranked
(7+8+9)/3 = 8.

Sum of ranks for ‘normal’ people
= 1 + 2 + 3 + 4 + 6 + 7.5 + 10 + 12 + 15 + 16 + 17
= 93.5

Sum of ranks for ‘patients’
= 5 + 7.5 + 9 + 11 + 13 + 14 + 18 + 19 + 20
= 116.5

Look at the table of critical values for the Wilcoxon rank sum test for the figures corresponding
to 9 and 11 observations. Under the null hypothesis of no difference, the rank sum in the smaller
(n=9) of the two groups would fall at or below 68 or at or above 121 on no more than 0.05 of
occasions. Since 68 < 116.5 < 121, the observed difference could have occurred by chance (although
we might have found a significant difference if we had more people in the sample). So there is no
reason to reject the null hypothesis. If the rank sum had been 132, a difference in ranks this large
would have been unlikely to occur by chance (P < 0.05).
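
The three steps of the Wilcoxon Rank-sum test can be reproduced for the blood pressure data as follows. This is a sketch only: `average_ranks` is a hypothetical helper, and the critical range 68 to 121 is simply the Table A8 value quoted in the text, not something the code derives.

```python
def average_ranks(values):
    """Rank in ascending order; tied values share the mean of the ranks they occupy."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

patients = [132, 160, 145, 114, 125, 128, 154, 134, 123]
normal = [139, 107, 98, 140, 115, 136, 123, 129, 126, 110, 105]

# Step 1: rank both groups together
ranks = average_ranks(patients + normal)

# Step 2: add up the ranks in each group (T belongs to the smaller group)
t_patients = sum(ranks[:len(patients)])
t_normal = sum(ranks[len(patients):])

# Step 3: compare T with the 5% critical range quoted from Table A8 for sizes 9, 11
lower, upper = 68, 121
significant = t_patients <= lower or t_patients >= upper

print(t_patients, t_normal, significant)  # 116.5 93.5 False
```

The two rank sums must add up to 20 x 21 / 2 = 210, which is a quick way to catch ranking mistakes when working by hand.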

Mann-Whitney U Test
The Mann-Whitney test is another commonly used alternative to the two independent-samples t
test. It gives identical results to the Wilcoxon rank-sum test. For the actual computation of the
test, you rank the combined data values for the two groups. Then you find the average rank in
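each group. The statistic U is slightly more complicated than T (due to Wilcoxon), being calculated as

U = n1n2 + ½n1(n1+1) - T

where T is the sum of the ranks in the group of size n1, and n2 is the size of the other group.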
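
As a worked instance of the formula, the sketch below plugs in the blood pressure example from the previous section (T = 116.5 for the 9 ‘patients’, with 11 ‘normal’ people). The function name `mann_whitney_u` is a hypothetical helper for illustration.

```python
def mann_whitney_u(t, n1, n2):
    """U computed from T, the rank sum of the group of size n1."""
    return n1 * n2 + n1 * (n1 + 1) / 2 - t

u_patients = mann_whitney_u(116.5, 9, 11)  # 99 + 45 - 116.5
u_normal = mann_whitney_u(93.5, 11, 9)     # 99 + 66 - 93.5

print(u_patients, u_normal)  # 27.5 71.5
```

The two U values always add up to n1 x n2 (here 99), so either one determines the other. Because U is a fixed function of T for given sample sizes, the two tests lead to the same conclusion.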

The Kruskal-Wallis Test
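The Kruskal-Wallis test is a non-parametric alternative to one-way analysis of variance (One-way
ANOVA). It is computed in the same way as the Mann-Whitney U test, by ranking the combined data
and comparing the ranks between groups, except that there are more than two groups. The null
hypothesis for the test is that all the groups come from populations with the same distribution.
For a significant result to be read as a difference in location (for example, in medians), the
population distributions must have the same shape and spread.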

Example: Consider the relationship between cigarette smoking and education. The table below
shows the average ranks for the number of cigarettes smoked for three groups of people: grammar
school education only, high school only, and some college. You see that the mean ranks are
similar for the three groups. The observed significance level is large (0.78), so you do not reject
the null hypothesis that the distributions are the same for the three groups.

Kruskal-Wallis test for cigarette consumption by education

Ranks

No of Cigarettes per Day in 1958

Highest Level of Schooling    N      Mean Rank
Grammar School                32     101.44
High School                   112    108.57
College                       67     103.89
Total                         211

Test Statistics (a, b)

               No of Cigarettes per Day in 1958
Chi-Square     .505
df             2
Asymp. Sig.    .777
a. Kruskal Wallis Test
b. Grouping Variable: Highest Level of Schooling

What do I do if I am not sure if I should be using a non-parametric test or a parametric test?


When in doubt, do them both! If you reach the same conclusions based on both types of tests,
there is nothing to worry about. If the results from the non-parametric test are not significant
while those from the parametric test are, try to figure out why:
Do you have one or two data values that are much smaller or larger than the rest? If so, they
may be affecting the mean and having a large impact on your conclusions.
If the problem is with the non-normal distribution of data values, see if you can transform the
data to better conform to the parametric assumptions. If your transformation is successful, you
can use one of the more powerful parametric procedures for your analysis.

DEN 1015H Review Problems Session 8

1. Twenty subjects each receive two restorations using two different materials. After 3 months,
the cold sensitivity of the two teeth is compared by asking each subject to indicate which tooth
causes more pain when exposed to cold water. Based on observing that 18 of the 20 subjects
reported more sensitivity with a particular one of the materials, a highly significant difference is
declared. This is an example of the use of which statistical test?

2. The following data show the bacteroides levels (possible values: 1, 13, 14, 15, 16, 17) in nine
healthy subjects and fifteen with gingivitis.

Healthy: 15, 1, 1, 1, 1, 1, 14, 1, 1


Gingivitis: 1, 17, 1, 1, 14, 14, 15, 15, 15, 14, 14, 13, 14, 1, 1

Use the Wilcoxon’s rank-sum test to compare the levels of bacteroides of the two groups.
Compare your conclusion with that reached using the two independent-samples t test.

3. The following observations are of anxiety scores recorded on ten patients receiving midazolam
premedication before dental surgery under general anesthesia and no premedication in random
order. The two responses are recorded on the same person, meaning that the person is his/her own
control. Use the Wilcoxon’s signed-rank test to evaluate if there is any difference in the effect of
the premedication on the anxiety scores. Compare your conclusion with that reached using the
matched-pairs t test.

Patient Drug Placebo Difference


1 19 22 -3
2 11 18 -7
3 14 17 -3
4 17 19 -2
5 23 22 1
6 11 12 -1
7 15 14 1
8 19 11 8
9 11 19 -8
10 8 7 1

4. Perform a Kruskal-Wallis test on the data from question 4 of the review problems provided to
you during Session 7, in which an anthropologist measured mean upper central incisor width
(mm) in individuals from four different aboriginal groups. Do these data suggest that incisor
width may vary significantly between these groups? How does the p-value from the Kruskal-
Wallis test compare with the p-value from the One-way ANOVA test? I would like you to enter
the data in SPSS (or other statistical package) in order to run the analysis. Then cut and paste
your output into Word, with your conclusion, to hand in at the next class.

Enter the data using the following format:

subject group width


1 1 8.0
2 1 7.5
3 1 8.2
4 1 7.5
5 1 7.3
6 2 8.3
7 2 6.8
8 2 7.2
9 2 6.7
10 3 8.5
11 3 8.3
12 3 7.9
13 3 8.2
14 3 8.4
15 4 7.3
16 4 7.2
17 4 6.8
18 4 6.7

To run the Kruskal-Wallis test in SPSS, from the menus choose:

Analyze
Nonparametric Tests
K Independent Samples...

The Tests for Several Independent Samples dialog box will appear.

Select width and move it into the Test Variable List.


Select group and move it into the Grouping Variable box. Click Define Range... and
indicate the range of group codes (i.e., 1 - 4). Click Continue.
Select the Kruskal-Wallis H test in the Test Type group and click OK.

DEN 1015H Solutions to Review Problems Session 8

1. Twenty subjects each receive two restorations using two different materials. After 3 months,
the cold sensitivity of the two teeth is compared by asking each subject to indicate which tooth
causes more pain when exposed to cold water. Based on observing that 18 of the 20 subjects
reported more sensitivity with a particular one of the materials, a highly significant difference is
declared. This is an example of the use of which statistical test?

The Sign test

2. The following data show the bacteroides levels (possible values: 1, 13, 14, 15, 16, 17) in nine
healthy subjects and fifteen with gingivitis.

Healthy: 15, 1, 1, 1, 1, 1, 14, 1, 1


Gingivitis: 1, 17, 1, 1, 14, 14, 15, 15, 15, 14, 14, 13, 14, 1, 1

Use the Wilcoxon’s rank-sum test to compare the levels of bacteroides of the two groups.
Compare your conclusion with that reached using the two independent-samples t test.

Report

Levels of Bacteroides

Group         N     Mean     Std. Deviation    Minimum    Maximum
Healthy       9     4.00     5.96              1          15
Gingivitis    15    10.00    6.64              1          17
Total         24    7.75     6.93              1          17

The Wilcoxon Rank Sum (Mann-Whitney) Test

Ranks

Levels of Bacteroides

Group         N     Mean Rank    Sum of Ranks
Healthy       9     9.28         83.50
Gingivitis    15    14.43        216.50
Total         24

Test Statistics (b)

                                  Levels of Bacteroides
Mann-Whitney U                    38.500
Wilcoxon W                        83.500
Z                                 -1.869
Asymp. Sig. (2-tailed)            .062
Exact Sig. [2*(1-tailed Sig.)]    .084 (a)
a. Not corrected for ties.
b. Grouping Variable: Group

If you were to calculate the results by hand...

Combine the data from both groups and arrange the values in increasing order and assign ranks....

The data values replaced by their ranks are

Healthy: 6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 16.5, 21.5
Gingivitis: 6.5, 6.5, 6.5, 6.5, 6.5, 13, 16.5, 16.5, 16.5, 16.5, 16.5, 21.5, 21.5, 21.5, 24

The rank-sum statistic, T or W = sum of the ranks in group with smaller sample size

So for these data, the rank-sum statistic is W = 83.5. In the table of critical ranges for the
Wilcoxon rank sum test, the 5% (two-sided) range for 9 and 15 observations is 79 to 146; under the
null hypothesis, W would fall outside this range on no more than 0.05 of occasions. Since
79 < 83.5 < 146, we do not reject the null hypothesis at significance level 0.05. Using the
independent-samples t test, we would reject the null hypothesis at significance level 0.05 (see below).
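
For readers who want to see where the Z and Asymp. Sig. values in the SPSS Test Statistics table come from, the following sketch applies the usual large-sample normal approximation with a correction for ties. The formula for the tie-corrected variance is the standard one and is an assumption here, since the notes themselves do not state it; `average_ranks` is a hypothetical helper.

```python
from collections import Counter
from math import erfc, sqrt

def average_ranks(values):
    """Rank in ascending order; tied values share the mean of the ranks they occupy."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

healthy = [15, 1, 1, 1, 1, 1, 14, 1, 1]
gingivitis = [1, 17, 1, 1, 14, 14, 15, 15, 15, 14, 14, 13, 14, 1, 1]
n1, n2 = len(healthy), len(gingivitis)
n = n1 + n2

combined = healthy + gingivitis
ranks = average_ranks(combined)
w = sum(ranks[:n1])                       # rank sum of the smaller group
u1 = n1 * n2 + n1 * (n1 + 1) / 2 - w
u = min(u1, n1 * n2 - u1)                 # SPSS reports the smaller U

# Normal approximation; tie_term sums (t^3 - t) over each run of t tied values
tie_term = sum(t ** 3 - t for t in Counter(combined).values())
variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
z = (u - n1 * n2 / 2) / sqrt(variance)
p_two_sided = erfc(abs(z) / sqrt(2))      # two-sided normal tail area

print(w, u, round(z, 3), round(p_two_sided, 3))
```

With this many tied values (twelve subjects share the value 1), the tie correction matters, which is also why SPSS flags its exact significance as "not corrected for ties".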

Independent Samples Test

Levels of Bacteroides
                                               Equal variances    Equal variances
                                               assumed            not assumed
Levene's Test for Equality of Variances
  F                                            1.325
  Sig.                                         .262
t-test for Equality of Means
  t                                            -2.222             -2.286
  df                                           22                 18.504
  Sig. (2-tailed)                              .037               .034
  Mean Difference                              -6.00              -6.00
  Std. Error Difference                        2.70               2.62
  95% Confidence Interval of the Difference
    Lower                                      -11.60             -11.50
    Upper                                      -.40               -.50

3. The following observations are of anxiety scores recorded on ten patients receiving midazolam
premedication before dental surgery under general anesthesia and no premedication in random
order. The two responses are recorded on the same person, meaning that the person is his/her own
control. Use the Wilcoxon’s signed-rank test to evaluate if there is any difference in the effect of
the premedication on the anxiety scores. Compare your conclusion with that reached using the
matched-pairs t test.

The null hypothesis is that there is no difference in anxiety scores between drug and placebo.

Rank the differences, regardless of the sign.

Patient Drug Placebo Difference Rank


1 19 22 -3 6.5
2 11 18 -7 8.0
3 14 17 -3 6.5
4 17 19 -2 5.0
5 23 22 1 2.5
6 11 12 -1 2.5
7 15 14 1 2.5
8 19 11 8 9.5
9 11 19 -8 9.5
10 8 7 1 2.5
Total -13

Sum of ranks of positive differences = 2.5 + 2.5 + 9.5 + 2.5 = 17

Sum of ranks of negative differences = 6.5 + 8.0 + 6.5 + 5.0 + 2.5 + 9.5 = 38

T = smaller of T+ and T-

Looking at the table for the critical values for the Wilcoxon signed-rank test, the smaller value, 17,
is bigger than any of the values for 10 pairs of observations. There is, therefore, no reason to
reject the null hypothesis of no difference between the drug and the placebo. Similar conclusions
are reached using the paired t test, as shown below:

Mean $\bar{d} = \sum d / n = -13/10 = -1.3$

Standard deviation $s = \sqrt{\sum (d - \bar{d})^2 / (n-1)} = \sqrt{20.68} = 4.55$

Standard error $SE(\bar{d}) = s/\sqrt{n} = 4.55/\sqrt{10} = 1.44$

To test the null hypothesis that there is no difference between the pairs, calculate

$t = \dfrac{\bar{d} - 0}{SE(\bar{d})} = \dfrac{\bar{d}}{s/\sqrt{n}}$

$t = -1.3/1.44 = -0.90$

Looking at the t table, the value corresponding to 9 (n-1) degrees of freedom and a probability of
0.05 is t(0.05, 9) = 2.26. For a two-tailed test, P > 0.05 (or P = 0.39), so we do not reject the null
hypothesis and conclude that there is not a statistically significant difference in anxiety scores due
to the effect of drug.
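
The paired t calculation can be checked with Python's standard library. This is a small sketch using the anxiety scores from the problem; the variable names are illustrative, and the P value itself is not computed here (it still comes from the t table or from software).

```python
from math import sqrt
from statistics import mean, stdev

drug = [19, 11, 14, 17, 23, 11, 15, 19, 11, 8]
placebo = [22, 18, 17, 19, 22, 12, 14, 11, 19, 7]

d = [a - b for a, b in zip(drug, placebo)]  # Drug - Placebo differences
d_bar = mean(d)                             # mean difference
s = stdev(d)                                # sample SD (divides by n - 1)
se = s / sqrt(len(d))                       # standard error of the mean difference
t = d_bar / se                              # test statistic on n - 1 = 9 d.f.

print(round(d_bar, 2), round(s, 2), round(se, 2), round(t, 3))
```

Since |t| is far below the critical value 2.26, the code leads to the same decision as the hand calculation.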

SPSS OUTPUT for the Wilcoxon Signed Ranks test and the Paired t test

Wilcoxon Signed Ranks Test

Ranks

Placebo - Drug     N        Mean Rank    Sum of Ranks
Negative Ranks     4 (a)    4.25         17.00
Positive Ranks     6 (b)    6.33         38.00
Ties               0 (c)
Total              10
a. Placebo < Drug
b. Placebo > Drug
c. Drug = Placebo

Test Statistics (b)

                          Placebo - Drug
Z                         -1.079 (a)
Asymp. Sig. (2-tailed)    .281
a. Based on negative ranks.
b. Wilcoxon Signed Ranks Test
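
The Z value in the Wilcoxon output is a large-sample normal approximation to the distribution of T. The sketch below reproduces it under the assumption that the standard tie-corrected variance is used; that formula is not given in these notes, and `average_ranks` is a hypothetical helper.

```python
from collections import Counter
from math import erfc, sqrt

def average_ranks(values):
    """Rank in ascending order; tied values share the mean of the ranks they occupy."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

drug = [19, 11, 14, 17, 23, 11, 15, 19, 11, 8]
placebo = [22, 18, 17, 19, 22, 12, 14, 11, 19, 7]

# Placebo - Drug, as in the SPSS output; zero differences would be dropped
diffs = [p - d for d, p in zip(drug, placebo) if p != d]
abs_diffs = [abs(x) for x in diffs]
ranks = average_ranks(abs_diffs)
n = len(diffs)

t_neg = sum(r for x, r in zip(diffs, ranks) if x < 0)
t_pos = sum(r for x, r in zip(diffs, ranks) if x > 0)
t = min(t_neg, t_pos)

# Normal approximation: mean n(n+1)/4, variance reduced for tied absolute differences
tie_term = sum(c ** 3 - c for c in Counter(abs_diffs).values())
variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
z = (t - n * (n + 1) / 4) / sqrt(variance)
p_two_sided = erfc(abs(z) / sqrt(2))

print(t_neg, t_pos, round(z, 3), round(p_two_sided, 3))
```

With only 10 pairs the table of exact critical values is the safer guide, but here both routes give the same non-significant result.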

T-Test

Paired Samples Statistics

Pair 1     Mean     N     Std. Deviation    Std. Error Mean
Drug       14.80    10    4.69              1.48
Placebo    16.10    10    4.95              1.57

Paired Samples Test

Pair 1: Drug - Placebo
Paired Differences
  Mean                                         -1.30
  Std. Deviation                               4.55
  Std. Error Mean                              1.44
  95% Confidence Interval of the Difference
    Lower                                      -4.55
    Upper                                      1.95
t                                              -.904
df                                             9
Sig. (2-tailed)                                .390

4. Kruskal-Wallis Test

Ranks

WIDTH
GROUP    N     Mean Rank
1        5     10.40
2        4     6.50
3        5     15.00
4        4     4.50
Total    18

Test Statistics (a, b)

               WIDTH
Chi-Square     10.295
df             3
Asymp. Sig.    .016
a. Kruskal Wallis Test
b. Grouping Variable: GROUP

There is an overall difference in mean upper central incisor width among the four aboriginal
groups (Kruskal Wallis test, 3 d.f., P=0.016, 2-tailed), which is the same conclusion reached using
a one-way ANOVA model (F=7.289, d.f.1 = 3, d.f.2=14, P=0.004). To find out which groups are
significantly different from the others, you will need to carry out pairwise comparisons using the
Wilcoxon rank-sum test or the Mann-Whitney U test, since there is no non-parametric
equivalent to a Multiple Comparison Test (“post-hoc” test). Using the Wilcoxon rank-sum tests in
SPSS, it was evident that group 1’s mean incisor width is different from those of groups 3 and 4,
and that group 3’s mean incisor width is different from that of group 4. All the other pairwise
comparisons were not significant at the 5% level.
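
For those working without SPSS, the Kruskal-Wallis statistic for the incisor width data can be computed by hand or with a short script. The sketch below assumes the standard H formula with a correction for ties, which the notes do not spell out; `average_ranks` is a hypothetical helper, and the closed-form P value used at the end holds only for a chi-square distribution with exactly 3 degrees of freedom.

```python
from collections import Counter
from math import erfc, exp, pi, sqrt

def average_ranks(values):
    """Rank in ascending order; tied values share the mean of the ranks they occupy."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

groups = {
    1: [8.0, 7.5, 8.2, 7.5, 7.3],
    2: [8.3, 6.8, 7.2, 6.7],
    3: [8.5, 8.3, 7.9, 8.2, 8.4],
    4: [7.3, 7.2, 6.8, 6.7],
}

pooled = [x for widths in groups.values() for x in widths]
n = len(pooled)
ranks = average_ranks(pooled)  # rank all 18 widths together

mean_ranks, total, start = {}, 0.0, 0
for label, widths in groups.items():
    r = ranks[start:start + len(widths)]
    start += len(widths)
    mean_ranks[label] = sum(r) / len(r)
    total += sum(r) ** 2 / len(r)      # (rank sum)^2 / group size

h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
tie_term = sum(c ** 3 - c for c in Counter(pooled).values())
h = h / (1 - tie_term / (n ** 3 - n))  # correction for tied widths

# Upper tail of chi-square with 3 d.f. (closed form valid for df = 3 only)
half = h / 2
p = erfc(sqrt(half)) + 2 * sqrt(half) * exp(-half) / sqrt(pi)

print(mean_ranks, round(h, 3), round(p, 3))
```

The mean ranks and the chi-square value agree with the SPSS output above, which is a useful check that the data were entered correctly.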

Suggested readings:
Norman GR, Streiner DL. Biostatistics. The bare essentials (2nd ed.). Hamilton, ON: B.C.
Decker Inc., 2000. Chapters 22 and 23.
Weintraub JA, Douglass CW, Gillings DB. Biostats. Data analysis for dental health care
professionals (2nd ed.). Research Triangle Park, NC: CAVCO Inc., 1985. Chapter 17.
Kim JS, Dailey RJ. Biostatistics for oral healthcare (1st ed.). Ames, IA: Blackwell Pub.
Professional, 2008. Chapter 14.