Nonparametric Methods: Analysis of Ranked Data
Chapter 18
McGraw-Hill/Irwin Copyright © 2012 by The McGraw-Hill Companies, Inc. All rights reserved.
Learning Objectives
LO1 Define a nonparametric test and know when it is applied.
LO2 Conduct the sign test for dependent samples using the binomial
and standard normal distributions as the test statistics.
LO3 Conduct a test of hypothesis for dependent samples using the
Wilcoxon signed-rank test.
LO4 Conduct and interpret the Wilcoxon rank-sum test for independent
samples.
LO5 Conduct and interpret the Kruskal-Wallis test for several
independent samples.
LO6 Compute and interpret Spearman’s coefficient of rank correlation.
LO7 Conduct a test of hypothesis to determine whether the correlation
among the ranks in the population is different from zero.
1. Sign Test
The Sign Test is based on the sign of a
difference between two related observations.
No assumption regarding the shape of the
population of differences is necessary.
The binomial distribution is the test statistic for
small samples and the standard normal (z) for
large samples.
The test requires dependent (related) samples.
Procedure to conduct the test:
Determine the sign (+ or -) of the difference
between pairs.
Determine the number of usable pairs.
Compare the number of positive (or negative)
differences to the critical value.
n is the number of usable pairs (ties excluded), X
is the number of pluses or minuses, and the
binomial probability is π = .5.
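The procedure above can be sketched in Python. This is a minimal illustration, not the textbook's worked solution: instead of looking up a critical value, it computes the upper-tail binomial probability of seeing at least the observed number of pluses when π = .5, which leads to the same decision. The function name and the list of signs are hypothetical and are not the Samuelson Chemicals data.

```python
from math import comb

def sign_test_upper_tail(signs):
    """One-tailed sign test: P(X >= observed pluses) under H0: pi = .5."""
    # Ties (recorded here as "0") are not usable pairs and are dropped.
    usable = [s for s in signs if s in ("+", "-")]
    n = len(usable)
    x = usable.count("+")
    # Binomial upper tail with pi = .5: sum of C(n, k) / 2**n for k = x..n
    p_value = sum(comb(n, k) for k in range(x, n + 1)) / 2 ** n
    return n, x, p_value

# Hypothetical signs, for illustration only
signs = ["+"] * 10 + ["-"] * 2 + ["0"]
n, x, p = sign_test_upper_tail(signs)
print(n, x, round(p, 4))
```

If the printed probability is below the chosen significance level, the null hypothesis of no improvement would be rejected.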
example
The director of information systems at
Samuelson Chemicals recommended that
an in-plant training program be instituted
for managers. The objective is to improve
the knowledge of database usage in
accounting, procurement, production, and
so on. A sample of 15 managers was
selected at random. A panel of database
experts determined the general level of
competence of each manager with respect
to using the database. Their competence
and understanding were rated as being
either outstanding, excellent, good, fair, or
poor. After the three-month training
program, the same panel of information
systems experts rated each manager
again. The two ratings (before and after)
are shown along with the sign of the
difference. A “+” sign indicates
improvement, and a “-” sign indicates that
the manager’s competence using
databases had declined after the training
program.
answer
Step 1: State the Null and Alternative Hypotheses
Normal Approximation
If the number of observations in the sample is larger than 10, the normal distribution can be used to
approximate the binomial.
example
The market research department of Cola, Inc., has been given the
assignment of testing a new soft drink. Two versions of the drink
are considered: a rather sweet drink and a somewhat bitter one.
A preference test is to be conducted consisting of a sample
of 64 consumers. Each consumer will taste both the sweet cola
(labeled A) and the bitter one (labeled B) and indicate a
preference. Conduct a test of hypothesis to determine if there is
a difference in the preference for the sweet and bitter tastes. Use
the .05 significance level.
Step 1: State the null hypothesis and the alternate hypothesis.
H0: π = .50 There is no preference.
H1: π ≠ .50 There is a preference.
Step 2: Select the level of significance.
α = .05, as stated in the problem.
Step 3: Select the test statistic.
Use the z distribution:
z = ((X ± .50) − µ) / σ
where µ = .50n and σ = .50√n. Use X − .50 when X is more than n/2
and X + .50 when X is less than n/2.
Step 4: Formulate the decision rule.
Referring to Appendix B.1, Areas under the Normal Curve, for a
two-tailed test (because H1 states that π ≠ .50), at the .05
significance level, the critical values are -1.96 and +1.96.
Step 5: Compute z, compare the computed value with the critical
value, and make a decision regarding H0.
Preference for cola A was given a “+” sign and preference for B
a “-” sign. Out of the 64 in the sample, 42 preferred the sweet
taste, which is cola A. Therefore, there are 42 pluses. Since 42
is more than n/2 = 64/2 = 32, we use:
z = ((X − .50) − .50n) / (.50√n) = ((42 − .50) − 32) / 4 = 2.38
The computed z of 2.38 is beyond the critical value of 1.96.
Conclusion: The null hypothesis of no difference is rejected at the .05
significance level. There is evidence of a difference in consumer
preference. That is, we conclude consumers prefer one cola over
another.
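The cola calculation can be checked with a short Python sketch. It assumes the continuity-corrected normal approximation with µ = .50n and σ = .50√n; the function name is hypothetical, while the inputs (42 pluses out of 64) come from the example.

```python
from math import sqrt

def sign_test_z(x, n):
    """Normal approximation to the sign test with a continuity correction."""
    mu = 0.50 * n
    sigma = 0.50 * sqrt(n)
    # Subtract .50 when x is above n/2, add .50 when it is below.
    correction = -0.50 if x > n / 2 else 0.50
    return (x + correction - mu) / sigma

z = sign_test_z(42, 64)
print(z)                 # compare with the slide value, 2.38 after rounding
print(abs(z) > 1.96)     # two-tailed decision at the .05 level
```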
2. Wilcoxon Signed-Rank Test for Dependent
Samples
2a. Wilcoxon Signed-Rank Test for Dependent
Samples
If the assumption of normality is violated for the paired-t test,
use the Wilcoxon signed-rank test.
The test requires the ordinal scale of measurement.
The observations must be related or dependent.
The steps for the test are:
1. Compute the differences between related observations.
Drop observations with 0 difference from the sample.
2. Rank the absolute differences from low to high.
3. Return the signs to the ranks and sum positive and negative
ranks.
4. Compare the smaller of the two rank sums with the T
value, obtained from Appendix B.7.
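Steps 1 to 3 above can be sketched in Python as follows. This is an illustration under stated assumptions: tied absolute differences share the average of the ranks they occupy, the helper names are hypothetical, and the before/after scores are made up for the sketch. The critical T value still has to come from a table such as Appendix B.7.

```python
def average_ranks(values):
    """Rank from low to high; tied values share the average rank."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]

def wilcoxon_signed_rank(before, after):
    # Step 1: differences between related observations, dropping zeros
    diffs = [a - b for a, b in zip(after, before) if a != b]
    # Step 2: rank the absolute differences from low to high
    ranks = average_ranks([abs(d) for d in diffs])
    # Step 3: return the signs to the ranks and sum each group
    r_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    # The smaller rank sum is the test statistic T
    return r_plus, r_minus, min(r_plus, r_minus)

# Hypothetical paired scores, for illustration only
before = [10, 12, 9, 14, 11, 13]
after = [12, 11, 9, 18, 14, 15]
r_plus, r_minus, t = wilcoxon_signed_rank(before, after)
print(r_plus, r_minus, t)
```

A quick sanity check is that the two rank sums add up to n(n + 1)/2, where n is the number of nonzero differences.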
example
Fricker’s is a family restaurant chain located
primarily in the southeastern part of the United
States. It offers a full dinner menu, but its
specialty is chicken. Recently, Fricker, the
owner and founder, developed a new spicy
flavor for the batter in which the chicken is
cooked. Before replacing the current flavor, he
wants to conduct some tests to be sure that
patrons will like the spicy flavor better. Bernie
selects a random sample of 15 customers.
Each customer is given a small piece of the
current chicken and asked to rate its overall
taste on a scale of 1 to 20. A value near 20
indicates the participant liked the flavor,
whereas a score near 0 indicates they did not
like the flavor. Next, the same 15 participants
are given a sample of the new chicken with
the spicier flavor and again asked to rate its
taste on a scale of 1 to 20.
The results are reported in the table below. Is
it reasonable to conclude that the spicy flavor
is preferred? Use the .05 significance level.
The steps
1. Compute the difference between the spicy score
and the current score.
2. Only the + and – differences are considered
further; differences of 0 are dropped.
3. Determine the absolute differences (col 4).
4. Rank the absolute differences from smallest to largest.
5. Separate the ranks into R+ and R-.
6. Total R+ and R-. The smaller of the two
rank sums is used as the test statistic and
referred to as T.
Hypothesis:
H0: There is no difference in the ratings of the two flavors.
H1: The spicy ratings are higher.
Decision Rule:
Reject H0 if the smaller of the rank sums is 25 or less.
Computed T = 30
Critical T = 25
Decision is not to reject the null hypothesis.
We cannot conclude that the spicy flavor is rated higher
than the current flavor.
2b. Wilcoxon Rank-Sum Test for Independent Samples
example
Dan Thompson, the president of
CEO Airlines, recently noted an
increase in the number of no-
shows for flights out of Atlanta.
He is particularly interested in
determining whether there are
more no-shows for flights that
originate from Atlanta compared
with flights leaving Chicago. A
sample of nine flights from
Atlanta and eight from Chicago
is reported in the table.
At the .05 significance level, can
we conclude that there are
more no-shows for the flights
originating in Atlanta?
Set up Hypothesis and Decision Rule:
Hypothesis:
H0: The population distribution of no-shows is the same or less for Atlanta than for Chicago.
H1: The population distribution of no-shows is larger for Atlanta than for Chicago.
Decision Rule: Reject H0 if the computed z is greater than the critical z.
At the .05 level of significance (one-tailed), the critical z is 1.65.
Rank the observations from both samples as if they were a single group.
The Chicago flight with only 8 no-shows had the fewest, so it is assigned a rank of 1. The
Chicago flight with 9 no-shows is ranked 2, and so on.
Because the computed z value (1.49) is less than 1.65, the null hypothesis is not
rejected. The evidence does not show more no-shows for flights originating in
Atlanta than for flights leaving Chicago.
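A z value of this kind can be computed with the sketch below. It assumes the usual large-sample form of the rank-sum statistic, z = (W − n1(n1 + n2 + 1)/2) / √(n1·n2·(n1 + n2 + 1)/12), where W is the sum of ranks for the first sample. The function names are hypothetical and the two samples are made up for illustration; they are not the Atlanta and Chicago flight data.

```python
from math import sqrt

def average_ranks(values):
    """Rank from low to high; tied values share the average rank."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]

def rank_sum_z(sample1, sample2):
    # Rank both samples as if they were a single group
    combined = list(sample1) + list(sample2)
    ranks = average_ranks(combined)
    n1, n2 = len(sample1), len(sample2)
    w = sum(ranks[:n1])                       # rank sum of the first sample
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return w, (w - mean_w) / sd_w

# Hypothetical samples, for illustration only
sample1 = [12, 15, 18, 20]
sample2 = [8, 9, 11, 15]
w, z = rank_sum_z(sample1, sample2)
print(w, round(z, 2))
```

For a one-tailed test at the .05 level, the computed z would be compared with 1.65, as in the decision rule above.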
3. Kruskal-Wallis Test:
Analysis of Variance by Ranks
This is used to compare three or more samples to determine if they came from
equal populations.
The ordinal scale of measurement is required.
It is an alternative to the one-way ANOVA.
The chi-square distribution is the test statistic.
Each sample should have at least five observations.
The sample data is ranked from low to high as if it were a single
group.
EXAMPLE
A management seminar consists of executives from manufacturing, finance,
and engineering. Before scheduling the seminar sessions, the seminar leader
is interested in whether the three groups are equally knowledgeable about
management principles. Plans are to take samples of the executives in
manufacturing, in finance, and in engineering and to administer a test to each
executive. If there is no difference in the scores for the three distributions, the
seminar leader will conduct just one session. However, if there is a difference
in the scores, separate sessions will be given. We will use the Kruskal-Wallis
test instead of ANOVA because the seminar leader is unwilling to assume that
(1) the populations of management scores follow the normal distribution or (2)
the population standard deviations are the same.
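The Kruskal-Wallis statistic can be sketched in Python as below. The sketch assumes the standard form H = 12/(n(n + 1)) · Σ(R_i²/n_i) − 3(n + 1), where R_i is the rank sum of sample i and n is the total number of observations, and it applies no correction for ties. The function names are hypothetical and the three sets of scores are made up; they are not the seminar data.

```python
def average_ranks(values):
    """Rank from low to high; tied values share the average rank."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]

def kruskal_wallis_h(*samples):
    # Rank all observations as if they were a single group
    combined = [x for s in samples for x in s]
    ranks = average_ranks(combined)
    n = len(combined)
    total, start = 0.0, 0
    for s in samples:
        rank_sum = sum(ranks[start:start + len(s)])
        total += rank_sum ** 2 / len(s)
        start += len(s)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)

# Hypothetical scores for three groups, for illustration only
group_a = [14, 19, 22, 25, 31]
group_b = [12, 16, 17, 20, 28]
group_c = [10, 11, 13, 15, 18]
h = kruskal_wallis_h(group_a, group_b, group_c)
print(round(h, 2))
```

H is compared with a chi-square critical value with k − 1 degrees of freedom, where k is the number of samples. With three samples and the .05 level, that value is 5.991.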
4. Rank-Order Correlation
Spearman’s coefficient of rank correlation reports the association
between two sets of ranked observations. The features are:
It can range from –1.00 up to 1.00.
It is similar to Pearson’s coefficient of
correlation, but is based on ranked data.
EXAMPLE
Lorrenger Plastics, Inc., recruits management trainees at
colleges and universities throughout the United States.
Each trainee is given a rating by the recruiter during the
on-campus interview. This rating is an expression of future
potential and may range from 0 to 15, with the higher
score indicating more potential. The recent college
graduate then enters an in-plant training program and is
given another composite rating based on tests, opinions of
group leaders, training officers, and so on. The on-campus
rating and the in-plant training ratings are given in the table
on the right.
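A coefficient of this kind can be computed with the Python sketch below. It assumes the common shortcut formula r_s = 1 − 6Σd² / (n(n² − 1)), which is exact only when there are no tied ranks, and the t statistic t = r_s·√((n − 2)/(1 − r_s²)) with n − 2 degrees of freedom for testing whether the population rank correlation differs from zero. The function names are hypothetical and the ratings are made up; they are not the Lorrenger Plastics data.

```python
from math import sqrt

def average_ranks(values):
    """Rank from low to high; tied values share the average rank."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]

def spearman_rs(x, y):
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    n = len(x)
    # Sum of squared differences between paired ranks
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

def rs_t_statistic(rs, n):
    """t statistic for H0: the population rank correlation is zero."""
    return rs * sqrt((n - 2) / (1 - rs ** 2))

# Hypothetical ratings for 10 trainees, for illustration only
on_campus = [8, 10, 5, 13, 7, 11, 4, 14, 9, 12]
in_plant = [70, 60, 45, 90, 55, 80, 50, 85, 65, 75]
rs = spearman_rs(on_campus, in_plant)
t = rs_t_statistic(rs, len(on_campus))
print(round(rs, 3), round(t, 2))
```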
Conclusion:
The value of .726 indicates a strong positive association between the ratings of the on-campus recruiter and the ratings of the
training staff.
The graduates who received high ratings from the on-campus recruiter also tended to be the ones who received high ratings
from the training staff.
Scatter diagram of on-campus and in-plant scores (horizontal axis: on campus; vertical axis: in plant)