
Nonparametric Methods:: Analysis of Ranked Data

Uploaded by

Starlight Astro

Nonparametric Methods:

Analysis of Ranked Data

Chapter 18

McGraw-Hill/Irwin Copyright © 2012 by The McGraw-Hill Companies, Inc. All rights reserved.
Learning Objectives
LO1 Define a nonparametric test and when it is applied
LO2 Conduct the sign test for dependent samples using the binomial
and standard normal distributions as the test statistics.
LO3 Conduct a test of hypothesis for dependent samples using the
Wilcoxon signed-rank test.
LO4 Conduct and interpret the Wilcoxon rank-sum test for independent
samples.
LO5 Conduct and interpret the Kruskal-Wallis test for several
independent samples.
LO6 Compute and interpret Spearman’s coefficient of rank correlation.
LO7 Conduct a test of hypothesis to determine whether the correlation
among the ranks in the population is different from zero.

1. Sign Test
• The sign test is based on the sign of a
difference between two related observations.
• No assumption about the shape of the
population of differences is necessary.
• The number of plus (or minus) signs is the test statistic. It is
referred to the binomial distribution for small samples and to the
standard normal (z) distribution for large samples.
• The test requires dependent (related) samples.

Procedure to conduct the test:
• Determine the sign (+ or -) of the difference
between pairs.
• Determine the number of usable pairs.
• Compare the number of positive (or negative)
differences to the critical value.
• Here n is the number of usable pairs (pairs with no tie), X
is the number of pluses or minuses, and the
binomial probability is π = .5.
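The procedure above can be sketched in a few lines of Python. This is only an illustration: the numeric coding of the rating scale and the five pairs of ratings are hypothetical and are not the data from any example in these slides.

```python
# Hypothetical numeric coding of an ordinal rating scale (an assumption,
# used only so that "after" can be compared with "before").
SCALE = {"poor": 1, "fair": 2, "good": 3, "excellent": 4, "outstanding": 5}


def sign_counts(before, after):
    """Return (pluses, minuses, ties) for paired ordinal ratings."""
    pluses = minuses = ties = 0
    for b, a in zip(before, after):
        diff = SCALE[a] - SCALE[b]
        if diff > 0:
            pluses += 1      # "+" sign: rating improved
        elif diff < 0:
            minuses += 1     # "-" sign: rating declined
        else:
            ties += 1        # no change: the pair is not usable
    return pluses, minuses, ties


# Hypothetical ratings, for illustration only.
before = ["good", "fair", "excellent", "poor", "good"]
after = ["excellent", "good", "good", "fair", "good"]

pluses, minuses, ties = sign_counts(before, after)
n_usable = pluses + minuses  # n counts only the pairs without ties
print(pluses, minuses, ties, n_usable)
```

Only the direction of each difference is used, which is why the test needs no assumption about the shape of the population of differences.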
example
The director of information systems at
Samuelson Chemicals recommended that
an in-plant training program be instituted
for managers. The objective is to improve
the knowledge of database usage in
accounting, procurement, production, and
so on. A sample of 15 managers was
selected at random. A panel of database
experts determined the general level of
competence of each manager with respect
to using the database. Their competence
and understanding were rated as being
either outstanding, excellent, good, fair, or
poor. After the three-month training
program, the same panel of information
systems experts rated each manager
again. The two ratings (before and after)
are shown along with the sign of the
difference. A “+” sign indicates
improvement, and a “-” sign indicates that
the manager’s competence using
databases had declined after the training
program.

Did the in-plant training program effectively
increase the competence of the managers
using the company’s database?
answer
Step 1: State the Null and Alternative Hypotheses
H0: π ≤ .5 (There is no increase in competence as a result of the in-plant
training program.)
H1: π > .5 (There is an increase in competence as a result of the in-plant
training program.)

Step 2: Select a level of significance.
We chose the .10 level.

Step 3: Decide on the test statistic.
It is the number of plus signs resulting from the experiment.

Step 4: Formulate a decision rule. In this example α is .10.
• H1 is one-tailed (π > .5), so the rejection region lies in the upper tail. With n = 14 usable
pairs and π = .5, the binomial probability of 10 or more plus signs is about .090, the closest
we can come to .10 without exceeding it. The decision rule is to reject the null hypothesis if
there are 10 or more plus signs.
• For comparison, consider a two-tailed test. The probability of 3 or fewer successes is .029,
found by .000 + .001 + .006 + .022.
• The probability of 11 or more successes is also .029. Adding the two probabilities
gives .058. This is the closest we can come to .10 without exceeding it.
• Hence, the decision rule for a two-tailed test would be to reject the null hypothesis
if there are 3 or fewer plus signs, or 11 or more plus signs.
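The tail probabilities quoted in Step 4 can be checked directly from the binomial distribution. The sketch below assumes n = 14 usable pairs (the count used in Step 5) and π = .5; the function names are illustrative, not from the text.

```python
from math import comb


def binom_pmf(k, n, p=0.5):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(k, n, p=0.5):
    """Probability of k or more successes."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


def lower_tail(k, n, p=0.5):
    """Probability of k or fewer successes."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))


n = 14  # assumed number of usable pairs, taken from Step 5

p_le_3 = lower_tail(3, n)    # 3 or fewer plus signs
p_ge_11 = upper_tail(11, n)  # 11 or more plus signs
p_ge_10 = upper_tail(10, n)  # 10 or more plus signs
p_ge_9 = upper_tail(9, n)    # 9 or more plus signs

print(round(p_le_3, 3), round(p_ge_11, 3))  # each tail rounds to .029
print(round(p_ge_10, 3), round(p_ge_9, 3))  # .090 is below .10, .212 is not
```

The exact two-tailed total is slightly smaller than the .058 obtained by adding table values that were already rounded to three decimals. Because the probability of 9 or more plus signs exceeds .10, an upper-tail rejection region at the .10 level cannot start below 10.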
Step 5: Make a decision regarding the null
hypothesis.
Eleven out of the 14 managers in the
training course increased their database
competency. The number 11 is in the
rejection region, which starts at 10, so the
null hypothesis is rejected.

We conclude that the three-month training
course was effective. It increased the
database competency of the managers.
Normal Approximation
• If the number of observations in the sample is larger than 10, the normal distribution can be used to
approximate the binomial.

EXAMPLE
The market research department of Cola, Inc., has been given the assignment of testing a new soft drink. Two versions of the
drink are considered: a rather sweet drink and a somewhat bitter one. A preference test is to be conducted consisting of a
sample of 64 consumers. Each consumer will taste both the sweet cola (labeled A) and the bitter one (labeled B) and
indicate a preference. Conduct a test of hypothesis to determine if there is a difference in the preference for the sweet
and bitter tastes. Use the .05 significance level.

Step 1: State the null hypothesis and the alternate hypothesis.
H0: π = .50 There is no preference
H1: π ≠ .50 There is a preference

Step 2: Select the level of significance.
α = 0.05, as stated in the problem.

Step 3: Select the test statistic.
Use the z distribution, where µ = .50n and σ = .50√n.

Step 4: Formulate the decision rule.
Referring to Appendix B.1, Areas under the Normal Curve, for a two-tailed test (because H1 states that π ≠ .50), at the .05
significance level, the critical values are -1.96 and +1.96.

Step 5: Compute z, compare the computed value with the critical value, and make a decision regarding H0.
Preference for cola A was given a “+” sign and preference for B a “-” sign. Out of the 64 in the sample, 42 preferred the
sweet taste, which is cola A. Therefore, there are 42 pluses. Since 42 is more than n/2 = 64/2 = 32, we use:

$$z = \frac{(X - .50) - .50n}{.50\sqrt{n}} = \frac{(42 - .50) - .50(64)}{.50\sqrt{64}} = 2.38$$

The computed z of 2.38 is beyond the critical value of 1.96.
Conclusion: The null hypothesis of no difference is rejected at the .05 significance level. There is evidence of a difference
in consumer preference. That is, we conclude consumers prefer one cola over another.
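The z computation for the cola example can be reproduced with a short sketch. The slide gives the formula only for the case where X is more than n/2. Adding .50 instead of subtracting it when X is less than n/2 is the usual continuity correction and is included here as an assumption, as is the function name.

```python
from math import sqrt


def sign_test_z(x, n):
    """Normal approximation to the sign test, with continuity correction."""
    mu = 0.50 * n            # mean of the binomial when pi = .50
    sigma = 0.50 * sqrt(n)   # standard deviation when pi = .50
    # Subtract .50 when x is above n/2 (the case shown in the example);
    # adding .50 when x is below n/2 is an assumption, not from the slide.
    correction = -0.50 if x > n / 2 else 0.50
    return ((x + correction) - mu) / sigma


z = sign_test_z(42, 64)   # 42 pluses out of 64 consumers
reject = abs(z) > 1.96    # two-tailed test at the .05 level
print(z, reject)
```

With 42 pluses out of 64, the numerator is 41.5 - 32 = 9.5 and the denominator is 4, which gives the 2.38 (rounded) reported in the example.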
2a. Wilcoxon Signed-Rank Test for Dependent
Samples
• If the assumption of normality is violated for the paired-t test, use the Wilcoxon signed-rank test.
• The test requires the ordinal scale of measurement.
• The observations must be related or dependent.

The steps for the test are:
1. Compute the differences between related observations. Drop observations with 0 difference from the sample.
2. Rank the absolute differences from low to high.
3. Return the signs to the ranks and sum positive and negative ranks.
4. Compare the smaller of the two rank sums with the T value, obtained from Appendix B.7.
example
Fricker’s is a family restaurant chain located
primarily in the southeastern part of the United
States. It offers a full dinner menu, but its
specialty is chicken. Recently, Fricker, the
owner and founder, developed a new spicy
flavor for the batter in which the chicken is
cooked. Before replacing the current flavor, he
wants to conduct some tests to be sure that
patrons will like the spicy flavor better. Bernie
selects a random sample of 15 customers;
each customer is given a small piece of the
current chicken and asked to rate its overall
taste on a scale of 1 to 20. A value near 20
indicates the participant liked the flavor,
whereas a score near 0 indicates they did not
like the flavor. Next, the same 15 participants
are given a sample of the new chicken with
the spicier flavor and again asked to rate its
taste on a scale of 1 to 20.
The results are reported in the table below. Is
it reasonable to conclude that the spicy flavor
is preferred? Use the .05 significance level.

The steps
1. Compute the difference between the spicy score
and the current score.
2. Only the + and – differences are considered
further; differences of 0 are dropped.
3. Determine the absolute differences (col 4).
4. Rank the absolute differences from smallest to largest.
5. Separate the ranks into R+ (positive differences) and R- (negative differences).
6. Total R+ and R-. The smaller of the two
rank sums is used as the test statistic and
referred to as T.
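The six steps can be sketched in Python as follows. The scores below are hypothetical, chosen only to exercise the steps, and are not the data from the chicken example. Giving tied absolute differences the average of their ranks is an assumption here, borrowed from the tie rule used later in the Kruskal-Wallis example.

```python
def average_ranks(values):
    """Rank values from low to high; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_signed_rank(first, second):
    """Return (R+, R-, T) for paired scores, dropping zero differences."""
    diffs = [s - f for f, s in zip(first, second) if s != f]
    ranks = average_ranks([abs(d) for d in diffs])
    r_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return r_plus, r_minus, min(r_plus, r_minus)


# Hypothetical taste scores, for illustration only.
current = [10, 12, 15, 9, 14, 11]
spicy = [13, 11, 15, 14, 16, 15]

r_plus, r_minus, t = wilcoxon_signed_rank(current, spicy)
print(r_plus, r_minus, t)
```

A quick sanity check is that R+ and R- always add up to n(n + 1)/2, where n is the number of nonzero differences.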
Wilcoxon Signed-Rank Test for Dependent
Samples - Example

• The critical values for the Wilcoxon signed-rank test are
located in Appendix B.7. A portion of that table is shown
in the table below.

Hypothesis:
H0: There is no difference in the ratings of the two flavors.
H1: The spicy ratings are higher.
Decision Rule:
Reject H0 if the smaller of the rank sums is 25 or less.
Computed T = 30
Critical T = 25
Decision: Because 30 is greater than 25, the null hypothesis is not rejected.
We cannot conclude there is a difference in the flavor ratings
between the current and the spicy.
2b. Wilcoxon Rank-Sum Test for Independent Samples

• The Wilcoxon rank-sum test is used to determine if two independent
samples came from the same or equal populations.
• No assumption about the shape of the population is required.
• The data must be at least ordinal scale.
• Each sample must contain at least eight observations.
• To determine the value of the test statistic W, all data values are ranked
from low to high as if they were from a single population.
• The sum of ranks for each of the two samples is determined.
• If the null hypothesis is true, then the ranks will be about evenly
distributed between the two samples, and the sum of the ranks for the
two samples will be about the same.

example
Dan Thompson, the president of
CEO Airlines, recently noted an
increase in the number of no-
shows for flights out of Atlanta.
He is particularly interested in
determining whether there are
more no-shows for flights that
originate from Atlanta compared
with flights leaving Chicago. A
sample of nine flights from
Atlanta and eight from Chicago
are reported in the table.
At the .05 significance level, can
we conclude that there are
more no-shows for the flights
originating in Atlanta?

Set up Hypothesis and Decision Rule:
Hypothesis:
H0: The population distribution of no-shows is the same or less for Atlanta than for Chicago.
H1: The population distribution of no-shows is larger for Atlanta than for Chicago.
Decision Rule: Reject H0 if the computed z is greater than the critical z.
At the .05 level of significance (one-tailed), the critical z is 1.65.
Rank the observations from both samples as if they were a single group.

The Chicago flight with only 8 no-shows had the fewest, so it is assigned a rank of 1. The
Chicago flight with 9 no-shows is ranked 2, and so on.

The value of W is calculated for the Atlanta group and is
found to be 96.5, which is the sum of the ranks for the no-shows
for the Atlanta flights.

Because the computed z value (1.49) is less than 1.65, the null hypothesis is not
rejected. We cannot conclude that there are more no-shows for the flights originating
in Atlanta than for those leaving Chicago.
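The slide reports z = 1.49 without showing the formula. A sketch using the standard large-sample form of the rank-sum statistic reproduces that value from W = 96.5 with nine Atlanta and eight Chicago flights. The formula and the function name are assumptions here, since neither appears in the text above.

```python
from math import sqrt


def rank_sum_z(w, n1, n2):
    """z for the Wilcoxon rank-sum test (standard large-sample form).

    w  : sum of ranks for the first sample
    n1 : number of observations in the first sample
    n2 : number of observations in the second sample
    """
    mean_w = n1 * (n1 + n2 + 1) / 2             # expected rank sum under H0
    sd_w = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # standard error of W under H0
    return (w - mean_w) / sd_w


z = rank_sum_z(96.5, 9, 8)  # Atlanta: W = 96.5, n1 = 9; Chicago: n2 = 8
reject = z > 1.65           # one-tailed test at the .05 level
print(round(z, 2), reject)
```

Under the null hypothesis the expected rank sum for Atlanta is 81, so the observed 96.5 is about 1.49 standard errors above it, short of the 1.65 needed to reject.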
3. Kruskal-Wallis Test:
Analysis of Variance by Ranks
• This is used to compare three or more samples to determine if they came from
equal populations.
• The ordinal scale of measurement is required.
• It is an alternative to the one-way ANOVA.
• The test statistic, H, is compared with the chi-square distribution.
• Each sample should have at least five observations.
• The sample data is ranked from low to high as if it were a single
group.

EXAMPLE
A management seminar consists of executives from manufacturing, finance,
and engineering. Before scheduling the seminar sessions, the seminar leader
is interested in whether the three groups are equally knowledgeable about
management principles. Plans are to take samples of the executives in
manufacturing, in finance, and in engineering and to administer a test to each
executive. If there is no difference in the scores for the three distributions, the
seminar leader will conduct just one session. However, if there is a difference
in the scores, separate sessions will be given. We will use the Kruskal-Wallis
test instead of ANOVA because the seminar leader is unwilling to assume that
(1) the populations of management scores follow the normal distribution or (2)
the population standard deviations are the same.

Step 1: Set up the Null and Alternate Hypotheses
H0: The population distributions of the management scores for
the populations of executives in manufacturing, finance, and
engineering are the same.
H1: The population distributions of the management scores for
the populations of executives in manufacturing, finance, and
engineering are NOT the same.

Step 2: State the Decision Rule
H0 is rejected if the computed H statistic is greater than the critical χ2
value of 5.991. (There are 2 degrees of freedom at the .05 significance
level.)

Kruskal-Wallis Test:
Analysis of Variance by Ranks - Example

Step 3: Collect Data and Compute the Chi-square Statistic
Considering the scores as a single population, the engineering executive with a score of 35 is the lowest, so it is
ranked 1. There are two scores of 38. To resolve this tie, each score is given a rank of 2.5, found by (2+3)/2. This
process is continued for all scores.
The scores, the ranks, and the sum of the ranks for each of the three samples are given in the table below.

Because the computed value of H (5.736) is less than the critical value of 5.991, the null hypothesis is
not rejected.

There is not enough evidence to conclude there is a difference among the executives from
manufacturing, finance, and engineering with respect to their typical knowledge of
management principles.
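The ranking and the H statistic can be sketched as below. The scores are hypothetical because the data table did not survive in these slides. The sketch assumes the standard Kruskal-Wallis formula without a tie correction, and it uses the closed form for the chi-square critical value that holds only for 2 degrees of freedom.

```python
from math import log


def average_ranks(values):
    """Rank values from low to high; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = ((i + 1) + (j + 1)) / 2   # e.g. (2 + 3) / 2 = 2.5
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(samples):
    """H = 12 / (n(n+1)) * sum(R_j^2 / n_j) - 3(n+1), with no tie correction."""
    pooled = [x for s in samples for x in s]
    ranks = average_ranks(pooled)   # rank everything as a single group
    n = len(pooled)
    total, start = 0.0, 0
    for s in samples:
        rank_sum = sum(ranks[start:start + len(s)])
        total += rank_sum**2 / len(s)
        start += len(s)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


def chi2_critical_df2(alpha):
    """Upper-tail chi-square critical value; valid for 2 degrees of freedom only."""
    return -2 * log(alpha)


# Hypothetical scores for three groups of five, for illustration only.
groups = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]]
h = kruskal_wallis_h(groups)
critical = chi2_critical_df2(0.05)
print(h, round(critical, 3))
```

The closed form gives the 5.991 used in the decision rule, and the example's computed H of 5.736 falls below it.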
4. Rank-Order Correlation
Spearman’s coefficient of rank correlation reports the association
between two sets of ranked observations. The features are:
• It can range from –1.00 up to 1.00.
• It is similar to Pearson’s coefficient of
correlation, but is based on ranked data.
• It is computed using the formula:

$$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$

where d is the difference between the ranks for each pair and n is the number of paired observations.

EXAMPLE
Lorrenger Plastics, Inc., recruits management trainees at
colleges and universities throughout the United States.
Each trainee is given a rating by the recruiter during the
on-campus interview. This rating is an expression of future
potential and may range from 0 to 15, with the higher
score indicating more potential. The recent college
graduate then enters an in-plant training program and is
given another composite rating based on tests, opinions of
group leaders, training officers, and so on. The on-campus
rating and the in-plant training ratings are given in the table
on the right.
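A minimal sketch of the rank correlation computation follows, using the standard formula r_s = 1 - 6Σd² / (n(n² - 1)). The ratings are hypothetical because the Lorrenger Plastics table is not reproduced here, and the simple ranking helper assumes there are no tied ratings.

```python
def simple_ranks(values):
    """Rank values from low to high. Assumes all values are distinct."""
    position = {v: i + 1 for i, v in enumerate(sorted(values))}
    return [position[v] for v in values]


def spearman_rs(x, y):
    """Spearman's coefficient of rank correlation for untied data."""
    rx, ry = simple_ranks(x), simple_ranks(y)
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))  # sum of squared rank differences
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


# Hypothetical ratings for five trainees, for illustration only.
on_campus = [1, 2, 3, 4, 5]
in_plant = [2, 1, 4, 3, 5]

rs = spearman_rs(on_campus, in_plant)
print(rs)
```

Identical rankings give 1.00 and exactly reversed rankings give -1.00, which matches the stated range of the coefficient.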
Conclusion:

The value of .726 indicates a strong positive association between the ratings of the on-campus recruiter and the ratings of the
training staff.

The graduates that received high ratings from the on-campus recruiter also tended to be the ones that received high ratings
from the training staff.
[Figure: scatter diagram of on-campus and in-plant scores; horizontal axis: on campus, vertical axis: in plant]
