
Math204 NonParOneTwo

1. The document describes non-parametric statistical tests for one and two samples, including the sign test for a single sample and the Mann-Whitney-Wilcoxon test for two independent samples. 2. The sign test uses the number of observations above or below the hypothesized median to calculate a test statistic and p-value, while the Mann-Whitney-Wilcoxon test ranks the combined samples and uses the sum of ranks for one sample as its test statistic. 3. Both tests can be approximated by z-tests for large samples, while exact significance levels are given in tables for small samples.

Uploaded by

Neeti Joshi
Copyright
© © All Rights Reserved

NON-PARAMETRIC STATISTICS: ONE AND TWO SAMPLE TESTS

Non-parametric tests are normally based on ranks of the data samples, and test hypotheses relating to
quantiles of the probability distribution representing the population from which the data are drawn.
Specifically, tests concern the population median, η, where

$$\Pr[\,\text{Observation} \le \eta\,] = \frac{1}{2}$$
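The sample median, $x_{\mathrm{MED}}$, is the mid-point of the sorted sample; if the data $x_1, \ldots, x_n$ are sorted into
ascending order, then

$$x_{\mathrm{MED}} = \begin{cases} x_{m+1} & n \text{ odd},\ n = 2m + 1 \\[4pt] \dfrac{x_m + x_{m+1}}{2} & n \text{ even},\ n = 2m \end{cases}$$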
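The definition of the sample median can be sketched in Python as follows. This is only an illustration: the function name `sample_median` and the data values are hypothetical, and Python lists are indexed from 0, so the middle value of an odd-sized sample sits at index n // 2.

```python
def sample_median(data):
    """Return the sample median of a non-empty sequence of numbers."""
    xs = sorted(data)  # sort into ascending order
    n = len(xs)
    if n == 0:
        raise ValueError("sample_median requires at least one observation")
    mid = n // 2
    if n % 2 == 1:
        # n odd: the single middle value of the sorted sample
        return xs[mid]
    # n even: the average of the two middle values
    return (xs[mid - 1] + xs[mid]) / 2


# Hypothetical data, for illustration only.
print(sample_median([3, 1, 2]))     # odd n
print(sample_median([4, 1, 3, 2]))  # even n
```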

1. ONE SAMPLE TEST FOR MEDIAN: THE SIGN TEST


For a single sample of size n, to test the hypothesis η = η0 for some specified value η0 we use the Sign
Test. The test statistic S depends on the alternative hypothesis, Ha .

(a) For one-sided tests, to test

H0 : η = η0
Ha : η > η0

we define test statistic S by

S = Number of observations greater than η0

whereas to test

H0 : η = η0
Ha : η < η0

we define S by
S = Number of observations less than η0
If H0 is true, it follows that

$$S \sim \text{Binomial}\left(n, \frac{1}{2}\right)$$
The p-value is defined by

p = Pr[X ≥ S]

where X ∼ Binomial(n, 1/2). The rejection region for significance level α is defined implicitly by
the rule
Reject H0 if α ≥ p.
The Binomial distribution is tabulated on pp 885-888 of McClave and Sincich.

(b) For a two-sided test,

H0 : η = η0
Ha : η ≠ η0

we define the test statistic by


S = max{S1 , S2 }
where S1 and S2 are the counts of the number of observations less than, and greater than, η0
respectively. The p-value is defined by

p = 2 Pr[X ≥ S]

where X ∼ Binomial(n, 1/2).
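The exact p-values in (a) and (b) can be computed directly rather than read from tables. The following is a minimal sketch using only the Python standard library; the names `binom_upper_tail` and `sign_test` and the data are hypothetical. Two choices here are assumptions of the sketch rather than part of the definitions above: observations equal to η0 are dropped before counting, and the doubled two-sided p-value is capped at 1.

```python
from math import comb


def binom_upper_tail(s, n):
    """Pr[X >= s] for X ~ Binomial(n, 1/2)."""
    return sum(comb(n, k) for k in range(s, n + 1)) / 2 ** n


def sign_test(data, eta0, alternative="greater"):
    """Exact sign test of H0: median = eta0. Returns (S, p-value).

    alternative is one of "greater", "less" or "two-sided".
    """
    # Assumption of this sketch: values equal to eta0 are discarded.
    kept = [x for x in data if x != eta0]
    n = len(kept)
    above = sum(x > eta0 for x in kept)  # observations greater than eta0
    below = n - above                    # observations less than eta0
    if alternative == "greater":
        s = above
        p = binom_upper_tail(s, n)
    elif alternative == "less":
        s = below
        p = binom_upper_tail(s, n)
    elif alternative == "two-sided":
        s = max(above, below)
        # Assumption: cap at 1, since doubling can exceed 1 when S is near n/2.
        p = min(1.0, 2 * binom_upper_tail(s, n))
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return s, p


# Hypothetical data: 8 of the 10 values lie above eta0 = 5.
data = [6.1, 7.3, 5.8, 4.2, 6.6, 9.0, 5.5, 3.9, 7.7, 6.4]
print(sign_test(data, 5.0, "greater"))    # S = 8, p = 56/1024
print(sign_test(data, 5.0, "two-sided"))  # S = 8, p = 112/1024
```

With these illustrative data the one-sided test does not reject H0 at α = 0.05, since p = 56/1024 ≈ 0.0547 exceeds α.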

Notes :
1. The only assumption behind the test is that the data are drawn independently from a continuous
distribution.

2. If any data are equal to η0 , we discard them before carrying out the test.

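3. Large sample approximation. If n is large (say n ≥ 30), and X ∼ Binomial(n, 1/2), then it can be
shown that X is approximately Normally distributed,

$$X \mathrel{\dot{\sim}} \text{Normal}(np,\ np(1 - p))$$

Thus for the sign test, where p = 1/2, we can use the test statistic

$$Z = \frac{S - \dfrac{n}{2}}{\sqrt{n \times \dfrac{1}{2} \times \dfrac{1}{2}}} = \frac{S - \dfrac{n}{2}}{\sqrt{n} \times \dfrac{1}{2}}$$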

and note that if H0 is true,

$$Z \mathrel{\dot{\sim}} \text{Normal}(0, 1)$$

so that, taking S to be the number of observations greater than η0 in each case, the test at α = 0.05
uses the following critical values

Ha : η > η0 then CR = 1.645
Ha : η < η0 then CR = −1.645
Ha : η ≠ η0 then CR = ±1.960

4. For the large sample approximation, it is common to make a continuity correction, where we
replace S by S − 1/2 in the definition of Z

$$Z = \frac{\left(S - \dfrac{1}{2}\right) - \dfrac{n}{2}}{\sqrt{n} \times \dfrac{1}{2}}$$
Tables of the standard Normal distribution are given on p 894 of McClave and Sincich.
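As an alternative to the tables, the large-sample statistic from Notes 3 and 4 can be evaluated numerically. The sketch below uses hypothetical names (`sign_test_z`, `normal_upper_tail`) and hypothetical values of S and n; the Normal tail probability is obtained from the standard library's complementary error function.

```python
from math import erfc, sqrt


def sign_test_z(s, n, continuity=True):
    """Large-sample Z statistic for the sign test.

    s is the sign-test count S and n the sample size. With
    continuity=True, S is replaced by S - 1/2 as in Note 4.
    """
    if continuity:
        s = s - 0.5
    return (s - n / 2) / (sqrt(n) * 0.5)


def normal_upper_tail(z):
    """Pr[Z >= z] for Z ~ Normal(0, 1)."""
    return 0.5 * erfc(z / sqrt(2))


# Hypothetical example: n = 40 observations, S = 27 of them above eta0.
z = sign_test_z(27, 40)
print(z, normal_upper_tail(z))
print(z >= 1.645)  # one-sided decision for Ha: eta > eta0 at alpha = 0.05
```

For these illustrative values Z is a little above 2, so the one-sided test of Ha : η > η0 rejects H0 at α = 0.05.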

2. TWO SAMPLE TESTS FOR INDEPENDENT SAMPLES: THE MANN-WHITNEY-WILCOXON TEST
For two independent samples of sizes n1 and n2 , to test the hypothesis of equal population medians

η1 = η2

we use the Wilcoxon Rank Sum Test, or an equivalent test, the Mann-Whitney U Test; we refer to this
as the Mann-Whitney-Wilcoxon (MWW) Test.
By convention it is usual to formulate the test statistic in terms of the smaller sample size. Without
loss of generality, we label the samples such that

n1 > n2 .

The test is based on the sum of the ranks for the data from sample 2.
EXAMPLE : n1 = 4, n2 = 3 yields the following ranked data

SAMPLE 1   0.31  0.48  1.02  3.11
SAMPLE 2   0.16  0.20  1.97

SAMPLE     2     2     1     1     1     2     1
VALUE      0.16  0.20  0.31  0.48  1.02  1.97  3.11
RANK       1     2     3     4     5     6     7

Thus the rank sum for sample 1 is


R1 = 3 + 4 + 5 + 7 = 19
and the rank sum for sample 2 is
R2 = 1 + 2 + 6 = 9.
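The ranking step in this example can be reproduced with a short Python sketch. The function name `rank_sums` is hypothetical, and the sketch assumes there are no tied values, as is the case for these data.

```python
def rank_sums(sample1, sample2):
    """Return (R1, R2), the rank sums of two samples in the pooled ranking.

    Assumes no two values in the pooled data are equal.
    """
    # Tag each observation with its sample label, then sort the pooled data.
    pooled = [(x, 1) for x in sample1] + [(x, 2) for x in sample2]
    pooled.sort(key=lambda pair: pair[0])
    r1 = r2 = 0
    for rank, (_, label) in enumerate(pooled, start=1):
        if label == 1:
            r1 += rank
        else:
            r2 += rank
    return r1, r2


sample1 = [0.31, 0.48, 1.02, 3.11]
sample2 = [0.16, 0.20, 1.97]
r1, r2 = rank_sums(sample1, sample2)
print(r1, r2)  # 19 9

# Sanity check: the ranks 1..N always sum to N(N + 1)/2.
n_total = len(sample1) + len(sample2)
print(r1 + r2 == n_total * (n_total + 1) // 2)  # True
```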
Let η1 and η2 denote the medians from the two distributions from which the samples are drawn. We
wish to test
H0 : η1 = η2
Two related test statistics can be used

• Wilcoxon Rank Sum Statistic

$$W = R_2$$

• Mann-Whitney U Statistic

$$U = R_2 - \frac{n_2 (n_2 + 1)}{2}$$
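As a worked check on the example data, where R2 = 9 and n2 = 3, the two statistics evaluate to

```latex
W = R_2 = 9,
\qquad
U = R_2 - \frac{n_2 (n_2 + 1)}{2} = 9 - \frac{3 \times 4}{2} = 9 - 6 = 3 .
```

The value U = 3 also equals the number of (sample 1, sample 2) pairs in which the sample 2 value is the larger one: here only 1.97 exceeds any sample 1 values, namely 0.31, 0.48 and 1.02.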
We again consider three alternative hypotheses:

Ha : η1 < η2
Ha : η1 > η2
Ha : η1 ≠ η2

and define the rejection region separately in each case.

Large Sample Test
If n2 ≥ 10, a large sample test based on the Z statistic

$$Z = \frac{U - \dfrac{n_1 n_2}{2}}{\sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}}$$

can be used. Under the hypothesis H0 : η1 = η2 ,

$$Z \mathrel{\dot{\sim}} \text{Normal}(0, 1)$$

so that the test at α = 0.05 uses the following critical values

Ha : η1 > η2 then CR = −1.645
Ha : η1 < η2 then CR = 1.645
Ha : η1 ≠ η2 then CR = ±1.960
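The large-sample statistic can be sketched in Python as below. The function name `mww_z` and the input values are hypothetical and chosen only to illustrate the arithmetic; U is the Mann-Whitney statistic computed from the sample 2 rank sum.

```python
from math import sqrt


def mww_z(u, n1, n2):
    """Large-sample Z statistic for the Mann-Whitney-Wilcoxon test.

    u is the Mann-Whitney U statistic based on sample 2;
    n1 and n2 are the two sample sizes.
    """
    mean_u = n1 * n2 / 2                       # mean of U under H0
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # standard deviation of U under H0
    return (u - mean_u) / sd_u


# Hypothetical values: n1 = 12, n2 = 10 and U = 30.
z = mww_z(30, 12, 10)
print(z)
print(abs(z) >= 1.960)  # two-sided decision at alpha = 0.05
print(z <= -1.645)      # one-sided decision for Ha: eta1 > eta2
```

For these illustrative inputs Z is close to −1.98, so both the two-sided test and the one-sided test of Ha : η1 > η2 reject H0 at α = 0.05.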

Small Sample Test


If n1 < 10, an exact but more complicated test can be used. The test statistic is R2 (the sum of the ranks
for sample 2). The null distribution under the hypothesis H0 : η1 = η2 can be computed, but it is
complicated.
The table on p. 832 of McClave and Sincich gives the critical values (TL and TU ) that determine the
rejection region for different n1 and n2 values up to 10.

• One-sided tests:

Ha : η1 > η2 Rejection Region is R2 ≤ TL


Ha : η1 < η2 Rejection Region is R2 ≥ TU

These are tests at the α = 0.025 significance level.


• Two-sided tests:

Ha : η1 ≠ η2 Rejection Region is R2 ≤ TL or R2 ≥ TU

This is a test at the α = 0.05 significance level.

Notes :
1. The only assumption needed for the test to be valid is that the samples are independently
drawn from two continuous distributions.

2. The sum of the ranks across both samples is

$$R_1 + R_2 = \frac{(n_1 + n_2)(n_1 + n_2 + 1)}{2}$$
3. If there are ties (equal values) in the data, then the rank values are replaced by average rank
values.
DATA VALUE     0.16  0.20  0.31  0.31  0.48  1.97  3.11
ACTUAL RANK    1     2     3     3     5     6     7
AVERAGE RANK   1     2     3.5   3.5   5     6     7
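The average-rank rule for ties can be sketched in Python as follows. The function name `average_ranks` is hypothetical; each run of equal values receives the mean of the rank positions it occupies in the sorted data.

```python
def average_ranks(values):
    """Return the rank of each value, in input order, averaging over ties."""
    # Indices of the values in ascending order of value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Sorted positions start..end carry ranks start+1..end+1; average them.
        avg = (start + 1 + end + 1) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = avg
        start = end + 1
    return ranks


data = [0.16, 0.20, 0.31, 0.31, 0.48, 1.97, 3.11]
print(average_ranks(data))  # [1.0, 2.0, 3.5, 3.5, 5.0, 6.0, 7.0]
```

Because the two tied values share ranks 3 and 4, each receives 3.5, and the ranks still sum to 7 × 8 / 2 = 28.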
