
STAT 425: Introduction to Nonparametric Statistics Winter 2018

Lecture 3: Permutation, Rank Correlation, and Dependence Test


Instructor: Yen-Chi Chen

3.1 Permutation Test

The permutation test is a powerful method for conducting a two-sample test. The idea is very simple: given
a test statistic, we compute its distribution under H0 : PX = PY by permuting the two samples. Let
SX = {X1 , · · · , Xn } and SY = {Y1 , · · · , Ym } be the two samples, and let T (SX , SY ) be a test statistic (for
example, the difference between the two sample means). Assume for simplicity that under H0 the test statistic
T (SX , SY ) has mean 0.

To see the distribution of T (SX , SY ) under H0 , we generate new samples S∗X and S∗Y by the following
permutation procedure:

1. Pool both samples together to form a joint sample SXY = {X1 , · · · , Xn , Y1 , · · · , Ym }.


2. Randomly permute the elements in SXY and split them into two samples S∗X and S∗Y with
n and m observations, respectively. Note that the new S∗Y may contain several observations that were
originally in SX .


3. Treat S∗X and S∗Y as the original samples and compute the test statistic T (S∗X , S∗Y ).

4. Repeat steps 2 and 3 many times, recording the value of the test statistic each time.

After repeating the above permutation procedure B times, we obtain B values of the test statistic T∗(1) , · · · , T∗(B) .
Then we compute the p-value of testing H0 as

$$\text{p-value} = \frac{1}{B}\sum_{\ell=1}^{B} I\left(|T^{*(\ell)}| \geq |T(S_X, S_Y)|\right)$$

(taking absolute values makes the test two-sided).

Example 1. Consider the example we visited in Lecture 1:

$$S_X = \{-2, -1, -1, 0, -2, -1, 0, -1, -2, 100\}, \qquad S_Y = \{7, 13, 11, 5, 14, 9, 8, 10, 12, 11\}. \tag{3.1}$$

Now we assume that our test statistic is the median difference, i.e.,

$$T(S_X, S_Y) = \mathrm{med}(S_X) - \mathrm{med}(S_Y) = -1 - 10.5 = -11.5.$$

The following picture shows the permutation distribution of T (SX , SY ) based on 10000 permutations:


[Histogram of the 10000 permuted test statistic values, with two red vertical lines at T (SX , SY ) = −11.5 and −T (SX , SY ) = 11.5.]

The two red vertical lines show the observed value of T (SX , SY ) and its negation −T (SX , SY ). The histogram displays
the distribution of T (SX , SY ) under H0 based on permutation. To compute the p-value, we need to count
the number of cases where |T ∗ | is greater than or equal to |T (SX , SY )|. This is the same as counting the number
of T ∗ outside the two vertical red lines. There are 189 such cases in total, so the p-value
is 189/10000 = 0.0189.
Why does the permutation test work? It works because under H0 the two samples are from the same distribution,
so randomly exchanging elements between the two samples gives us a new dataset from that same
distribution.
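To make the procedure concrete, here is a minimal sketch in Python with NumPy that implements steps 1–4 and reproduces Example 1 with the median-difference statistic. The function name permutation_test and the default B = 10000 are our own choices, and since the permutations are random, the computed p-value will be close to, but not exactly, 0.0189.

```python
import numpy as np

def permutation_test(x, y, stat, B=10000, seed=None):
    """Two-sided permutation test; returns the observed statistic and p-value."""
    rng = np.random.default_rng(seed)
    n = len(x)
    observed = stat(x, y)
    pooled = np.concatenate([x, y])            # step 1: pool the two samples
    t_star = np.empty(B)
    for b in range(B):
        perm = rng.permutation(pooled)         # step 2: randomly permute
        t_star[b] = stat(perm[:n], perm[n:])   # step 3: recompute the statistic
    # p-value: fraction of |T*| at least as extreme as the observed |T|
    return observed, np.mean(np.abs(t_star) >= np.abs(observed))

# Example 1: median-difference statistic
SX = np.array([-2, -1, -1, 0, -2, -1, 0, -1, -2, 100])
SY = np.array([7, 13, 11, 5, 14, 9, 8, 10, 12, 11])
obs, p = permutation_test(SX, SY, lambda a, b: np.median(a) - np.median(b))
print(obs, p)   # observed = -11.5; p-value close to 0.0189
```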
Example 2. Here is another example, where the permutation test is applied to two samples with different
sizes. Assume we have two samples

$$S_X = \{-1.8, 0.4, 3.2, -2.3, -0.2, 0.3, 1.4, -0.5, 4.0, -0.3\}, \qquad S_Y = \{0.2, 0.5, -0.2, -0.5, 0.9, -1.2, 0.4, 0.0, 0.5, 0.2, 1.0, -0.6\}. \tag{3.2}$$

This time we choose our test statistic to be the difference between the sample standard deviations:

$$T(S_X, S_Y) = s_X - s_Y = 1.99 - 0.64 = 1.35.$$

After permuting the data 10000 times, we obtain the following distribution:

[Histogram of the 10000 permuted test statistic values, with two red vertical lines at T (SX , SY ) = 1.35 and −T (SX , SY ) = −1.35.]

Again, the two vertical lines indicate the values of T (SX , SY ) and −T (SX , SY ). By counting the number of
permutations leading to |T ∗ | ≥ |T (SX , SY )|, we obtain a p-value of 0.0259.
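Reusing the hypothetical permutation_test sketch from Example 1, the same machinery handles the unequal sample sizes here. Note that matching the values 1.99 and 0.64 requires the sample standard deviation with the n − 1 denominator (ddof=1 in NumPy), and the simulated p-value will again only be close to 0.0259:

```python
import numpy as np

SX = np.array([-1.8, 0.4, 3.2, -2.3, -0.2, 0.3, 1.4, -0.5, 4.0, -0.3])
SY = np.array([0.2, 0.5, -0.2, -0.5, 0.9, -1.2, 0.4, 0.0, 0.5, 0.2, 1.0, -0.6])
obs, p = permutation_test(SX, SY,
                          lambda a, b: np.std(a, ddof=1) - np.std(b, ddof=1))
print(obs, p)   # observed ≈ 1.35; p-value close to 0.0259
```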

3.2 Rank Correlation

In the previous few lectures, we worked on two-sample test problems. Now we move to a new
problem in statistics: analyzing a bivariate random sample. Assume that each of our observations has two
variables, X and Y , and that we have n observations. The goal is to understand how the two variables are associated
with each other. Our data can be described as a bivariate random sample

$$(X_1, Y_1), \cdots, (X_n, Y_n).$$

A common approach to investigating the relationship between X and Y is through their correlation coefficient,
also known as Pearson’s correlation:

$$\hat{r}_{XY} = \frac{s_{XY}^2}{s_X s_Y},$$

where

$$s_{XY}^2 = \frac{1}{n}\sum_{i=1}^n X_i Y_i - \bar{X}_n \bar{Y}_n, \qquad s_X^2 = \frac{1}{n}\sum_{i=1}^n X_i^2 - (\bar{X}_n)^2, \qquad s_Y^2 = \frac{1}{n}\sum_{i=1}^n Y_i^2 - (\bar{Y}_n)^2.$$

The correlation coefficient measures the linear relationship between the two variables. However, similar to
the sample average, the correlation coefficient is vulnerable to outliers.
Example 3. For example, consider the bivariate data

$$(X, Y) = (-5, -5), (-4, -4), \cdots, (3, 3), (4, 4), (5, -20).$$

The scatter plot is as follows:

[Scatter plot of the data in Example 3; the points lie on the line Y = X except for the last observation (5, −20), which sits far below the rest.]
Apparently, the two variables are highly correlated except for the last observation. However, the correlation
coefficient is only about −0.07 because of that single observation.
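As a quick numerical check (a sketch assuming NumPy, not part of the original notes), computing Pearson’s correlation on this data confirms the value above:

```python
import numpy as np

X = np.arange(-5, 6)                     # -5, -4, ..., 4, 5
Y = np.append(np.arange(-5, 5), [-20])   # equal to X except the last point
print(np.corrcoef(X, Y)[0, 1])           # approximately -0.07
```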

3.2.1 Spearman’s ρ

Now we introduce a robust approach to calculating the correlation between two variables. This robust
approach is called the rank correlation and, as you might expect, we will apply a rank transformation. For
the observations’ first variable (X1 , · · · , Xn ), we compute their ranks, denoted R1 , · · · , Rn . Similarly, we
calculate the ranks of the second variable, denoted S1 , · · · , Sn .
Spearman’s ρ (also known as Spearman’s coefficient) is

$$\hat{\rho} = \hat{r}_{RS}.$$

Namely, we use the correlation coefficient of the ranks as a measure of correlation. In the previous
example, Spearman’s ρ is 0.5, which much better reflects the positive association we saw in the scatter plot.
Similar to the rank test or sign test, Spearman’s ρ is robust to outliers. This is a common feature of
all rank-based approaches: they are robust to outliers. There are also median-based approaches to
comparing correlation coefficients using the contingency table method (we will learn about it in the next lecture).
When there are no ties within the X’s and within the Y ’s, there is a simple formula for calculating ρ̂:

$$\hat{\rho} = 1 - \frac{6\sum_{i=1}^n (R_i - S_i)^2}{n^3 - n}. \tag{3.3}$$

To see this, we need to use two facts:

$$s_{XY}^2 = \frac{1}{n}\sum_{i=1}^n X_i Y_i - \bar{X}_n \bar{Y}_n = \frac{1}{n}\sum_{i=1}^n (X_i - \bar{X}_n)(Y_i - \bar{Y}_n) = \frac{1}{2n^2}\sum_{i=1}^n\sum_{j=1}^n (X_i - X_j)(Y_i - Y_j),$$

$$s_X^2 = \frac{1}{n}\sum_{i=1}^n X_i^2 - (\bar{X}_n)^2 = \frac{1}{n}\sum_{i=1}^n (X_i - \bar{X}_n)^2 = \frac{1}{2n^2}\sum_{i=1}^n\sum_{j=1}^n (X_i - X_j)^2.$$

Thus,

$$\hat{\rho} = \frac{s_{RS}^2}{s_R s_S} = \frac{\sum_{i=1}^n\sum_{j=1}^n (R_i - R_j)(S_i - S_j)}{\sqrt{\sum_{i=1}^n\sum_{j=1}^n (R_i - R_j)^2 \cdot \sum_{i=1}^n\sum_{j=1}^n (S_i - S_j)^2}} = \frac{\sum_{i=1}^n\sum_{j=1}^n (R_i - R_j)(S_i - S_j)}{\sum_{i=1}^n\sum_{j=1}^n (R_i - R_j)^2},$$

because $\sum_{i=1}^n\sum_{j=1}^n (R_i - R_j)^2 = \sum_{i=1}^n\sum_{j=1}^n (S_i - S_j)^2$ (both the R’s and the S’s are a permutation of 1, · · · , n). We first expand the numerator:

$$\begin{aligned}
\sum_{i=1}^n\sum_{j=1}^n (R_i - R_j)(S_i - S_j) &= \sum_{i=1}^n\sum_{j=1}^n R_i S_i - \sum_{i=1}^n\sum_{j=1}^n R_j S_i - \sum_{i=1}^n\sum_{j=1}^n R_i S_j + \sum_{i=1}^n\sum_{j=1}^n R_j S_j \\
&= 2n\sum_{i=1}^n R_i S_i - 2\underbrace{\sum_{i=1}^n R_i}_{=n(n+1)/2}\underbrace{\sum_{j=1}^n S_j}_{=n(n+1)/2} \\
&= 2n\sum_{i=1}^n R_i S_i - \frac{n^2(n+1)^2}{2}.
\end{aligned}$$

Now we consider $\sum_{i=1}^n (R_i - S_i)^2$. Since $\sum_{i=1}^n S_i^2 = \sum_{i=1}^n R_i^2$,

$$\sum_{i=1}^n (R_i - S_i)^2 = \sum_{i=1}^n R_i^2 - 2\sum_{i=1}^n R_i S_i + \sum_{i=1}^n S_i^2 = 2\sum_{i=1}^n R_i^2 - 2\sum_{i=1}^n R_i S_i.$$

Combining the above two equations, we obtain

$$\begin{aligned}
\sum_{i=1}^n\sum_{j=1}^n (R_i - R_j)(S_i - S_j) &= 2n\sum_{i=1}^n R_i S_i - \frac{n^2(n+1)^2}{2} \\
&= 2n\underbrace{\sum_{i=1}^n R_i^2}_{=n(n+1)(2n+1)/6} - \, n\sum_{i=1}^n (R_i - S_i)^2 - \frac{n^2(n+1)^2}{2} \\
&= \frac{1}{6} n^2(n^2 - 1) - n\sum_{i=1}^n (R_i - S_i)^2.
\end{aligned}$$

Because

$$\begin{aligned}
\sum_{i=1}^n\sum_{j=1}^n (R_i - R_j)^2 &= 2n\sum_{i=1}^n R_i^2 - 2\sum_{i=1}^n\sum_{j=1}^n R_i R_j \\
&= 2n\underbrace{\sum_{i=1}^n R_i^2}_{=n(n+1)(2n+1)/6} - \, 2\underbrace{\sum_{i=1}^n R_i}_{=n(n+1)/2}\underbrace{\sum_{j=1}^n R_j}_{=n(n+1)/2} \\
&= \frac{1}{6} n^2(n^2 - 1).
\end{aligned}$$
Therefore, we obtain

$$\hat{\rho} = \frac{\sum_{i=1}^n\sum_{j=1}^n (R_i - R_j)(S_i - S_j)}{\sum_{i=1}^n\sum_{j=1}^n (R_i - R_j)^2} = \frac{\frac{1}{6} n^2(n^2 - 1) - n\sum_{i=1}^n (R_i - S_i)^2}{\frac{1}{6} n^2(n^2 - 1)} = 1 - \frac{6\sum_{i=1}^n (R_i - S_i)^2}{n(n^2 - 1)},$$

which is equation (3.3).
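As a sanity check on the derivation, the following sketch computes Spearman’s ρ for the Example 3 data in both ways: as the Pearson correlation of the ranks, and via the closed form (3.3). The small ranks helper is our own and assumes there are no ties:

```python
import numpy as np

def ranks(v):
    """Ranks 1, ..., n of the entries of v (assumes no ties)."""
    return np.argsort(np.argsort(v)) + 1

X = np.arange(-5, 6)
Y = np.append(np.arange(-5, 5), [-20])   # Example 3 data
R, S = ranks(X), ranks(Y)

rho_rank_corr = np.corrcoef(R, S)[0, 1]                   # Pearson of the ranks
n = len(X)
rho_closed = 1 - 6 * np.sum((R - S) ** 2) / (n**3 - n)    # equation (3.3)
print(rho_rank_corr, rho_closed)                          # both equal 0.5
```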

3.2.2 Kendall’s τ

In addition to Spearman’s ρ, there is another nonparametric approach to measuring the correlation
between two variables, called Kendall’s τ . The idea of Kendall’s τ is based on comparing pairs of
observations. For a pair of observations, say the i-th and j-th, we say they are concordant if either
(1) Xi < Xj and Yi < Yj or (2) Xi > Xj and Yi > Yj . Namely, the (i, j) ordering is the same in both
variables. We say the pair is discordant if either (1) Xi < Xj and Yi > Yj or (2) Xi > Xj and Yi < Yj .
Note that if there is an equality, the pair is neither concordant nor discordant.
We have n observations, so there are n(n − 1)/2 distinct pairs. Let nc and nd denote the numbers of concordant
and discordant pairs, respectively. Kendall’s τ is

$$\hat{\tau} = \frac{n_c - n_d}{\frac{1}{2} n(n - 1)}. \tag{3.4}$$

Using the data in Example 3, Kendall’s τ is 35/55 ≈ 0.64 because there are 45 concordant pairs and 10
discordant pairs (please check this; a numerical check is also given after equation (3.5) below).

The intuition behind Kendall’s τ is as follows: when the two variables are positively correlated, there should be more
concordant pairs than discordant pairs. On the other hand, if the two variables are negatively correlated,
more discordant pairs will be observed than concordant pairs.
Note that if we define

$$A_{ij} = \mathrm{sgn}(R_j - R_i), \qquad B_{ij} = \mathrm{sgn}(S_j - S_i),$$

then Kendall’s τ can be written as

$$\hat{\tau} = \frac{\sum_{i,j=1}^n A_{ij} B_{ij}}{\sqrt{\sum_{i,j=1}^n A_{ij}^2 \cdot \sum_{i,j=1}^n B_{ij}^2}}. \tag{3.5}$$

Think about why equations (3.4) and (3.5) are the same.
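The following sketch (again assuming NumPy) verifies the claim in Example 3 by counting concordant and discordant pairs for equation (3.4), and also evaluates the sign-matrix form (3.5) to confirm that the two formulas agree. Since ranks preserve order when there are no ties, sgn(Xj − Xi) equals sgn(Rj − Ri), so we can work with the raw data directly:

```python
import numpy as np
from itertools import combinations

X = np.arange(-5, 6)
Y = np.append(np.arange(-5, 5), [-20])   # Example 3 data
n = len(X)

# Equation (3.4): count concordant and discordant pairs directly.
nc = nd = 0
for i, j in combinations(range(n), 2):
    s = np.sign(X[j] - X[i]) * np.sign(Y[j] - Y[i])
    if s > 0:
        nc += 1   # concordant pair
    elif s < 0:
        nd += 1   # discordant pair
tau_pairs = (nc - nd) / (n * (n - 1) / 2)

# Equation (3.5): A[i, j] = sgn(X[j] - X[i]), B[i, j] = sgn(Y[j] - Y[i]).
A = np.sign(X[None, :] - X[:, None])
B = np.sign(Y[None, :] - Y[:, None])
tau_sign = np.sum(A * B) / np.sqrt(np.sum(A**2) * np.sum(B**2))

print(nc, nd, tau_pairs, tau_sign)   # 45 10 0.636... 0.636...
```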

3.3 Independence Test

The correlation (coefficient) is a common quantity for describing the interaction between two random variables.
But being correlated is just a special case of being dependent. In many situations, we may want to
know whether two variables are dependent or not.
In this situation, we want to test

$$H_0: X \text{ and } Y \text{ are independent.} \tag{3.6}$$

Of course, this null hypothesis implies

$$H_0: r_{XY} = 0,$$

so sometimes people test for dependence by testing whether the correlation coefficient is significantly different from 0.
However, one has to be very careful: two variables can be highly dependent but have 0 correlation. The
following is one example:

$$(X, Y) = (-3, 9), (-2, 4), (-1, 1), (0, 0), (1, 1), (2, 4), (3, 9).$$

The two variables are clearly dependent because the data are generated from Y = X². But if you compute their correlation coefficient,
you will obtain rXY = 0! Thus, the correlation may provide us insufficient information about the
dependence.
There are many approaches to testing dependence; we will focus on the two methods we have discussed:
Spearman’s ρ and Kendall’s τ .
Under H0 of (3.6), Spearman’s ρ has an asymptotic distribution

$$\hat{\rho} \approx N\left(0, \frac{1}{n - 1}\right)$$

when the sample size is large. Thus, we can compute the p-value using

$$\text{p-value} = 2 \times \Phi\left(-|\hat{\rho}|\sqrt{n - 1}\right),$$

where Φ(t) is the CDF of the standard normal distribution.


In the case of Kendall’s τ , under H0 of (3.6),

$$\hat{\tau} \approx N\left(0, \frac{2(2n + 5)}{9n(n - 1)}\right).$$

Therefore, the p-value can be computed using

$$\text{p-value} = 2 \times \Phi\left(-|\hat{\tau}|\sqrt{\frac{9n(n - 1)}{2(2n + 5)}}\right).$$
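To illustrate these two approximations (a sketch, not part of the original notes), the following helpers convert each statistic into a two-sided p-value using the identity Φ(z) = erfc(−z/√2)/2, and plug in the Example 3 values ρ̂ = 0.5 and τ̂ = 35/55 with n = 11. Keep in mind that n = 11 is small for an asymptotic approximation:

```python
from math import erfc, sqrt

def phi(z):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * erfc(-z / sqrt(2))

def spearman_pvalue(rho_hat, n):
    # z-statistic: rho_hat / sqrt(1 / (n - 1))
    return 2 * phi(-abs(rho_hat) * sqrt(n - 1))

def kendall_pvalue(tau_hat, n):
    # z-statistic: tau_hat / sqrt(2(2n + 5) / (9n(n - 1)))
    return 2 * phi(-abs(tau_hat) * sqrt(9 * n * (n - 1) / (2 * (2 * n + 5))))

print(spearman_pvalue(0.5, 11))      # approximately 0.114
print(kendall_pvalue(35 / 55, 11))   # approximately 0.0064
```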

Although testing the correlation coefficient may not be enough for determining dependence in general,
being correlated and being dependent are equivalent when X and Y follow a bivariate normal distribution.
The two random variables X, Y follow a bivariate normal distribution if their joint PDF is

$$p(x, y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1 - r^2}}\exp\left(-\frac{1}{2(1 - r^2)}\left[\frac{(x - \mu_X)^2}{\sigma_X^2} + \frac{(y - \mu_Y)^2}{\sigma_Y^2} - \frac{2r(x - \mu_X)(y - \mu_Y)}{\sigma_X\sigma_Y}\right]\right),$$

where $\mu_X = E(X)$, $\sigma_X^2 = \mathrm{Var}(X)$, and $r = r_{XY}$ is the correlation coefficient. It is easy to see that if r = 0,
then
$$\begin{aligned}
p(x, y) &= \frac{1}{2\pi\sigma_X\sigma_Y}\exp\left(-\frac{1}{2}\left[\frac{(x - \mu_X)^2}{\sigma_X^2} + \frac{(y - \mu_Y)^2}{\sigma_Y^2}\right]\right) \\
&= \frac{1}{\sqrt{2\pi\sigma_X^2}}\exp\left(-\frac{1}{2}\frac{(x - \mu_X)^2}{\sigma_X^2}\right) \times \frac{1}{\sqrt{2\pi\sigma_Y^2}}\exp\left(-\frac{1}{2}\frac{(y - \mu_Y)^2}{\sigma_Y^2}\right) \\
&= p(x)\,p(y),
\end{aligned}$$

which implies independence.
