Lecture 21

$r_{ik}$ measures the strength of linear association. For measuring general dependence, one can use Kendall's tau or Spearman's rho. A high $\tau$ indicates that most pairs are concordant, i.e., that the two rankings are consistent.

Uploaded by

amanmatharu22
Copyright
© Attribution Non-Commercial (BY-NC)

Lecture - 20

February 18, 2012

Multivariate Analysis:

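Data Analysis:

$$
\begin{array}{c|ccccc}
 & \text{Variable 1} & \text{Variable 2} & \text{Variable 3} & \cdots & \text{Variable } p \\ \hline
\text{Obs 1} & x_{11} & x_{12} & x_{13} & \cdots & x_{1p} \\
\text{Obs 2} & x_{21} & x_{22} & x_{23} & \cdots & x_{2p} \\
\text{Obs 3} & x_{31} & x_{32} & x_{33} & \cdots & x_{3p} \\
\vdots & \vdots & \vdots & \vdots & & \vdots \\
\text{Obs } n & x_{n1} & x_{n2} & x_{n3} & \cdots & x_{np}
\end{array}
$$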

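Notation: $x_{jk}$ = measurement of the $k$-th variable on the $j$-th item. In matrix notation: the $j$-th observation is $x_j = (x_{j1}, \dots, x_{jp})'$, and

$$
X = \begin{pmatrix}
x_{11} & x_{12} & x_{13} & \cdots & x_{1p} \\
x_{21} & x_{22} & x_{23} & \cdots & x_{2p} \\
x_{31} & x_{32} & x_{33} & \cdots & x_{3p} \\
\vdots & \vdots & \vdots & & \vdots \\
x_{n1} & x_{n2} & x_{n3} & \cdots & x_{np}
\end{pmatrix}
\quad \text{or} \quad
X = \begin{pmatrix} x_1' \\ x_2' \\ \vdots \\ x_n' \end{pmatrix}.
$$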

Descriptive statistics:
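Sample means:

$$\bar{x}_k = \frac{1}{n} \sum_{j=1}^{n} x_{jk}, \qquad k = 1, \dots, p.$$

Sample variance:

$$s_k^2 = \frac{1}{n-1} \sum_{j=1}^{n} (x_{jk} - \bar{x}_k)^2, \qquad k = 1, \dots, p.$$

Sample covariance:

$$s_{ik} = \frac{1}{n-1} \sum_{j=1}^{n} (x_{ji} - \bar{x}_i)(x_{jk} - \bar{x}_k), \qquad i, k = 1, \dots, p.$$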

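Sample correlation coefficients:

$$r_{ik} = \frac{s_{ik}}{s_i s_k} = \frac{\sum_{j=1}^{n} (x_{ji} - \bar{x}_i)(x_{jk} - \bar{x}_k)}{\sqrt{\sum_{j=1}^{n} (x_{ji} - \bar{x}_i)^2}\,\sqrt{\sum_{j=1}^{n} (x_{jk} - \bar{x}_k)^2}},$$

where $s_i = \sqrt{s_i^2}$ is the sample standard deviation of the $i$-th variable.

Remarks on Correlation:
1. $-1 \le r_{ik} \le 1$.
2. $r_{ik}$ measures the strength of linear association.
3. $r_{ik}$ is scale invariant.
4. $r_{ik}$ is referred to as Pearson's correlation coefficient. For measurement of general dependence (including nonlinear), one can use Kendall's tau or Spearman's rho.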
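The descriptive statistics above can be sketched in a few lines of plain Python. This is a minimal illustration, not part of the lecture: the function names and the small data matrix are made up for the example, and the divisor $n - 1$ follows the sample variance and covariance definitions given above.

```python
from math import sqrt

def sample_means(X):
    """Column means: one value per variable k."""
    n = len(X)
    p = len(X[0])
    return [sum(row[k] for row in X) / n for k in range(p)]

def sample_covariance(X):
    """p x p matrix of s_ik, using the divisor n - 1."""
    n = len(X)
    p = len(X[0])
    xbar = sample_means(X)
    return [
        [sum((row[i] - xbar[i]) * (row[k] - xbar[k]) for row in X) / (n - 1)
         for k in range(p)]
        for i in range(p)
    ]

def sample_correlation(X):
    """p x p matrix of r_ik = s_ik / (s_i * s_k)."""
    S = sample_covariance(X)
    p = len(S)
    return [[S[i][k] / sqrt(S[i][i] * S[k][k]) for k in range(p)]
            for i in range(p)]

# Hypothetical data: n = 4 observations (rows), p = 3 variables (columns).
# Variable 2 is exactly twice variable 1, so r_12 should equal 1.
X = [
    [1.0, 2.0, 5.0],
    [2.0, 4.0, 3.0],
    [3.0, 6.0, 4.0],
    [4.0, 8.0, 1.0],
]

means = sample_means(X)
S = sample_covariance(X)
R = sample_correlation(X)

# Scale invariance: multiplying variable 1 by 10 leaves the correlations unchanged.
X_scaled = [[10.0 * row[0], row[1], row[2]] for row in X]
R_scaled = sample_correlation(X_scaled)
```

Comparing `R` with `R_scaled` illustrates remark 3: the covariance matrix changes under rescaling, while the correlation matrix does not (up to floating-point rounding).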

Kendall's tau:
Let $F$ be a continuous bivariate cumulative distribution function (CDF) of the random vector $x = (x_1, x_2)'$. Let $(X_1, X_2)$ and $(X_1', X_2')$ be independent random pairs with distribution $F$. Then, Kendall's tau is

$$
\begin{aligned}
\tau &= \Pr[(X_1 - X_1')(X_2 - X_2') > 0] - \Pr[(X_1 - X_1')(X_2 - X_2') < 0] \\
     &= 2 \Pr[(X_1 - X_1')(X_2 - X_2') > 0] - 1 \\
     &= 4 \iint_{\mathbb{R}^2} F \, dF - 1.
\end{aligned}
$$
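The last step in the definition of Kendall's tau is stated without proof. A short sketch of the argument follows, writing $(X_1', X_2')$ for the independent copy of $(X_1, X_2)$ (a notational assumption made here). Because $F$ is continuous, ties have probability zero, so the concordance and discordance probabilities sum to one; this gives the form $2\Pr[\cdot > 0] - 1$. The concordance probability itself can be expanded as:

```latex
\begin{align*}
\Pr[(X_1 - X_1')(X_2 - X_2') > 0]
  &= \Pr[X_1' < X_1,\ X_2' < X_2] + \Pr[X_1' > X_1,\ X_2' > X_2] \\
  &= 2 \Pr[X_1' < X_1,\ X_2' < X_2]
     && \text{(the two pairs are exchangeable)} \\
  &= 2\, E\big[F(X_1, X_2)\big]
     && \text{(condition on } (X_1, X_2)\text{)} \\
  &= 2 \iint F \, dF .
\end{align*}
```

Substituting into $2\Pr[\cdot > 0] - 1$ gives $4 \iint F \, dF - 1$.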

Kendall's $\tau$ is also known as the Kendall rank correlation coefficient. It is a measure of the similarity of the orderings of the data when ranked by each of the quantities. For data $(X_i, Y_i)$, $i = 1, \dots, n$,

$$\tau = \frac{n_c - n_d}{\frac{1}{2} n(n-1)},$$

where $n_c$ is the number of concordant pairs and $n_d$ the number of discordant pairs in the data set. The denominator is the total number of pairs in the data. A high $\tau$ indicates that most pairs are concordant, i.e., that the two rankings are consistent. By a concordant pair, we mean $\operatorname{sign}(X_2 - X_1) = \operatorname{sign}(Y_2 - Y_1)$, where $\operatorname{sign}(d) = -1, 0, 1$ for $d < 0$, $d = 0$, $d > 0$, respectively. A pair is discordant if $\operatorname{sign}(X_2 - X_1) = -\operatorname{sign}(Y_2 - Y_1)$. Based on the sign function, Kendall's $\tau$ can be rewritten as

$$\tau = \frac{\sum_{i<j} \operatorname{sign}(X_j - X_i)\, \operatorname{sign}(Y_j - Y_i)}{\frac{1}{2} n(n-1)}.$$
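The sample version of Kendall's tau can be computed directly by counting concordant and discordant pairs. The sketch below is an illustration only: the function names and the data are hypothetical, the loop is the naive $O(n^2)$ enumeration of all pairs $i < j$, and the data are assumed to contain no ties.

```python
from itertools import combinations

def sign(d):
    """Return -1, 0, or 1 for d < 0, d == 0, d > 0."""
    return (d > 0) - (d < 0)

def kendall_tau(x, y):
    """Return (tau, n_c, n_d) by enumerating all pairs i < j."""
    n = len(x)
    n_c = 0  # concordant pairs
    n_d = 0  # discordant pairs
    for i, j in combinations(range(n), 2):
        s = sign(x[j] - x[i]) * sign(y[j] - y[i])
        if s > 0:
            n_c += 1
        elif s < 0:
            n_d += 1
    total_pairs = n * (n - 1) // 2
    return (n_c - n_d) / total_pairs, n_c, n_d

# Hypothetical data without ties: y swaps two adjacent pairs relative to x.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

tau, n_c, n_d = kendall_tau(x, y)
# Of the 10 pairs, 8 are concordant and 2 are discordant, so tau = (8 - 2) / 10.
```

Since each pair contributes $+1$ or $-1$ to the sum of sign products when there are no ties, the count-based form and the sign-based form give the same value.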

Spearman's rho:
Let $F$ be a continuous bivariate cumulative distribution function (CDF) of the random vector $x = (x_1, x_2)'$. Let $F_1$ and $F_2$ be the two marginal CDFs. Assume $(X_1, X_2) \sim F$. Then, Spearman's rho is the correlation of $F_1(X_1)$ and $F_2(X_2)$. Since $F_1(X_1)$ and $F_2(X_2)$ are uniform $U(0, 1)$, their means are $\frac{1}{2}$ and their variances are $\frac{1}{12}$. Thus, Spearman's rho is

$$\rho_S = 12 \iint F_1(x_1) F_2(x_2) \, dF(x_1, x_2) - 3.$$
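Two quick checks of the formula for Spearman's rho may help. They are worked here as an illustration and are not part of the original notes; $U = F_1(X_1)$ and $V = F_2(X_2)$ are shorthand introduced for this example.

```latex
\begin{align*}
\rho_S &= \frac{E[UV] - E[U]\,E[V]}{\sqrt{\operatorname{Var}(U)\operatorname{Var}(V)}}
        = \frac{E[UV] - \tfrac14}{\tfrac1{12}}
        = 12\, E[UV] - 3 . \\[4pt]
\text{Independence: } & E[UV] = E[U]\,E[V] = \tfrac14
        \ \Longrightarrow\ \rho_S = 12 \cdot \tfrac14 - 3 = 0 . \\[4pt]
\text{Perfect monotone increase } (V = U)\text{: } & E[UV] = E[U^2] = \tfrac13
        \ \Longrightarrow\ \rho_S = 12 \cdot \tfrac13 - 3 = 1 .
\end{align*}
```

The first line is just the definition of correlation with the uniform means and variances plugged in, and $E[UV] = \iint F_1(x_1) F_2(x_2)\, dF(x_1, x_2)$.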

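Spearman's rho is also a rank correlation coefficient. For data $(X_i, Y_i)_{i=1}^{n}$, let $(R_i, S_i)_{i=1}^{n}$ be the corresponding rank pairs and $d_i = R_i - S_i$ be the difference between the ranks. Since

$$\bar{R} = \bar{S} = \frac{n+1}{2}, \qquad \frac{1}{n} \sum_{i=1}^{n} \left(R_i - \frac{n+1}{2}\right)^2 = \frac{n^2 - 1}{12},$$

the following equivalent forms of the sample coefficient $R$ are obtained:

$$R = \frac{12 \sum_{i=1}^{n} (R_i - \bar{R})(S_i - \bar{S})}{n(n^2 - 1)} = \frac{12 \sum_{i=1}^{n} R_i S_i}{n(n^2 - 1)} - \frac{3(n+1)}{n-1}.$$

Because $\bar{R} = \bar{S}$, we have $d_i = (R_i - \bar{R}) - (S_i - \bar{S})$, so

$$\sum_{i=1}^{n} d_i^2 = \sum_{i=1}^{n} (R_i - \bar{R})^2 + \sum_{i=1}^{n} (S_i - \bar{S})^2 - 2 \sum_{i=1}^{n} (R_i - \bar{R})(S_i - \bar{S}).$$

Therefore, the most common form of the Spearman coefficient of rank correlation is obtained as

$$R = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}.$$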

Important Assumption: There are no ties in the data.
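The rank-based Spearman coefficient can be sketched as follows, computing it both from the squared rank differences, $1 - 6 \sum d_i^2 / (n(n^2 - 1))$, and from the centered rank products, $12 \sum (R_i - \bar{R})(S_i - \bar{S}) / (n(n^2 - 1))$. The function names and data are hypothetical and chosen for illustration. In line with the no-ties assumption, the ranking helper rejects tied values rather than assigning average ranks.

```python
def ranks(values):
    """Ranks 1..n of the values; raises if any values are tied."""
    if len(set(values)) != len(values):
        raise ValueError("ties are not supported by these formulas")
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    return r

def spearman_from_differences(x, y):
    """R = 1 - 6 * sum(d_i^2) / (n (n^2 - 1))."""
    n = len(x)
    R, S = ranks(x), ranks(y)
    d2 = sum((r - s) ** 2 for r, s in zip(R, S))
    return 1 - 6 * d2 / (n * (n * n - 1))

def spearman_from_products(x, y):
    """R = 12 * sum((R_i - Rbar)(S_i - Sbar)) / (n (n^2 - 1))."""
    n = len(x)
    R, S = ranks(x), ranks(y)
    mean_rank = (n + 1) / 2  # common mean of both rank vectors
    cross = sum((r - mean_rank) * (s - mean_rank) for r, s in zip(R, S))
    return 12 * cross / (n * (n * n - 1))

# Hypothetical data without ties.
x = [10.5, 3.2, 7.7, 1.1, 9.0]   # ranks: 5, 2, 3, 1, 4
y = [8.0, 2.5, 9.5, 1.0, 6.0]    # ranks: 4, 2, 5, 1, 3

rho_d = spearman_from_differences(x, y)  # sum of d_i^2 is 6, so 1 - 36/120
rho_p = spearman_from_products(x, y)     # centered cross-product sum is 7, so 84/120
```

Both calls return the same value (0.7 for this data, up to floating-point rounding), which is a small numerical check of the equivalence derived above.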
