Lecture 02 - Student Version

intro to stats

Uploaded by

yashdugar.fun

CORRELATION

CONTINUES

Lecture 02
Dr. Garima Chaklader
BRIEF REVISION:

cov(X, Y) = 9
σ(Y) = 7.5
Var(X) = 16

Calculate the correlation coefficient.
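One way to check the revision exercise is the short sketch below. It assumes the Karl Pearson definition r = cov(X, Y) / (σ(X) · σ(Y)) and takes σ(X) as the square root of Var(X); the variable names are illustrative and not from the slides.

```python
import math

# Values given on the revision slide
cov_xy = 9.0
sigma_y = 7.5
var_x = 16.0

# Standard deviation of X is the square root of its variance
sigma_x = math.sqrt(var_x)

# Pearson correlation coefficient: covariance divided by the product of
# the two standard deviations
r = cov_xy / (sigma_x * sigma_y)

print(round(r, 2))  # 9 / (4 * 7.5) = 0.3
```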


III) Rank Correlation: Spearman’s rank correlation coefficient

◦ A rank correlation measures the relationship between the rankings of different variables.

◦ A rank correlation coefficient measures the degree of similarity and direction between two rankings.

◦ Unlike Karl Pearson’s correlation, Spearman’s rank correlation does not require the assumption that there exists a linear relationship between the variables.

◦ It always lies between -1 and +1.


Sports Context:
◦ X = college basketball program & Y = college football program

◦ One could test for a relationship between poll rankings of the two types of program

◦ “Do colleges with a higher-ranked basketball program tend to have a higher-ranked football program?”

◦ A rank correlation coefficient can measure that relationship.


Economic Context:
◦ X = Income ranking (low income, medium income, high income) and Y = Education level (no education, high school, university)

◦ One could test for a relationship between the rankings of the two variables, measuring the association between income and educational level.


Spearman’s Rank Correlation:

$$ r = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} $$

d = difference in ranks
n = number of observations
Application: Stock Market
Two financial experts were asked to rank ten stocks for the benefit of investors. The rankings given by the experts are:

Stock   Expert I   Expert II
A       4          3
B       1          2
C       2          1
D       10         9
E       9          5
F       3          4
G       5          6
H       6          7
I       8          10
J       7          8

Ans = 0.83
The value of the rank correlation coefficient suggests that the similarity in the ranking of the stocks by the two experts is high.
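The stock-ranking answer can be reproduced with a minimal sketch like the one below. It assumes the standard no-ties Spearman formula r = 1 − 6Σd² / (n(n² − 1)); the function name `spearman_from_ranks` is illustrative.

```python
# Ranks given by the two experts for stocks A..J, in table order
expert_1 = [4, 1, 2, 10, 9, 3, 5, 6, 8, 7]
expert_2 = [3, 2, 1, 9, 5, 4, 6, 7, 10, 8]


def spearman_from_ranks(ranks_a, ranks_b):
    """Spearman's coefficient from two rank lists that contain no ties."""
    n = len(ranks_a)
    # d is the difference between the two ranks of each item
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


r = spearman_from_ranks(expert_1, expert_2)
print(round(r, 2))  # sum of d^2 is 28, so r = 1 - 168/990, about 0.83
```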
Concept: Tied Ranks!

Calculate the rank correlation coefficient for the following data:

X Y
14 21
10 16
12 32
17 25
10 30
15 16
10 20
12 22
19 35
11 23
Concept: Tied Ranks! (worked solution)

r1 = rank of X, r2 = rank of Y (largest value = rank 1; tied values share the average rank), d = r1 − r2

X    Y    r1    r2    d      d²
14   21   4     7     -3     9
10   16   9     9.5   -0.5   0.25
12   32   5.5   2     3.5    12.25
17   25   2     4     -2     4
10   30   9     3     6      36
15   16   3     9.5   -6.5   42.25
10   20   9     8     1      1
12   22   5.5   6     -0.5   0.25
19   35   1     1     0      0
11   23   7     5     2      4

m1 = 3 (the value 10 occurs three times in X)
m2 = 2 (the value 12 occurs twice in X)
m3 = 2 (the value 16 occurs twice in Y)

• Substitute this information into the formula
• Answer = 0.321
• Thus, there is a positive correlation between the two variables. The strength of the correlation is low.
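The slide does not show the formula being substituted into. A common textbook version, assumed here, adds a correction term m(m² − 1)/12 for each group of m tied values: r = 1 − 6[Σd² + Σ m(m² − 1)/12] / (n(n² − 1)). The sketch below applies that assumed formula with average ranks; the helper names are illustrative.

```python
from collections import Counter

x = [14, 10, 12, 17, 10, 15, 10, 12, 19, 11]
y = [21, 16, 32, 25, 30, 16, 20, 22, 35, 23]


def average_ranks(values):
    """Rank in descending order (largest = 1); tied values share the mean rank."""
    positions = {}
    for position, value in enumerate(sorted(values, reverse=True), start=1):
        positions.setdefault(value, []).append(position)
    return [sum(positions[v]) / len(positions[v]) for v in values]


def tie_correction(values):
    """Sum of m(m^2 - 1)/12 over every value that occurs m > 1 times."""
    return sum(m * (m ** 2 - 1) / 12 for m in Counter(values).values() if m > 1)


def spearman_with_ties(a, b):
    n = len(a)
    ranks_a, ranks_b = average_ranks(a), average_ranks(b)
    sum_d_squared = sum((p - q) ** 2 for p, q in zip(ranks_a, ranks_b))
    correction = tie_correction(a) + tie_correction(b)
    return 1 - 6 * (sum_d_squared + correction) / (n * (n ** 2 - 1))


r = spearman_with_ties(x, y)
print(round(r, 3))  # sum of d^2 is 109, correction is 3, so r is about 0.321
```

Note that library routines which compute Spearman's coefficient as the Pearson correlation of the ranks can give a slightly different value when ties are present.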
Hypothesis Testing for correlation coefficient
◦ Null Hypothesis (H0) : r = 0

◦ Alternate Hypothesis (H1) : r ≠ 0

Question:
◦ Consider the above example of income and level of education, with n = 20, r = 0.8 and level of significance α = 0.05.

◦ $t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} = \frac{0.8\sqrt{20-2}}{\sqrt{1-0.8^2}} = 5.66$

◦ tc = t18, 0.05 = 2.1 (two-tailed critical value with n − 2 = 18 degrees of freedom)
◦ As 5.66 > 2.1, reject the null hypothesis: there is statistical evidence that the variables income and level of education are correlated.
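The test statistic can be checked with a small sketch. It uses t = r√(n − 2) / √(1 − r²) and takes the critical value 2.1 directly from the slide rather than computing it; the function name `t_statistic` is illustrative.

```python
import math


def t_statistic(r, n):
    """t statistic for testing whether a correlation coefficient differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


t = t_statistic(r=0.8, n=20)

# Critical value quoted in the lecture for 18 degrees of freedom, alpha = 0.05
t_critical = 2.1

reject_null = abs(t) > t_critical
print(round(t, 2), reject_null)  # 5.66 True
```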
Check this out!
"Ice cream sales are correlated with homicides in New York"

Correlation does not imply Causation
• This hypothesis says that as the sales of ice cream rise and fall, the number of homicides also rises and falls. But does ice cream consumption cause the death of people?
• The answer is "No". If two things are correlated, that doesn't mean that one causes the other.
• When two independent things are tied together, they can be bound by either causality or correlation. But in most cases, it is just a coincidence.
Why is correlation not causation in the ice cream example?
• Generally, we collect some samples to test our hypothesis.
• What if our sample is not large enough, or is biased?
• What if there is a hidden factor that we did not record in our sample dataset?
• These things can influence the apparent causation and correlation between two variables.
• Here, in this example, weather is the hidden factor.

What may be happening:
• Let's consider a sunny summer day. It is a lovely day to go outside, and people are enjoying the beach.
• As the day is hot, people are buying ice cream, and people are also exposed to accidents.
• The weather is causing a rise in both ice cream sales and homicides.
◦Garima’s Lectures!
HAVE YOU EVER FILLED A CONSUMER SURVEY?
Calculate the correlation coefficient.

Rows: Q1, Expenditure on Entertainment (Rs. ‘000)
Columns: Q2, Average Salary (Rs. ‘000)

Q1 \ Q2   100-150   150-200   200-250   250-300   300-350
0-10      5         4         5         2         4
10-20     2         7         3         7         1
20-30     0         6         0         4         5
30-40     8         0         4         0         8
40-50     0         7         3         5         10
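One possible way to approach this exercise is sketched below. It assumes each class interval is represented by its midpoint (for example, 0-10 becomes 5 and 100-150 becomes 125) and applies the Pearson formula with each cell count as a frequency weight; the slides do not state the intended method, and the function name `grouped_correlation` is illustrative.

```python
import math

# Class midpoints (assumption: each interval is represented by its midpoint)
expenditure_midpoints = [5, 15, 25, 35, 45]       # rows, Rs. '000
salary_midpoints = [125, 175, 225, 275, 325]      # columns, Rs. '000

# Cell frequencies: one row per expenditure class, one column per salary class
frequencies = [
    [5, 4, 5, 2, 4],
    [2, 7, 3, 7, 1],
    [0, 6, 0, 4, 5],
    [8, 0, 4, 0, 8],
    [0, 7, 3, 5, 10],
]


def grouped_correlation(x_mid, y_mid, freq):
    """Pearson correlation for a bivariate frequency table, using midpoints."""
    n = sum(sum(row) for row in freq)
    sum_x = sum(x * sum(row) for x, row in zip(x_mid, freq))
    sum_x2 = sum(x ** 2 * sum(row) for x, row in zip(x_mid, freq))
    sum_y = sum(y * f for row in freq for y, f in zip(y_mid, row))
    sum_y2 = sum(y ** 2 * f for row in freq for y, f in zip(y_mid, row))
    sum_xy = sum(x * y * f
                 for x, row in zip(x_mid, freq)
                 for y, f in zip(y_mid, row))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = grouped_correlation(expenditure_midpoints, salary_midpoints, frequencies)
print(round(r, 3))
```

Under the midpoint assumption the table has 100 observations in total, and the resulting coefficient is weakly positive.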
