Module - 2 Correlation Analysis: Contents: 2.2 Types of Correlation

This document discusses correlation analysis and various correlation techniques. It covers: 1. Types of correlation including positive vs negative, simple vs partial vs multiple, and linear vs non-linear correlations. 2. Degrees of correlation ranging from perfect correlation to limited or no correlation. 3. Methods for determining correlation including scatter plots, Karl Pearson's coefficient of correlation, and Spearman's rank-correlation coefficient. The document provides examples and explanations of how to calculate and interpret these various correlation analyses.

Uploaded by

akj1992
Copyright
© Attribution Non-Commercial (BY-NC)
Module - 2 Correlation Analysis

Contents:
2.1 Introduction
2.2 Types of Correlation
    2.2.1 Positive and Negative
    2.2.2 Simple, partial and multiple
    2.2.3 Linear and non-linear
2.3 Degrees of Correlation
    2.3.1 Perfect correlation
    2.3.2 Limited degrees of correlation
    2.3.3 Absence of correlation
2.4 Methods of Determining Correlation
    2.4.1 Scatter Plot
    2.4.2 Karl Pearson's coefficient of correlation
    2.4.3 Spearman's Rank-correlation coefficient


2.1 Introduction
Correlation is a statistical technique that can show whether and how strongly pairs of variables are related. For example, height and weight are related; taller people tend to be heavier than shorter people. The relationship isn't perfect. People of the same height vary in weight, and you can easily think of two people you know where the shorter one is heavier than the taller one. Nonetheless, the average weight of people 5'5'' tall is less than the average weight of people 5'6'' tall, and their average weight is less than that of people 5'7'' tall, and so on. Correlation can tell you just how much of the variation in people's weights is related to their heights. Although this correlation is fairly obvious, your data may contain unsuspected correlations. You may also suspect that there are correlations but not know which are the strongest. An intelligent correlation analysis can lead to a greater understanding of your data.

2.2 Types of Correlation


I. Positive and Negative
II. Simple, partial and multiple
III. Linear and non-linear

2.2.1 Positive and Negative Correlation


Positive Correlation: If the higher scores on X are generally paired with the higher scores on Y, and the lower scores on X are generally paired with the lower scores on Y, then the direction of the correlation between the two variables is positive.

Negative Correlation: If the higher scores on X are generally paired with the lower scores on Y, and the lower scores on X are generally paired with the higher scores on Y, then the direction of the correlation between the two variables is negative.

2.2.2 Simple, partial and multiple

The distinction between simple, partial and multiple correlation is based upon the number of variables studied.

Simple Correlation: Correlation between only two variables. For example, the correlation between age and height, or between the yield of rice and the amount of rainfall in a given area.

Multiple Correlation: When the correlation between three or more variables is studied simultaneously, it is called multiple correlation.

Partial Correlation: Here we recognize more than two variables but consider only two of them to be influencing each other, the effect of the other influencing variables being kept constant. The correlation between the two variables, keeping the other variables constant, is called partial correlation.

Example variables:
1. X1 - Yield of rice
2. X2 - Amount of rainfall
3. X3 - Amount of fertilizers
4. X4 - Type of soil
5. X5 - Advanced technologies used

Correlation analysis of X1, X2, X3, X4 and X5 together is an example of multiple correlation, whereas if we study only the relation between X1 and X2, keeping the other variables constant, it is an example of partial correlation between yield of rice and amount of rainfall.
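To make "keeping the other variables constant" concrete, the simplest case can be written down explicitly. The formula below is the standard first-order partial correlation for three variables only (for instance X1, X2 and X3 above); it is added here as an illustration and is not taken from the module itself. The symbols r12, r13 and r23 denote the simple correlation coefficients between the indicated pairs of variables.

```latex
r_{12 \cdot 3} = \frac{r_{12} - r_{13}\, r_{23}}{\sqrt{\left(1 - r_{13}^{2}\right)\left(1 - r_{23}^{2}\right)}}
```

Holding more than one variable constant, as in the five-variable example, requires applying this idea repeatedly or using a regression-based approach.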

2.2.3 Linear and non-linear

The shape of the graph tells us whether the correlation between two variables is linear. If the plotted points lie along a straight line, the correlation is called a "linear correlation"; if they do not lie along a straight line, the correlation is non-linear, or curvilinear.
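As a rough illustration of the distinction, the sketch below compares points on a straight line with points on a curve. The data are made up for this purpose, and `pearson_r` is a helper name chosen here; it applies the coefficient of correlation described in section 2.4.2. The point is that a coefficient built for linear relationships can be close to zero even when the two variables are tied together exactly by a curve.

```python
import math

def pearson_r(xs, ys):
    """Pearson coefficient of correlation for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical data, chosen only to illustrate the two shapes.
xs = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * x + 1 for x in xs]  # points on a straight line
curved = [x ** 2 for x in xs]     # points on a parabola (curvilinear)

r_linear = pearson_r(xs, linear)
r_curved = pearson_r(xs, curved)
print(f"linear: r = {r_linear:.3f}, curved: r = {r_curved:.3f}")
```

For the curved data the relationship is exact, yet the positive and negative halves of the parabola cancel, which is why a scatter plot is worth inspecting before relying on r alone.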

2.3 Degrees of Correlation


2.3.1 Perfect correlation
If two variables change in the same direction and in the same proportion, the correlation between the two is perfect positive. According to Karl Pearson, the coefficient of correlation in this case is +1. On the other hand, if the variables change in opposite directions and in the same proportion, the correlation is perfect negative, and its coefficient of correlation is -1. In practice we rarely come across these types of correlation.

2.3.2 Limited degrees of correlation


If two variables are not perfectly correlated, nor is there a perfect absence of correlation, then we term the correlation limited correlation. It may be positive or negative, and its coefficient lies within the limits -1 and +1.

2.3.3 Absence of correlation


If two series of two variables exhibit no relation between them, or a change in one variable does not lead to a change in the other variable, then we can firmly say that there is no correlation (or absurd correlation) between the two variables. In such a case the coefficient of correlation is 0.

Meaning of (r) in the Correlation Coefficient

Value of r    Meaning              Relationship Between X and Y
r = +1.0      Strong - Positive    As X goes up, Y always also goes up
r = +0.5      Weak - Positive      As X goes up, Y tends to usually also go up
r = 0         No Correlation       X and Y are not correlated
r = -0.5      Weak - Negative      As X goes up, Y tends to usually go down
r = -1.0      Strong - Negative    As X goes up, Y always goes down

2.4 Methods of Determining Correlation


1. Scatter Plot
2. Karl Pearson's coefficient of correlation
3. Spearman's Rank-correlation coefficient

2.4.1 Scatter Plot (Scatter diagram or dot diagram)


In this method the values of the two variables are plotted on graph paper. One is taken along the horizontal (x-axis) and the other along the vertical (y-axis). By plotting the data, we get points (dots) on the graph which are generally scattered, hence the name Scatter Plot. The manner in which these points are scattered suggests the degree and the direction of correlation. The degree of correlation is denoted by r and its direction is given by the signs positive and negative.

Example: [Figure: three scatter plots showing positive correlation, negative correlation and no correlation]

2.4.2 Karl Pearson's coefficient of correlation


It gives a numerical expression for the measure of correlation and is denoted by r. The value of r gives the magnitude of the correlation, and its sign denotes the direction. It is defined as

$$r = \frac{\mathrm{Cov}(x, y)}{\sqrt{\mathrm{Var}(x)\,\mathrm{Var}(y)}}$$

Because the factor 1/n cancels between the numerator and the denominator, r can be computed directly from the sum of cross products and the sums of squared deviations.

Example: Correlation coefficient between advertisement expenditure (X) and sales (Y)

X (Rs. lakhs)   Y (Rs. crore)   (X - X Mean)^2   (Y - Y Mean)^2   (X - X Mean)(Y - Y Mean)
      4              16             0.1849           1.6641              0.5547
      6              29             2.4649         137.1241             18.3847
     10              43            31.0249         661.0041            143.2047
      5              20             0.3249           7.3441              1.5447
      1               3            11.7649         204.2041             49.0147
      2               4             5.9049         176.6241             32.2947
      3               6             2.0449         127.4641             16.1447
Sum: 31             121            53.7143        1315.4287            261.1429

X Mean = 31 / 7 = 4.43 and Y Mean = 121 / 7 = 17.29
Sum of squared deviations in advertisement expenditure = 53.71
Sum of squared deviations of sales = 1315.43
Sum of cross products (SP) = 261.14

Calculation of the Pearson r
r = 261.14 / sqrt(53.71 × 1315.43) = 261.14 / sqrt(70651.75)
r = 261.14 / 265.80 = +0.982

Interpretation
The magnitude of the correlation between advertisement expenditure and sales is 0.982. The direction of the relationship is positive: as advertisement expenditure increases, so do the sales of the commodity.
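The hand calculation can be cross-checked with a short script. The sketch below follows the same steps as the table (deviations from the mean, their squares, and the cross products) using the seven data pairs from the example. The variable names are chosen here for illustration. Because the script uses unrounded means, its sums may differ slightly from hand-rounded intermediate values; the resulting r is about 0.98.

```python
import math

# Data from the worked example: advertisement expenditure (X) and sales (Y).
x = [4, 6, 10, 5, 1, 2, 3]
y = [16, 29, 43, 20, 3, 4, 6]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Deviations of each observation from its mean.
dx = [xi - mean_x for xi in x]
dy = [yi - mean_y for yi in y]

ss_x = sum(d * d for d in dx)            # sum of squared deviations of X
ss_y = sum(d * d for d in dy)            # sum of squared deviations of Y
sp = sum(a * b for a, b in zip(dx, dy))  # sum of cross products

r = sp / math.sqrt(ss_x * ss_y)
print(f"SSx = {ss_x:.2f}, SSy = {ss_y:.2f}, SP = {sp:.2f}, r = {r:+.3f}")
```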

2.4.3 Spearman's Rank-correlation coefficient


The most precise way to compare several pairs of data is to use a statistical test, which establishes whether the correlation is really significant or whether it could have been the result of chance alone. Spearman's Rank correlation coefficient is a technique which can be used to summarise the strength and direction (negative or positive) of a relationship between two variables. The result will always be between 1 and minus 1.

Method - calculating the coefficient

Create a table from your data.

Rank the two data sets. Ranking is achieved by giving the ranking '1' to the biggest number in a column, '2' to the second biggest value and so on. The smallest value in the column will get the lowest ranking. This should be done for both sets of measurements.

Tied scores are given the mean (average) rank. For example, the three tied scores of 1 euro in the example below are ranked fifth in order of price, but occupy three positions (fifth, sixth and seventh) in a ranking hierarchy of ten. The mean rank in this case is calculated as (5 + 6 + 7) ÷ 3 = 6.

Find the difference in the ranks (d): this is the difference between the ranks of the two values on each row of the table. The rank of the second value (price) is subtracted from the rank of the first (distance from the museum).

Square the differences (d²) to remove negative values, and then sum them (Σd²).

Convenience   Distance from   Rank   Price of 50cl   Rank   Difference between   d²
Store         CAM (m)                bottle (€)             the ranks (d)
 1             50             10     1.80             2       8                  64
 2            175              9     1.20             3.5     5.5                30.25
 3            270              8     2.00             1       7                  49
 4            375              7     1.00             6       1                   1
 5            425              6     1.00             6       0                   0
 6            580              5     1.20             3.5     1.5                 2.25
 7            710              4     0.80             9      -5                  25
 8            790              3     0.60            10      -7                  49
 9            890              2     1.00             6      -4                  16
10            980              1     0.85             8      -7                  49
                                                                          Σd² = 285.5

Data Table: Spearman's Rank Correlation

Calculate the coefficient (R) using the formula below. The answer will always be between 1.0 (a perfect positive correlation) and -1.0 (a perfect negative correlation). When written in mathematical notation the Spearman Rank formula looks like this:

$$R = 1 - \frac{6 \sum d^{2}}{n^{3} - n}$$

Now put all these values into the formula. Find Σd² by adding up all the values in the d² column. In our example this is 285.5. Multiplying this by 6 gives 1713. Now for the bottom line of the equation. The value n is the number of sites at which you took measurements, which in our example is 10. Substituting this value into n³ - n we get 1000 - 10 = 990. We now have the formula R = 1 - (1713 / 990), which gives a value for R of 1 - 1.73 = -0.73.

What does this R value of -0.73 mean? The R value of -0.73 suggests a fairly strong negative relationship between distance from the museum and price.
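The whole method (ranking with mean ranks for ties, differencing, squaring and applying the formula) can be sketched in a few lines. The function names `average_ranks` and `spearman_r` are chosen here for illustration; the distances and prices are those of the ten convenience stores in the data table.

```python
def average_ranks(values):
    """Rank values so the largest gets rank 1; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = ((pos + 1) + (end + 1)) / 2  # mean of positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

def spearman_r(xs, ys):
    """Return (R, sum of d squared) using R = 1 - 6*sum(d^2) / (n^3 - n)."""
    rank_x = average_ranks(xs)
    rank_y = average_ranks(ys)
    n = len(xs)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - (6 * sum_d2) / (n ** 3 - n), sum_d2

# Data from the table: distance from CAM (m) and price of a 50cl bottle (euros).
distance = [50, 175, 270, 375, 425, 580, 710, 790, 890, 980]
price = [1.80, 1.20, 2.00, 1.00, 1.00, 1.20, 0.80, 0.60, 1.00, 0.85]

price_ranks = average_ranks(price)
r_s, sum_d2 = spearman_r(distance, price)
print(price_ranks)
print(f"sum of d^2 = {sum_d2}, R = {r_s:.2f}")
```

Note that when ranks are tied, this shortcut formula is generally regarded as an approximation; the sketch applies it as the worked example does.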
