CORRELATION NOTES (PROBABILITY & STATISTICS)
1.1 Introduction
Correlation is a statistical tool that helps to measure and analyse the degree of association (relationship) between
two variables.
The measure of correlation is called the correlation coefficient. The degree of relationship is expressed by a
coefficient r, which ranges from -1 to +1, i.e. -1 ≤ r ≤ +1. The direction of change is indicated by its sign.
The correlation analysis enables us to have an idea about the degree and direction of the relationship between the
two variables under study.
Positive correlation - The correlation is said to be positive if the values of the two variables change in the
same direction, i.e. as variable X increases, Y increases, or as variable X decreases, Y decreases.
Example - Price and quantity supplied: an increase in price results in an increase in the quantity supplied, and
vice versa.
Negative correlation - The correlation is said to be negative when the variables change in opposite directions, i.e. as
variable X increases, Y decreases, or as variable X decreases, Y increases.
Example - Price and quantity demanded: an increase in price results in a decrease in the quantity demanded,
and vice versa.
Zero correlation - Zero correlation means there is no relationship between the two variables, i.e. a change in one
variable (X) is not associated with any change in the other variable (Y).
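The three cases above can be illustrated numerically. The sketch below assumes the standard product-moment formula for the correlation coefficient, r = Σ(x - x̄)(y - ȳ) / √(Σ(x - x̄)² Σ(y - ȳ)²), which this section has not yet stated. The function name pearson_r and all of the data values are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative values, not data from these notes.
r_positive = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 6])  # Y tends to rise with X
r_negative = pearson_r([1, 2, 3, 4, 5], [6, 4, 5, 4, 2])  # Y tends to fall as X rises
r_zero = pearson_r([1, 1, 3, 3], [1, 3, 1, 3])            # no linear association

for name, r in [("positive", r_positive), ("negative", r_negative), ("zero", r_zero)]:
    print(f"{name}: r = {r:+.2f}")
```

The sign of each printed value matches the direction of the relationship, and every value lies between -1 and +1.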
Methods of studying correlation:
i. Scatter diagram
ii. Karl Pearson's coefficient of correlation
iii. Spearman's rank correlation
Examples of Correlation
Suppose we have measured the height and weight of 6 men. The results might be as follows:
Man   Height   Weight
A     66       150
B     72       159
C     65       138
D     69       145
E     64       128
F     70       165
A scatter diagram, or scattergram, is the name given to the method of representing these figures
graphically. On the diagram, the horizontal scale represents one of the variables (let's say height)
while the vertical scale represents the other variable (weight). Each pair of measurements is
represented by one point on the diagram, as shown in the figure below:
Scattergram of Men’s Heights and Weights
Figure 1.4
Make sure that you understand how to plot the points on a scatter diagram, noting especially that:
• Each point represents a pair of corresponding values.
• The two scales relate to the two variables under discussion.
The term scatter diagram or scattergram comes from the scattered appearance of the points on the
chart.
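As a rough illustration of how the points are placed, the sketch below draws the six height and weight pairs on a character grid, with height running horizontally and weight vertically. The function name text_scatter and the grid size are assumptions made for this example. It also assumes whole-number x values that are all different, which holds for this data.

```python
# Height/weight pairs for men A to F, copied from the table of results.
data = {
    "A": (66, 150), "B": (72, 159), "C": (65, 138),
    "D": (69, 145), "E": (64, 128), "F": (70, 165),
}

def text_scatter(points, rows=10):
    """Return a crude character-grid scatter diagram as a string.

    points maps a label to an (x, y) pair; x is drawn horizontally,
    y vertically, and each pair becomes exactly one labelled point.
    """
    xs = [x for x, _ in points.values()]
    ys = [y for _, y in points.values()]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    cols = x_max - x_min + 1  # one column per whole unit of x
    grid = [[" "] * cols for _ in range(rows)]
    for label, (x, y) in points.items():
        col = x - x_min
        # Scale y into 0..rows-1, then flip so that large y is at the top.
        row = round((y - y_min) / (y_max - y_min) * (rows - 1))
        grid[rows - 1 - row][col] = label
    return "\n".join("|" + "".join(line) for line in grid)

plot = text_scatter(data)
print(plot)
```

The shortest and lightest man (E) lands in the bottom-left corner, and the labels drift upwards towards the right.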
Examining the scatter diagram of heights and weights, you can see that, by and large, tall men are
heavier than short men. This shows that some relationship exists between
men's heights and weights. We express this in statistical terms by saying that the two variables,
height and weight, are correlated. Figure 1.42 shows another example of a pair of correlated variables
(each point represents one production batch):
Cost of Production Compared with Impurity Contents
Figure 1.42
Here you see that, in general, it costs more to produce material with a low impurity content than it
does to produce material with a high impurity content. However, you should note that correlation
does not necessarily mean an exact relationship, for we know that, while tall men are usually heavy,
there are exceptions, and it is most unlikely that several men of the same height will have exactly the
same weight!
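The strength of the height and weight relationship can be checked with a short calculation. The sketch below is a worked example only. It assumes the standard product-moment formula for the correlation coefficient, which has not been stated at this point, and applies it to the six men measured earlier. The variable names are chosen for illustration.

```python
import math

# Heights and weights of men A to F, copied from the table of results
# (the notes do not state the units).
heights = [66, 72, 65, 69, 64, 70]
weights = [150, 159, 138, 145, 128, 165]

n = len(heights)
mean_h = sum(heights) / n
mean_w = sum(weights) / n

# Deviation of each observation from its mean.
dev_h = [h - mean_h for h in heights]
dev_w = [w - mean_w for w in weights]

sum_hw = sum(dh * dw for dh, dw in zip(dev_h, dev_w))  # about 180
sum_hh = sum(dh * dh for dh in dev_h)                  # about 49.33
sum_ww = sum(dw * dw for dw in dev_w)                  # 921.5

r = sum_hw / math.sqrt(sum_hh * sum_ww)
print(round(r, 2))  # 0.84: strongly positive, but short of +1
```

A value of about 0.84 agrees with the diagram: taller men tend to be heavier, but the relationship is not exact.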
Degrees of Correlation
In order to generalise our discussion, and to avoid having to refer to particular examples such as
height and weight or impurity and cost, we will refer to our two variables as x and y. On scatter
diagrams, the horizontal scale is always the x scale and the vertical scale is always the y scale. There
are three degrees of correlation which may be observed on a scatter diagram. The two variables may
be:
(a) Perfectly Correlated
When the points on the diagram all lie exactly on a straight line (Figure 1.43):
Figure 1.43
(b) Uncorrelated
When the points on the diagram appear to be randomly scattered about, with no suggestion of
any relationship (Figure 1.44):
Figure 1.44
Figure 1.45
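The two extremes described in (a) and (b) can be checked numerically. The sketch below assumes the standard product-moment formula for the correlation coefficient. The helper pearson_r, the two straight lines, and the randomly generated points are all made up for illustration.

```python
import math
import random

def pearson_r(xs, ys):
    """Product-moment correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# (a) Perfect correlation: every point lies exactly on a straight line.
xs = [1, 2, 3, 4, 5]
rising = [2 * x + 1 for x in xs]    # line sloping upwards
falling = [10 - 3 * x for x in xs]  # line sloping downwards
r_rising = pearson_r(xs, rising)
r_falling = pearson_r(xs, falling)

# (b) No correlation: x and y are drawn independently at random.
rng = random.Random(0)  # fixed seed so that the run is repeatable
xs_rand = [rng.random() for _ in range(2000)]
ys_rand = [rng.random() for _ in range(2000)]
r_random = pearson_r(xs_rand, ys_rand)

print(r_rising, r_falling, round(r_random, 3))
```

Points on a rising line give a coefficient of +1 and points on a falling line give -1, while the independently generated points give a value close to 0.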