Session 4 Correlation analysis
In the previous chapter, we studied non-parametric tests. Let us now move forward
and study correlation analysis.
We know that income and expenditure are interrelated variables, implying that they tend to
increase or decrease together. Similarly, price and demand are also interrelated variables,
implying that when the price of a product increases, its demand tends to decrease. Thus, we
can say that income and expenditure, as well as price and demand, are correlated with each other.
The technique for investigating the relationship between two continuous and quantitative
variables is called correlation. A correlation coefficient is denoted by r and ranges between –1
and +1. For instance, a value of r = .08 indicates a positive association between the variables,
whereas a value of r = –.03 indicates a negative association.
Correlation Analysis 157
10.2 Covariance
Covariance measures the direction and extent to which two random variables vary
together. In the case of a single variable, variance is a measure of the average squared
deviation of observations from the mean. The variance of a single scale variable can be
mathematically expressed as follows:
$$\text{Variance } (\sigma^2) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
Where, $x_i$ = ith observation
$\bar{x}$ = Mean of the x observations
n = Number of observations
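The variance formula above can be sketched in a few lines of Python. This is only an illustration: the function name `sample_variance` and the income figures are hypothetical and do not come from the chapter's data.

```python
def sample_variance(values):
    """Return the variance of `values` using the n - 1 denominator."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean = sum(values) / n
    # Squared deviation of each observation from the mean.
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / (n - 1)


# Hypothetical monthly incomes (in thousands) of five households.
incomes = [20, 25, 30, 35, 40]
print(sample_variance(incomes))  # 62.5
```

Here the mean is 30, the squared deviations add up to 250, and dividing by n - 1 = 4 gives 62.5.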
158 Chapter 10
In the case of two scale variables, the relation between them can be estimated from
the deviations of their observations from their respective means. In other words, if
the deviation of an observation of the first variable from its mean tends to have the
same sign as the deviation of the corresponding observation of the second variable
from its mean, the variables may have a positive relationship between them; if the
deviations tend to have opposite signs, the relationship may be negative. The
covariance between the two variables X and Y can be mathematically expressed as follows:
$$\text{Covariance } (x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$
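As a rough illustration of the covariance formula, the following Python sketch computes it for two pairs of variables. The function name `sample_covariance` and all four data series are hypothetical; they are chosen only to echo the income/expenditure and price/demand examples from the introduction.

```python
def sample_covariance(xs, ys):
    """Return the covariance of paired observations using the n - 1 denominator."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("both variables need the same number of observations")
    if n < 2:
        raise ValueError("at least two pairs of observations are required")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Product of the paired deviations from the respective means.
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)


# Hypothetical data: the deviations share the same sign, so covariance is positive.
income = [20, 25, 30, 35, 40]
expenditure = [15, 18, 24, 26, 32]
print(sample_covariance(income, expenditure))  # 52.5

# Hypothetical data: the deviations have opposite signs, so covariance is negative.
price = [10, 12, 14, 16, 18]
demand = [50, 44, 41, 35, 30]
print(sample_covariance(price, demand))  # -24.5
```

The sign of the result shows the direction of the relationship. Its size depends on the units of measurement, so it is hard to compare across data sets.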
The coefficient of correlation is denoted by r.
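A minimal Python sketch of how r can be computed is given here, assuming the standard Pearson product-moment definition (the covariance of x and y divided by the product of their standard deviations). The function name `pearson_r` and the sample data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's r for paired observations (standard definition assumed)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least two pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("r is undefined when a variable is constant")
    # The n - 1 terms of the covariance and the standard deviations cancel out.
    return sum_xy / math.sqrt(sum_xx * sum_yy)


xs = [1, 2, 3, 4, 5]
print(round(pearson_r(xs, [3, 5, 7, 9, 11]), 4))  # y rises in step with x
print(round(pearson_r(xs, [11, 9, 7, 5, 3]), 4))  # y falls in step with x
print(round(pearson_r(xs, [4, 1, 0, 1, 4]), 4))   # U-shaped pattern, no linear trend
```

The third series is included to show that a coefficient near zero rules out only a straight-line relationship; the variables may still be related in a curved way.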
A correlation coefficient of +1 indicates a perfect positive correlation between the
variables. Conversely, a correlation coefficient of –1 indicates a perfect negative
correlation between the variables. A correlation coefficient of 0 indicates that there
is no linear relationship between the variables. Based on the magnitude of r, the
strength of the relationship between two variables can be further divided into the
following four categories, shown in Table 10.1:
Table 10.1: Ranges of Correlation Coefficient
Correlation Coefficient (r)    Interpretation
0 to 0.15                      No Correlation
0.15 to 0.35                   Low Correlation
0.35 to 0.65                   Medium Correlation
0.65 to 1                      Strong Correlation
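The ranges in Table 10.1 can be applied mechanically, as in the following Python sketch. Two points are assumptions rather than statements from the table: the ranges are applied to the absolute value of r, so that negative coefficients are graded the same way, and a value that falls exactly on a boundary (for example 0.35) is placed in the higher category. The function name `correlation_strength` is hypothetical.

```python
def correlation_strength(r):
    """Label a correlation coefficient using the ranges of Table 10.1."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient lies between -1 and +1")
    magnitude = abs(r)  # assumption: the sign shows direction, not strength
    if magnitude < 0.15:
        return "No Correlation"
    if magnitude < 0.35:
        return "Low Correlation"
    if magnitude < 0.65:
        return "Medium Correlation"
    return "Strong Correlation"


for r in (0.08, -0.03, 0.50, -0.80):
    print(r, correlation_strength(r))
```

Under these assumptions, both r = .08 and r = –.03 fall in the "No Correlation" range even though their signs differ.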
SPSS Command
The SPSS command required is as follows:
Step 1: Click ‘Analyze’ ➔ ‘Correlate’ ➔ ‘Bivariate’
(Copyright: IBM Corp. IBM SPSS Statistics for Windows, Version 21.0.)
Figure 10.2: SPSS Command for Correlation Analysis (1)
Step 2: Transfer both the variables to the ‘Variables’ window, as shown in Figure
10.3:
Figure 10.3: SPSS Command for Correlation Analysis (2)
Figure 10.4: SPSS Command for Correlation Analysis (3)
[Correlations output table: Training and Performance Score, N = 90 for each variable]
Generally, rank correlation is used when the data are qualitative in nature and can
be expressed as ranks. The equation to calculate rank correlation is as follows:
$$\text{Rank correlation} = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$
Where, $d_i$ = difference between the ranks of the ith pair of observations
n = number of pairs of observations
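The rank correlation formula can be illustrated with a short Python sketch. It assumes that no two observations of the same variable are tied, since the formula in this form applies only to untied ranks. The function names and the two judges' scores are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n; tied values are not supported."""
    if len(set(values)) != len(values):
        raise ValueError("this sketch does not handle tied values")
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def rank_correlation(xs, ys):
    """Return 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)) for paired observations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least two pairs")
    # d_i is the difference between the two ranks of the ith pair.
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


# Hypothetical scores given by two judges to the same five candidates.
judge_a = [86, 72, 91, 65, 78]
judge_b = [80, 75, 88, 60, 70]
print(round(rank_correlation(judge_a, judge_b), 4))  # 0.9
```

In this example the ranks differ for only two candidates, by one place each, so the sum of squared differences is 2 and the coefficient is 1 - 12/120 = 0.9.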
The SPSS command for Spearman’s coefficient of correlation and Kendall’s tau
correlation coefficient is as follows:
Step 1: Click ‘Analyze’ ➔ ‘Correlate’ ➔ ‘Bivariate’
The same is shown in Figure 10.5:
Figure 10.5: SPSS Command for Non-Parametric Correlation Analysis (1)
Step 2: Select ‘Kendall’s tau-b’ and ‘Spearman’, deselect ‘Pearson’ and click ‘OK,’ as
shown in Figure 10.6:
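For readers who wish to see what the ‘Kendall’s tau-b’ option computes, the following Python sketch applies the standard tau-b definition: the number of concordant pairs minus the number of discordant pairs, divided by a term that adjusts for ties. This definition is an assumption brought in for illustration and is not derived in this chapter; the function name and data are hypothetical.

```python
import math
from itertools import combinations


def kendall_tau_b(xs, ys):
    """Return Kendall's tau-b for paired observations (standard definition assumed)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least two pairs")
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            ties_x += 1  # a pair tied on both variables counts in both totals
            ties_y += 1
        elif dx == 0:
            ties_x += 1
        elif dy == 0:
            ties_y += 1
        elif dx * dy > 0:
            concordant += 1  # both variables order the pair the same way
        else:
            discordant += 1  # the variables order the pair in opposite ways
    total_pairs = n * (n - 1) // 2
    denominator = math.sqrt((total_pairs - ties_x) * (total_pairs - ties_y))
    if denominator == 0:
        raise ValueError("tau-b is undefined when a variable is constant")
    return (concordant - discordant) / denominator


# Hypothetical scores given by two judges to the same five candidates.
judge_a = [86, 72, 91, 65, 78]
judge_b = [80, 75, 88, 60, 70]
print(round(kendall_tau_b(judge_a, judge_b), 4))  # 0.8
```

Of the ten possible pairs of candidates, nine are ordered the same way by both judges and one is not, which gives (9 - 1)/10 = 0.8.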
[Non-parametric correlations output table (Kendall's tau-b and Spearman's rho): Scores1 and Scores2, N = 25 for each variable]
Figure 10.7: SPSS Command for Partial Correlation Analysis (1)
Step 2: Transfer the two main variables to the ‘Variables’ window and the third
variable to the ‘Controlling for’ window. Then, click ‘OK,’ as shown in Figure 10.8:
Figure 10.8: SPSS Command for Partial Correlation Analysis (2)
[Correlations output table for the three variables, N = 40 for each variable]
Table 10.8 shows the output of the partial correlation obtained from SPSS:
Table 10.8: Partial Correlation Output
[Table body: partial correlation between interest rate and GDP, controlling for money supply; df = 37]
Table 10.8 shows that the partial correlation between interest rate and GDP is 0.165,
and its p-value (0.315) is greater than the 5 percent level of significance (0.05).
Hence, the null hypothesis of no significant correlation cannot be rejected. This
indicates that after controlling for the third variable, money supply, the correlation
between interest rate and GDP becomes insignificant.
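The logic of partial correlation can be sketched in Python as follows. The sketch assumes the standard first-order formula, which removes the influence of a third variable z from the correlation between x and y, and the usual t statistic for testing a correlation with df degrees of freedom (here df = n – 3 = 37 for n = 40 and one control variable). The three pairwise correlations passed to `partial_correlation` are hypothetical; only the values 0.165 and 37 come from Table 10.8.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y after controlling for z (first-order formula)."""
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("undefined when x or y is perfectly correlated with z")
    return (r_xy - r_xz * r_yz) / denominator


def t_statistic(r, df):
    """t statistic for testing whether a correlation r differs from zero."""
    return r * math.sqrt(df / (1 - r ** 2))


# Hypothetical pairwise correlations, for illustration only.
print(round(partial_correlation(r_xy=0.50, r_xz=0.60, r_yz=0.70), 3))

# Values reported in Table 10.8: partial correlation 0.165 with df = 37.
t_value = t_statistic(0.165, 37)
print(round(t_value, 3))
# The two-tailed 5 percent critical value of t for 37 df is about 2.03,
# so a t statistic well below it is not significant.
print(abs(t_value) < 2.03)
```

With the hypothetical inputs, a moderate correlation of 0.50 shrinks considerably once the third variable is controlled for, which is the kind of effect a partial correlation is designed to reveal.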