4. Correlation Analysis
Mr. Sunanda Das, Assistant Professor, Dept. of CSE, KUET
CORRELATION
• Correlation is the degree of inter-relatedness (association) among two
or more variables.
• Correlation analysis is a process to find out the degree of relationship
between two or more variables by applying various statistical tools and
techniques.
Examples:
• Relationship between price and demand of a commodity
• Relationship between height and weight
Scope of Correlation Analysis
• The existence of correlation between two (or more) variables only
implies that these variables tend to vary together; it does not by itself
show that one causes the other.
# Correlation analysis does not explain why a cause-and-effect
relationship might exist between two variables.
Types of Correlation
On the basis of degree of correlation:
• Positive correlation
• Negative correlation
• Zero correlation
On the basis of number of variables:
• Simple correlation
• Partial correlation
• Multiple correlation
On the basis of linearity:
• Linear correlation
• Non-linear correlation
Positive Correlation
• When two variables move in the same direction then the correlation between these
two variables is said to be Positive Correlation.
• When the value of one variable increases, the value of the other variable also
tends to increase (not necessarily at the same rate).
Negative Correlation
• In this type of correlation, the two variables move in the opposite direction.
• When the value of one variable increases, the value of the other variable decreases.
Zero Correlation
• When the two variables are independent, a change in one variable has
no effect on the other variable, and the correlation between them is zero.
Simple correlation
• Correlation is said to be simple when only two variables are analyzed.
Partial correlation
• Correlation is said to be partial when three or more variables are
considered, but only the relationship between two of them is studied
while the remaining influencing variables are kept constant.
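Keeping a third variable constant can be made concrete with the standard first-order partial correlation coefficient. The symbols here are assumptions for illustration and do not come from the slides: x and y are the two variables studied, z is the variable held constant, and r_xy, r_xz and r_yz are the pairwise simple correlation coefficients.

```latex
r_{xy \cdot z} = \frac{r_{xy} - r_{xz}\, r_{yz}}
                      {\sqrt{\left(1 - r_{xz}^{2}\right)\left(1 - r_{yz}^{2}\right)}}
```

If z is uncorrelated with both x and y (r_xz = r_yz = 0), the expression reduces to the simple correlation r_xy.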
Multiple correlation
• Correlation is said to be multiple when three or more variables are
studied simultaneously. Example: rainfall, production of rice and price
of rice studied together.
Linear correlation
If a change in one variable tends to produce a change in the other
variable in a constant ratio, the correlation is said to be linear.
Non-linear correlation
If a change in one variable tends to produce a change in the other
variable, but not in a constant ratio, the correlation is said to be
non-linear.
Correlation Coefficient
• The correlation coefficient indicates the strength and direction of the
relationship between two variables, e.g. Pearson's correlation
coefficient.
• To test the linear association between two variables x and y, we can
use the Pearson correlation coefficient r_xy.
• The correlation coefficient takes values between -1 and 1:
• 1: perfect positive linear correlation (values close to 1 are strong)
• -1: perfect negative linear correlation (values close to -1 are strong)
• 0: no linear correlation
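The three reference values of the coefficient can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `pearson_r` and the sample data are illustrative assumptions, not part of the slides. It uses the usual sample formula, the sum of products of deviations divided by the square root of the product of the sums of squared deviations.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient r_xy of two equal-length sequences.

    r_xy = sum((x_i - mean_x) * (y_i - mean_y))
           / sqrt(sum((x_i - mean_x)^2) * sum((y_i - mean_y)^2))
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [xi - mean_x for xi in x]          # deviations of x from its mean
    dy = [yi - mean_y for yi in y]          # deviations of y from its mean
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_xx = sum(a * a for a in dx)
    sum_yy = sum(b * b for b in dy)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Illustrative data (made up for this sketch)
x = [-2, -1, 0, 1, 2]
print(pearson_r(x, [2 * v + 1 for v in x]))  # 1.0  (perfect positive linear)
print(pearson_r(x, [-3 * v for v in x]))     # -1.0 (perfect negative linear)
print(pearson_r(x, [v * v for v in x]))      # 0.0  (y = x^2: no linear correlation)
```

The last case shows why 0 is read as "no linear correlation": y is completely determined by x, yet the relationship is non-linear and symmetric, so r_xy is 0.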
# An ordinal variable is a categorical variable whose possible values are ordered. For example, suppose
you have a variable, economic status, with three categories (low, medium and high).
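• In case u individuals receive the same rank, we describe it as a tied
rank of length u. In case of a tied rank, the rank correlation formula is
changed to
$$ r_s = 1 - \frac{6\left[\sum_i d_i^2 + \frac{1}{12}\sum_j \left(t_j^3 - t_j\right)\right]}{n\left(n^2 - 1\right)} $$
• In this formula, d_i is the difference between the two ranks of the ith
individual, n is the number of individuals, t_j represents the length of
the jth tie, and the summation over j extends over all the ties in both
series.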
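The tie adjustment can be sketched in code. This example assumes the formula referred to is Spearman's rank correlation with the textbook correction term, in which (t_j^3 - t_j) / 12 is added to the sum of squared rank differences for every tie in either series. The function names and the data are hypothetical, chosen only for illustration.

```python
from collections import Counter

def average_ranks(values):
    """Rank values 1..n (smallest = 1); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values (one tie group)
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1            # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def tie_term(values):
    """Sum of (t^3 - t) / 12 over every group of t tied values in one series."""
    return sum((t ** 3 - t) / 12 for t in Counter(values).values() if t > 1)

def spearman_tied(x, y):
    """Rank correlation: 1 - 6 * (sum d^2 + tie corrections) / (n * (n^2 - 1))."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    correction = tie_term(x) + tie_term(y)  # ties from both series
    return 1 - 6 * (sum_d2 + correction) / (n * (n ** 2 - 1))

# Hypothetical data: economic status coded low=1, medium=2, high=3 (one tie of length 2)
status = [1, 2, 2, 3]
score = [10, 20, 30, 40]
print(average_ranks(status))   # [1.0, 2.5, 2.5, 4.0]
print(spearman_tied(status, score))
```

In this example the sum of squared rank differences is 0.5 and the single tie of length 2 adds (8 - 2) / 12 = 0.5, so the coefficient is 1 - 6(1.0) / (4 × 15) = 0.9. When neither series has ties, the correction is zero and the function gives the unadjusted value.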