ADBM 103 - Introduction to Correlation - Session 4
By
Nilani Gunasekara
M.Sc. (USJP), B.Sc.(joint major) Hons.(Wayamba),
MIM(SL),CIMA Cert
8/16/2019
Descriptive Content
1. Introduction to statistics, the process and its practical application, Sampling and different
data collection procedures
2. Data analysis techniques – Central tendency measurements in decision making
3. Data analysis techniques - Dispersion measurement techniques
4. Data Visualization
5. Correlation
6. Regression analysis
7. Time series analysis – Forecasting
8. Data Mining
9. Probabilistic approach in statistics – Normal distribution
10. Design of experiments
Learning Objectives
Correlation Analysis
Correlation analysis is used to measure the strength of the
relationship between two variables.
Is there a relationship between
- the attendance percentage of students and their final marks?
- monthly income and savings?
- demand and price of a product?
- advertising expenditure and sales revenue?
If two variables are related to any extent, then changes in the
value of one are associated with changes in the value of the other.
Data of this kind are bivariate: the observations come in pairs,
e.g. (x, y), and there may or may not be a relationship between the
two variables.
Correlation Is Not Causation
"Correlation is not causation" means that when two variables are
correlated, it does not follow that one causes the other.
Ice-cream sales are correlated with the level of house burglaries:
both rise during warmer weather, when more people are on holiday
(vacation). But ice-cream doesn’t cause burglaries, nor vice versa.
https://youtu.be/taA0DWqi_jM
Scatter plot
A graph of the pairs (X, Y) is called a scatter plot.
A scatter plot can be used to show the relationship between two
numerical variables.
Types of Relationships
[Figure: scatter plots of Y against X illustrating linear relationships and curvilinear relationships]
Types of Correlation
[Figure: scatter plots of Y against X illustrating positive correlation, negative correlation and no correlation]
If both variables increase together they are said to be positively
correlated.
If one variable increases as the other decreases they are said to be
negatively correlated.
If no linear pattern can be seen there is said to be no correlation.
Ex1:
In a study of a city, the population density, in people/hectare,
and the distance from the city centre, in km, were investigated by
picking a number of sample areas, with the following results.
If a horizontal line is drawn through the mean y value,
and a vertical line through the mean x value, you can
see the relationship between the two variables in
another way.
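The mean-line view described above can be sketched in Python as a simple count of points per quadrant. The distance and density values below are hypothetical illustration data, not the figures from Ex1, and the function name `quadrant_counts` is an assumption made for this sketch. If most points fall in the upper-left and lower-right quadrants, the relationship is negative; if most fall in the lower-left and upper-right quadrants, it is positive.

```python
from statistics import mean


def quadrant_counts(xs, ys):
    """Count points in each quadrant formed by the lines x = mean(x), y = mean(y)."""
    x_bar, y_bar = mean(xs), mean(ys)
    counts = {
        "upper_right": 0,  # x above its mean, y above its mean
        "upper_left": 0,   # x below its mean, y above its mean
        "lower_left": 0,   # x below its mean, y below its mean
        "lower_right": 0,  # x above its mean, y below its mean
        "on_line": 0,      # point lies exactly on one of the mean lines
    }
    for x, y in zip(xs, ys):
        if x == x_bar or y == y_bar:
            counts["on_line"] += 1
        elif x > x_bar and y > y_bar:
            counts["upper_right"] += 1
        elif x < x_bar and y > y_bar:
            counts["upper_left"] += 1
        elif x < x_bar and y < y_bar:
            counts["lower_left"] += 1
        else:
            counts["lower_right"] += 1
    return counts


# Hypothetical data: distance from the city centre (km) and density (people/hectare).
distance_km = [1, 2, 3, 5, 9]
density = [50, 42, 38, 20, 10]

counts = quadrant_counts(distance_km, density)
print(counts)
```

With this hypothetical sample every point lands in the upper-left or lower-right quadrant, which is the pattern expected when density falls as distance increases.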
The Coefficient of Correlation (Product Moment Correlation Coefficient, r)
The correlation coefficient is sometimes referred to as the Pearson
product moment correlation coefficient in honor of its developer,
Karl Pearson.
Pearson correlation, or Pearson's correlation, is used to assess the
strength and direction of association between two linearly related,
quantitative variables.
The Pearson correlation coefficient works best when the variables are
approximately normally distributed and have no outliers (a scatter
plot can reveal these possible problems).
It can range from -1.00 to 1.00.
Negative values indicate an inverse relationship.
Positive values indicate a direct relationship.
The strength and direction of the correlation
Scatter Plots of Data with Various Correlation Coefficients
Coefficient of correlation (r)
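$$
r = \frac{n\left(\sum XY\right) - \left(\sum X\right)\left(\sum Y\right)}
         {\sqrt{\left[n\left(\sum X^{2}\right) - \left(\sum X\right)^{2}\right]\left[n\left(\sum Y^{2}\right) - \left(\sum Y\right)^{2}\right]}}
$$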
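As a minimal sketch, the computational formula for r can be translated term by term into Python: n times the sum of XY minus the product of the sums, divided by the square root of the product of the two bracketed terms. The function name `pearson_r` and the sample values are hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from the computational (sums) formula."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator


# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(round(r, 4))  # 0.7746

# Points lying exactly on a straight line give r = 1 (rising) or r = -1 (falling).
r_up = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_down = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
```

For the hypothetical sample, the intermediate sums are n = 5, ΣX = 15, ΣY = 20, ΣXY = 66, ΣX² = 55 and ΣY² = 86, so r = 30 / √(50 × 30) ≈ 0.7746.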
Exercises
1. The following sample observations were randomly
selected.
2.
Coefficient of correlation with Excel
1. CORREL Function
2.
3.
Analysis Toolpak add-in in Excel
1.
2.
1. Bi-lo Appliance Super-Store has outlets in several large
metropolitan areas in New England. The general sales manager
aired a commercial for a digital camera on selected local TV
stations prior to a sale starting on Saturday and ending Sunday.
She obtained the information for Saturday–Sunday digital
camera sales at the various outlets and paired it with the
number of times the advertisement was shown on the local TV
stations. The purpose is to find whether there is any relationship
between the number of times the advertisement was aired
and digital camera sales. The pairings are:
a. Draw a scatter diagram.
b.
c.
d.
e.
f.
g.
0.929
5
a. Draw a scatter diagram.
b.
c.
d.
e.
f.
g.
0.759
Could there be zero correlation in the population from which the sample
was selected?
What Is a Hypothesis?
A statement about a population parameter subject to
verification.
What Is Hypothesis Testing?
A procedure based on sample evidence and probability
theory to determine whether the hypothesis is a reasonable
statement.
Five-Step Procedure for Testing a Hypothesis
Testing the Significance of the Correlation Coefficient
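A common way to test whether a sample correlation differs from zero in the population is a t test with n - 2 degrees of freedom, using t = r√(n - 2) / √(1 - r²). The sketch below assumes that test, since the source does not spell out which one it uses. The function name `correlation_t_statistic`, the values r = 0.8 and n = 10, and the 5% two-tailed significance level are hypothetical; the critical value 2.306 is the t-table value for 8 degrees of freedom at that level.

```python
from math import sqrt


def correlation_t_statistic(r, n):
    """t statistic for H0: population correlation = 0, with n - 2 degrees of freedom."""
    if n < 3:
        raise ValueError("need at least 3 pairs so that n - 2 > 0")
    if not -1.0 < r < 1.0:
        raise ValueError("t is undefined when r is exactly -1 or 1")
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


# Hypothetical sample result: r = 0.8 from n = 10 paired observations.
r, n = 0.8, 10
t = correlation_t_statistic(r, n)

# Two-tailed critical value from a t table: df = n - 2 = 8, significance level 0.05.
critical_value = 2.306
reject_h0 = abs(t) > critical_value

print(round(t, 3), reject_h0)
```

Here |t| exceeds the critical value, so under these assumptions the null hypothesis of zero population correlation would be rejected at the 5% level.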