ADBM 103 -Introduction to Correlation - Session 4

Uploaded by

Dilshan Salgadu
BUSINESS ANALYTICS - ADBM 103
Introduction to Correlation
By
Nilani Gunasekara
M.Sc. (USJP), B.Sc.(joint major) Hons.(Wayamba),
MIM(SL),CIMA Cert

8/16/2019
Descriptive Content
1. Introduction to statistics, the process and its practical application, Sampling and different
data collection procedures
2. Data analysis techniques – Central tendency measurements in decision making
3. Data analysis techniques - Dispersion measurement techniques
4. Data Visualization
5. Correlation
6. Regression analysis
7. Time series analysis – Forecasting
8. Data Mining
9. Probabilistic approach in statistics – Normal distribution
10. Design of experiments

Learning Objectives

After completing this chapter you should be able to:

show pairs of observations of two variables diagrammatically,
decide whether there is a relationship between the variables,
put a numerical measure on the strength of this relationship,
calculate product moment correlation coefficients,
understand the limitations of product moment correlation coefficients.
Correlation Analysis
Correlation analysis is used to measure the strength of the
relationship between two variables.
Is there a relationship between
the attendance percentage of students and their final marks?
monthly income and savings?
demand and price of a product?
advertising expenditure and sales revenue?
If two variables are related to any extent, then changes in the value of one are associated with changes in the value of the other.
Such variables are examples of bivariate data, which come in pairs, e.g. (x, y); there may or may not be a relationship between them.
Correlation Is Not Causation
"Correlation is not causation" means that a correlation between two things does not show that one causes the other. For example, ice-cream sales are correlated with the level of house burglaries: both rise during warmer weather, when more people are away on holidays (vacation). But ice cream doesn't cause burglaries, nor vice versa.
https://youtu.be/taA0DWqi_jM
Scatter plot
A graphical sketch of the pairs (X, Y) is called a scatter plot.
A scatter plot can be used to show the relationship between two numerical variables.
Types of Relationships
[Figure: scatter plots of Y against X illustrating linear relationships and curvilinear relationships]
Types of Correlation
[Figure: three scatter plots of Y against X showing positive correlation, negative correlation, and no correlation]
If both variables increase together they are said to be positively correlated.
If one variable increases as the other decreases they are said to be negatively correlated.
If no linear pattern can be seen there is said to be no correlation.
Ex1:
In a study of a city, the population density, in people/hectare, and the distance from the city centre, in km, were investigated by picking a number of sample areas, with the following results.
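If a horizontal line is drawn through the mean y value and a vertical line through the mean x value, you can see the relationship between the two variables in another way: the lines divide the plot into four quadrants. Points lying mostly in the lower-left and upper-right quadrants indicate a positive relationship; points lying mostly in the upper-left and lower-right quadrants indicate a negative one.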
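The mean-line view can be sketched in Python by counting how many points fall in each quadrant. This is only an illustration: the function name `quadrant_counts` and the data values are hypothetical, since the table for Ex1 is not reproduced here.

```python
from statistics import mean


def quadrant_counts(xs, ys):
    """Count points in each quadrant formed by the lines x = mean(x) and y = mean(y)."""
    x_bar, y_bar = mean(xs), mean(ys)
    counts = {"upper_right": 0, "upper_left": 0, "lower_left": 0, "lower_right": 0}
    for x, y in zip(xs, ys):
        dx, dy = x - x_bar, y - y_bar
        if dx == 0 or dy == 0:
            continue  # a point lying on a mean line belongs to no quadrant
        if dx > 0 and dy > 0:
            counts["upper_right"] += 1
        elif dx < 0 and dy > 0:
            counts["upper_left"] += 1
        elif dx < 0 and dy < 0:
            counts["lower_left"] += 1
        else:
            counts["lower_right"] += 1
    return counts


# Hypothetical sample areas (NOT the data from Ex1)
distance_km = [1, 2, 3, 4, 5, 6]
density = [50, 45, 38, 30, 22, 15]

counts = quadrant_counts(distance_km, density)
print(counts)
```

With these made-up values every point lands in the upper-left or lower-right quadrant, which is the pattern expected when density falls as distance from the centre grows.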

The Coefficient of Correlation (Product Moment Correlation Coefficient, r)
The correlation coefficient is sometimes referred to as the Pearson product moment correlation coefficient in honor of its developer, Karl Pearson.
Pearson correlation, or Pearson's correlation, is used to assess the strength and direction of association between two linearly related, quantitative variables.
The Pearson correlation coefficient works best when the variables are approximately normally distributed and have no outliers (a scatter plot can reveal these possible problems).
It can range from -1.00 to 1.00.
Negative values indicate an inverse relationship.
Positive values indicate a direct relationship.
The strength and direction of the correlation
[Figure: scale of r from -1.00 to 1.00]
-1.00: perfect negative correlation
between -1.00 and -0.50: strong negative correlation
-0.50: moderate negative correlation
between -0.50 and 0: weak negative correlation
0: no correlation
between 0 and 0.50: weak positive correlation
0.50: moderate positive correlation
between 0.50 and 1.00: strong positive correlation
1.00: perfect positive correlation
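As a rough illustration, the scale above can be turned into a small Python helper. The function name `describe_correlation` is hypothetical, and treating exactly 0.50 as "moderate" with everything above it as "strong" is an assumption read off the figure, not a universal rule.

```python
def describe_correlation(r):
    """Label r using the cut-points on the scale above (assumed: 0.50 is 'moderate')."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return "no correlation"
    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size == 1.0:
        strength = "perfect"
    elif size > 0.5:
        strength = "strong"
    elif size == 0.5:
        strength = "moderate"
    else:
        strength = "weak"
    return f"{strength} {direction} correlation"


print(describe_correlation(0.759))
print(describe_correlation(-0.3))
```

Other textbooks draw these boundaries differently, so the labels are best read as a guide rather than a test.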
Scatter Plots of Data with Various Correlation Coefficients
[Figures: example scatter plots for several values of r]
Coefficient of correlation (r)

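Formula for the coefficient of correlation (correlation coefficient):

$$
r = \frac{n\sum XY - \left(\sum X\right)\left(\sum Y\right)}{\sqrt{\left[n\sum X^{2} - \left(\sum X\right)^{2}\right]\left[n\sum Y^{2} - \left(\sum Y\right)^{2}\right]}}
$$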
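The formula can be followed term by term in a short Python sketch. The function name `pearson_r` and the five data pairs are made up for illustration; they are not the exercise data from these slides.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r from the sums n, sum(X), sum(Y), sum(XY), sum(X^2), sum(Y^2)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator


# Hypothetical pairs: n = 5, sum X = 15, sum Y = 20, sum XY = 66,
# sum X^2 = 55, sum Y^2 = 86
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)  # (5*66 - 15*20) / sqrt(50 * 30) = 30 / sqrt(1500)
print(round(r, 4))
```

Working the same sums by hand first and then comparing with the printed value is a quick way to check a manual calculation.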

Exercises
1. The following sample observations were randomly selected.

2. Determine the correlation coefficient and interpret the relationship between X and Y.
r = 0.7522
Coefficient of correlation with Excel
1. CORREL Function
The CORREL function takes two ranges of equal length and returns their correlation coefficient.

The correlation coefficient between X and Y is 0.7522.
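For readers without Excel, a result can be cross-checked in Python. The sketch below computes Pearson's r from deviations about the means, which gives the same value as the sums-based formula. The function name `correl` is only a stand-in chosen to mirror the Excel function, and the data are hypothetical.

```python
from math import sqrt
from statistics import mean


def correl(xs, ys):
    """Pearson r computed from deviations about the two sample means."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical data in which y is exactly 2x + 1, so the points lie on a line
x_values = [2, 4, 6, 8, 10]
y_values = [5, 9, 13, 17, 21]

r_line = correl(x_values, y_values)
print(r_line)
```

Because the made-up points lie exactly on an upward-sloping line, the result should be 1 up to floating-point rounding.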

Analysis ToolPak add-in in Excel
1. On the Data tab, click Data Analysis.
2. Select Correlation and click OK.
3. For example, select the range A1:B6 as the Input Range.
4. Check Labels in first row.
5. Select cell A8 as the Output Range.
6. Click OK.
1. Bi-lo Appliance Super-Store has outlets in several large metropolitan areas in New England. The general sales manager aired a commercial for a digital camera on selected local TV stations prior to a sale starting on Saturday and ending Sunday. She obtained the information for Saturday–Sunday digital camera sales at the various outlets and paired it with the number of times the advertisement was shown on the local TV stations. The purpose is to find whether there is any relationship between the number of times the advertisement was aired and digital camera sales. The pairings are:
a. Draw a scatter diagram.
b. Determine the correlation coefficient. r = 0.9295
c. Interpret these statistical measures. There is a strong positive association between the variables.
2. The sales manager of Copier Sales of America, which has a large sales force throughout the United States and Canada, wants to determine whether there is a relationship between the number of sales calls made in a month and the number of copiers sold that month. The manager selects a random sample of 10 representatives and determines the number of sales calls each representative made and the number of copiers each sold. The pairings are:
a. Draw a scatter diagram.
b. Determine the correlation coefficient. r = 0.759
c. Interpret these statistical measures. There is a strong positive association between the variables.
Testing the Significance of the Correlation Coefficient

Recall that the sales manager of Copier Sales of America found the correlation between the number of sales calls and the number of copiers sold was 0.759. This indicated a strong positive association between the two variables. However, only 10 salespeople were sampled.
Could it be that the correlation in the population is actually 0? This would mean the correlation of 0.759 was due to chance.
The population in this example is all the salespeople employed by the firm.
Resolving this dilemma requires a test to answer the obvious question: could there be zero correlation in the population from which the sample was selected?
To put it another way, did the computed r come from a population of paired observations with zero correlation?
To continue our convention of allowing Greek letters to represent a population parameter, we will let ρ represent the correlation in the population. It is pronounced "rho."
What Is a Hypothesis?
A statement about a population parameter subject to verification.

What Is Hypothesis Testing?
A procedure based on sample evidence and probability theory to determine whether the hypothesis is a reasonable statement.
Five-Step Procedure for Testing a Hypothesis
Testing the Significance of the Correlation Coefficient

H0: ρ = 0 (The correlation in the population is zero.)
H1: ρ ≠ 0 (The correlation in the population is different from zero.)
Use the 0.05 significance level (95% confidence level).

P-value = 0.000 < 0.05; hence reject H0 and accept H1.

This means that the correlation in the population is different from zero.
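One common way to carry out this test by hand, not shown on the slides and so an assumption here, is the t statistic t = r·sqrt(n − 2) / sqrt(1 − r²) with n − 2 degrees of freedom. The sketch below applies it to the Copier Sales figures (r = 0.759, n = 10) and compares the result with the tabulated two-tailed 5% critical value for 8 degrees of freedom. The function name is hypothetical, and the sketch does not reproduce the p-value reported above; it only checks the decision.

```python
from math import sqrt


def correlation_t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    if n < 3:
        raise ValueError("need at least 3 pairs")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


r, n = 0.759, 10                   # values from the Copier Sales example
t = correlation_t_statistic(r, n)  # about 3.30

# Two-tailed 5% critical value of Student's t with 8 degrees of freedom,
# taken from a standard t table.
T_CRITICAL_DF8 = 2.306

reject_h0 = abs(t) > T_CRITICAL_DF8
print(round(t, 3), reject_h0)
```

Since the computed t exceeds the critical value, this route also leads to rejecting H0 at the 0.05 level.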

Thank You
