Data Management - Part 3
Mathematics in the Modern World
Mathematics Area
De La Salle Lipa
Contents
1) Correlation Analysis
• Constructing Scatter Plot
• Describing Relationships Using the Pearson Product-
Moment Correlation Coefficient
• Testing the Significance of the Pearson Product-Moment
Correlation Coefficient r
2) Linear Regression
Correlation Analysis
Correlation analysis is the process of describing the relationship between two variables.
Notice that the points on the scatter plot do not lie on one line. However,
the points closely follow a straight line. This line is called a trend line.
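As a rough illustration of a trend line, the sketch below fits a straight line to a handful of points by least squares. The data points are hypothetical (they are not from these slides), the function name fit_trend_line is illustrative, and least squares is assumed here as the fitting method.

```python
def fit_trend_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical points that closely follow, but do not lie exactly on, a line
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_trend_line(xs, ys)
print(f"trend line: y = {slope:.2f}x + {intercept:.2f}")  # y = 1.99x + 0.05
```

None of the five points satisfies the fitted equation exactly, yet all of them fall close to it, which is the situation the scatter plot describes.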
• To describe the relationship between two variables, we can compute the correlation
coefficient (r). This is another, more accurate way of determining the kind of
relationship that exists between two variables.
• r is a number between -1 and 1, inclusive, that describes both the strength and the
direction of the correlation.
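How r encodes direction and strength can be sketched in a few lines: the sign of r gives the direction, and the absolute value of r (closer to 1 means stronger) gives the strength. The helper name describe_direction is hypothetical, and no strength cut-offs are assumed here.

```python
def describe_direction(r):
    """Return the direction of a correlation from the sign of r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "no linear correlation"


for r in (0.85, -0.40, 0.0):
    # abs(r) measures strength regardless of direction
    print(r, describe_direction(r), "strength:", abs(r))
```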
Describing Relationships Using the
Pearson Product-Moment Correlation Coefficient
The value of the correlation coefficient is usually expressed to 4 decimal places, but for the sake of interpretation, we round it off to 2 decimal places.
Example:
Value of r Interpretation
where:
x = value of variable x
y = value of variable y
n = number of sample points/observations
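The symbols defined above fit the standard computational (raw-score) form of the Pearson product-moment correlation coefficient. Assuming that is the form intended here, it can be written as follows, with each sum taken over all n pairs of observations:

```latex
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}
         {\sqrt{\left[\,n\sum x^{2} - \left(\sum x\right)^{2}\right]
                \left[\,n\sum y^{2} - \left(\sum y\right)^{2}\right]}}
```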
Pearson Product-Moment Correlation Coefficient
Example
A store manager wishes to find out whether there is a relationship
between the age of the employees and the number of sick days they incur
each year. Calculate the correlation coefficient (r) and describe the
relationship in terms of strength and direction.
Pearson Product-Moment Correlation Coefficient
Solution:
No. of Cases of Soft Drinks (X)    Travel Time in Minutes (Y)
24                                 21
6                                  3
16                                 6
64                                 15
10                                 21
25                                 61
35                                 20

For the sample data, there is a negligible correlation between the number of cases of soft drinks ordered and the travel time for their delivery.
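The negligible correlation for the soft drink data can be checked with a short computation. This is a minimal sketch, assuming the computational form of Pearson's r; the function name pearson_r and the variable names are illustrative and do not come from the slides.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient (computational form)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


cases = [24, 6, 16, 64, 10, 25, 35]      # X: cases of soft drinks ordered
minutes = [21, 3, 6, 15, 21, 61, 20]     # Y: travel time in minutes

r = pearson_r(cases, minutes)
print(f"r = {r:.4f}")  # r = 0.1042
print(f"r = {r:.2f}")  # r = 0.10
```

The intermediate sums are Σx = 180, Σy = 147, Σxy = 4013, Σx² = 6914, and Σy² = 5273, which give r ≈ 0.10, a weak positive value.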
Testing the Significance
Example
A soft drink distributor wants to find out whether the number of cases of soft drinks ordered is
related to the travel time for their delivery. The following data were obtained from past
experience. Test the significance of the correlation coefficient at the 0.05 level of significance.

No. of Cases of Soft Drinks (X)    Travel Time in Minutes (Y)
24                                 21
6                                  3
16                                 6
64                                 15
10                                 21
25                                 61
35                                 20

2. Testing the Significance (Use the 5 steps in Hypothesis Testing)
Step 2) Use the t-test to test the significance of r. Identify the critical values (CV).
Level of significance = 0.05; 2-tailed
df = n - 2 = 7 - 2 = 5
Tabular values or CV = ±2.571
Note that because we are using the t-test to test the significance of r, we shall use the t-table to identify the critical values.
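The comparison against the critical values can be sketched as follows. This assumes the usual test statistic for a correlation coefficient, t = r·sqrt(n - 2) / sqrt(1 - r²), and a null hypothesis of no correlation; neither is spelled out in this section, and the function names are illustrative. The critical value 2.571 is the tabular value given above for df = 5.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient (computational form)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


def t_statistic(r, n):
    """Assumed test statistic for the significance of r, with df = n - 2."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


cases = [24, 6, 16, 64, 10, 25, 35]      # X: cases of soft drinks ordered
minutes = [21, 3, 6, 15, 21, 61, 20]     # Y: travel time in minutes

n = len(cases)
r = pearson_r(cases, minutes)
t = t_statistic(r, n)
critical_value = 2.571                   # from the t-table: 0.05, 2-tailed, df = 5

print(f"df = {n - 2}, t = {t:.2f}")      # df = 5, t = 0.23
if abs(t) > critical_value:
    print("Reject the null hypothesis: r is significant.")
else:
    print("Do not reject the null hypothesis: r is not significant.")
```

Under these assumptions the computed t falls well inside the interval from -2.571 to +2.571, which agrees with the negligible correlation found for the sample.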