05 Correlation
Learning Objectives
Correlation Analysis is necessary to:
• Show a relationship between two variables. This also sets the stage for potential cause and effect.
IMPROVEMENT ROADMAP
Uses of Correlation Analysis
Common Uses
• Determine and quantify the relationship between factors (x) and output characteristics (Y).
Breakthrough Strategy
Characterization
Phase 1: Measurement
Phase 2: Analysis
Optimization
Phase 3: Improvement
Phase 4: Control
KEYS TO SUCCESS
Correlation: Y = f(x)
• Output or y variable (dependent)
• Input or x variable (independent)
As the input variable changes, there is an influence or bias on the output variable.
WHAT IS CORRELATION?
• A measurable relationship between two variable data characteristics.
Not necessarily Cause & Effect (Y=f(x))
• The input variable is called the independent variable (x or KPIV) since it is independent
of any other constraints
• The coefficient of linear correlation “r” is the measure of the strength of the
relationship.
• The square of “r” is the fraction of the variation in the response (Y) that is explained by its linear relationship with the input (x).
TYPES OF CORRELATION
• Positive: Y tends to increase as x increases.
• Negative: Y tends to decrease as x increases.
CALCULATING “r”
Coefficient of Linear Correlation
• Calculate the sample covariance $s_{xy}$:
$$s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$
• Calculate the sample standard deviations $s_x$ and $s_y$ for each data set.
• Use the calculated values to compute $r_{CALC}$:
$$r_{CALC} = \frac{s_{xy}}{s_x s_y}$$
• Add a + for positive correlation and a − for negative correlation (this is the sign of $s_{xy}$).
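The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names (`sample_covariance`, `sample_std`, `r_calc`) and the five data points are made up for the example and do not come from the slides.

```python
from math import sqrt


def sample_covariance(xs, ys):
    """Sample covariance s_xy with the n - 1 denominator."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length data sets with at least 2 pairs")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


def sample_std(values):
    """Sample standard deviation: square root of a data set's covariance with itself."""
    return sqrt(sample_covariance(values, values))


def r_calc(xs, ys):
    """Coefficient of linear correlation r_CALC = s_xy / (s_x * s_y)."""
    return sample_covariance(xs, ys) / (sample_std(xs) * sample_std(ys))


# Hypothetical data, chosen only to illustrate the calculation.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = r_calc(xs, ys)
print(round(r, 3), round(r ** 2, 2))  # 0.775 0.6
```

The sign of the result comes directly from the covariance, so no separate step is needed to attach + or −.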
• Plot the data on orthogonal axes.
• Draw an oval around the data.
• Measure the length (L) and width (W) of the oval.
• Calculate the coefficient of linear correlation (r) based on the formulas below.
$$r \approx \pm\left(1 - \frac{W}{L}\right)$$
+ = positive slope, − = negative slope
Example, with W = 6.7 and L = 12.6 and a negative slope:
$$r \approx -\left(1 - \frac{6.7}{12.6}\right) = -0.47$$
[Figure: scatter plot of Y = f(x) against x with an oval drawn around the data; the oval's length L and width W are measured against a ruler marked 1 to 15.]
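The oval approximation is simple enough to check numerically. The sketch below reproduces the worked example; the function name `r_from_oval` and its `negative_slope` flag are hypothetical names introduced here for illustration.

```python
def r_from_oval(width, length, negative_slope=False):
    """Approximate r as +/-(1 - W/L) from an oval drawn around the data.

    width and length must be measured in the same units; the sign is
    chosen by the analyst from the slope of the data cloud.
    """
    if length <= 0 or not 0 <= width <= length:
        raise ValueError("expected 0 <= width <= length and length > 0")
    magnitude = 1 - width / length
    return -magnitude if negative_slope else magnitude


# Values from the worked example: W = 6.7, L = 12.6, negative slope.
r_estimate = r_from_oval(6.7, 12.6, negative_slope=True)
print(round(r_estimate, 2))  # -0.47
```

A narrow oval (W much smaller than L) gives a magnitude near 1, and a circular cloud (W equal to L) gives 0.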
HOW DO I KNOW WHEN I HAVE CORRELATION?
• The answer should strike a familiar chord at this point: we have confidence (95%) that we have correlation when $|r_{CALC}| > r_{CRIT}$.
• Since sample size is a key determinant of $r_{CRIT}$, we need to use a table to determine the correct $r_{CRIT}$ given the number of ordered pairs that make up the complete data set.

Ordered Pairs    r_CRIT
5                .88
6                .81
7                .75
8                .71
9                .67
10               .63
15               .51
20               .44
25               .40
30               .36
50               .28
80               .22
100              .20

• In the preceding example we had 60 ordered pairs of data and computed an $r_{CALC}$ of -.47. The table has no row for 60; interpolating between the rows for 50 (.28) and 80 (.22) gives an $r_{CRIT}$ of .26.
• Comparing $|r_{CALC}| > r_{CRIT}$, we get .47 > .26. Therefore the calculated value exceeds the minimum critical value required for significance.
• Conclusion: We are 95% confident that the observed correlation is significant.
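The table lookup and comparison can be sketched as follows. The table values are copied from the slide; linear interpolation between rows is an assumption made here (it reproduces the .26 quoted for 60 pairs), and the names `R_CRIT_TABLE`, `r_crit`, and `is_significant` are hypothetical.

```python
# (ordered pairs, r_CRIT) rows copied from the table above, 95% confidence.
R_CRIT_TABLE = [
    (5, 0.88), (6, 0.81), (7, 0.75), (8, 0.71), (9, 0.67), (10, 0.63),
    (15, 0.51), (20, 0.44), (25, 0.40), (30, 0.36), (50, 0.28),
    (80, 0.22), (100, 0.20),
]


def r_crit(n_pairs):
    """Look up r_CRIT, interpolating linearly between rows (an assumption)."""
    if not R_CRIT_TABLE[0][0] <= n_pairs <= R_CRIT_TABLE[-1][0]:
        raise ValueError("n_pairs is outside the range covered by the table")
    for (n_lo, r_lo), (n_hi, r_hi) in zip(R_CRIT_TABLE, R_CRIT_TABLE[1:]):
        if n_lo <= n_pairs <= n_hi:
            fraction = (n_pairs - n_lo) / (n_hi - n_lo)
            return r_lo + fraction * (r_hi - r_lo)


def is_significant(r_calc_value, n_pairs):
    """Apply the decision rule |r_CALC| > r_CRIT."""
    return abs(r_calc_value) > r_crit(n_pairs)


# The example from the text: 60 ordered pairs, r_CALC = -0.47.
print(round(r_crit(60), 2))       # 0.26
print(is_significant(-0.47, 60))  # True
```

For sample sizes outside 5 to 100 the sketch raises an error rather than extrapolating, since the slide gives no values there.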
Learning Objectives