Probabilities... (A.K.A. Chance) Definitions
The closer the correlation value is to -1 or 1, the tighter (more linear) the relationship will be on a scatter plot (see below on Pearson’s coefficient).

[Figure: correlation scale running from Perfect Positive Correlation (1) through High Positive Correlation (0.9), Low Positive Correlation (0.5) and No Correlation (0) to Low Negative, High Negative and Perfect Negative Correlation]

Caution Hazard

Correlation DOES NOT prove Causation!
Beware temptation to say that a correlation between two things means one causes the other. For example…
There may be a correlation between sweater and snow-shovel sales. However, that does not mean that sweaters make people buy snow-shovels.
All that we can say with a correlation is that there is a relationship/link between sweater sales and snow-shovel sales.
Probabilities ARE NOT Guarantees!
Probability tells you that over the long run there is a certain chance of something happening, not that something will or will not happen at a specific time.
In other words, probabilities are great for general predictions about long term events, but they cannot and do not predict specific events.

Over Generalization of Results
If you calculate a correlation on a specific population, you cannot then say that the correlation is the same for the general population.

There are different correlation calculations (called coefficients) for different kinds of data:
Pearson’s Coefficient – Measures linear relationship between two variables
Spearman’s Rank Coefficient – Measures relationship between two ordinal variables
Phi Coefficient – Measures relationship between two dichotomous variables
Pearson’s Coefficient is the most popular and is what analytical tools use.

In this example we have two things to compare, X and Y.
1. First calculate the Mean (average) of X
2. Calculate the Mean (average) of Y
3. Subtract the Mean of X from each of the X values (we’ll call these A), and subtract the Mean of Y from each of the Y values (we’ll call these B)
4. Square the A’s (we’ll call these C2’s)
5. Square the B’s (we’ll call these D2’s)
6. Multiply all A’s by B’s (we’ll call these AB’s)
7. Add up all AB’s
8. Add up all C2’s
9. Add up all D2’s
10. Now perform the calculation below…
$$\text{Correlation} = \frac{\text{Sum of all AB’s}}{\sqrt{(\text{Sum of C2’s}) \times (\text{Sum of D2’s})}}$$
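The ten steps can be sketched in Python as follows. This is only an illustration, not a prescribed implementation: the function name `pearson_correlation` and the sample lists are made up for the example, and the denominator is the square root of the product of the two sums, as in the standard form of Pearson’s coefficient.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson's coefficient, following the ten steps listed above."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")

    mean_x = sum(xs) / len(xs)                   # step 1: Mean of X
    mean_y = sum(ys) / len(ys)                   # step 2: Mean of Y
    a = [x - mean_x for x in xs]                 # step 3: the A values
    b = [y - mean_y for y in ys]                 # step 3: the B values
    sum_c2 = sum(v * v for v in a)               # steps 4 and 8: square A's, add up
    sum_d2 = sum(v * v for v in b)               # steps 5 and 9: square B's, add up
    sum_ab = sum(p * q for p, q in zip(a, b))    # steps 6 and 7: multiply, add up

    denominator = math.sqrt(sum_c2 * sum_d2)     # step 10: square root of the product
    if denominator == 0:
        raise ValueError("correlation is undefined when X or Y never varies")
    return sum_ab / denominator


# Hypothetical data: Y rises in lockstep with X, then falls in lockstep.
print(pearson_correlation([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))   # perfect positive
print(pearson_correlation([1, 2, 3, 4, 5], [10, 8, 6, 4, 2]))   # perfect negative
```

The explicit zero check is a design choice for the sketch: when every X (or every Y) value is identical, the sum of squares is 0 and the coefficient has no defined value, so raising an error seems safer than returning a number.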
Make sure to review Hazards! section regarding correlation and causation!