7. Chapter 14 Simple Linear Regression
Regression terminology:
Dependent variable: the variable being predicted.
Independent variable: the variable used to predict the value of the dependent variable.
Measuring Correlation
You are used to using qualitative terms such as “positive correlation”, “negative
correlation” and “no correlation” to describe the type of correlation, and terms such as
“perfect”, “strong” and “weak” to describe its strength.
The Product Moment Correlation Coefficient is one way to quantify this:
[Figure: scatter diagrams showing perfectly correlated, partly correlated and uncorrelated data]
Correlation
Properties
• r is always a number between -1 and 1.
• r > 0 indicates a positive association.
• r < 0 indicates a negative association.
• Values of r near 0 indicate a very weak linear relationship.
• The strength of the linear relationship increases as r moves away from 0 toward -1 or 1.
• The extreme values r = -1 and r = 1 occur only in the case of a perfect linear relationship.
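The properties above can be checked numerically. The following is a minimal sketch, assuming the usual definition r = Sxy / sqrt(Sxx * Syy); the function name pearson_r and the data are made up for illustration and do not come from the chapter.

```python
import math


def pearson_r(xs, ys):
    """Product moment correlation coefficient of paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared and cross deviations about the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Illustrative (made-up) data
x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [2, 4, 6, 8, 10])    # points lie exactly on a rising line
r_down = pearson_r(x, [10, 8, 6, 4, 2])  # points lie exactly on a falling line
r_mixed = pearson_r(x, [2, 4, 5, 4, 5])  # positive but not perfect

print(r_up, r_down, r_mixed)
```

The two exact-line cases sit at the extremes of the range, while the scattered data gives a value strictly between 0 and 1.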
Definition
A positive correlation exists when, as one variable increases, the other tends to
increase, and as one decreases, the other tends to decrease.
A negative correlation exists when, as one variable increases, the other tends to
decrease, and vice versa.
Example
Solution
The following instructions are for the Casio ClassWiz.
Press MODE, then select ‘6: Statistics’.
Data Entry
PMCC
The coefficient of determination
Solution
Lines of best fit
The correlation coefficient measures the degree of correlation between two
variables, but it does not tell us how to predict values for one variable y given
values for the other variable x. To do that, we need to find a line which is a
good fit for the points on a scatter graph, and use that line to find the value
of y corresponding to each given value of x.
The scattergraph method
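One way to make the idea of a line of best fit concrete without drawing it by eye is the least squares line. The source gives no formula at this point, so least squares is an assumption here, chosen because it is a standard method. The sketch below uses made-up data and a hypothetical helper fit_line to compute the intercept b0 and slope b1, then reads off a predicted y for a given x.

```python
def fit_line(xs, ys):
    """Return (b0, b1) for the least squares line y_hat = b0 + b1 * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are equal, so the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept: the line passes through the means
    return b0, b1


# Illustrative (made-up) data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_line(x, y)
y_hat = b0 + b1 * 3.5  # predicted y at x = 3.5, inside the observed x range

print(f"y_hat = {b0:.2f} + {b1:.2f} x")
print(f"predicted y at x = 3.5: {y_hat:.2f}")
```

The prediction is taken at an x value inside the range of the data; using the line far outside that range is less reliable.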
Homework:
Page 608: Problems 5, 9 and 13
Page 619: Problems 19 and 21.
Simple Linear Regression Model
The equation that describes how y is related to x and an error term is called
the regression model.
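The slide's own formula did not survive extraction. In the usual textbook notation (an assumption here), the model, the regression equation for the mean of y, and the estimated regression equation can be written as:

```latex
\begin{align*}
y       &= \beta_0 + \beta_1 x + \varepsilon && \text{(regression model)} \\
E(y)    &= \beta_0 + \beta_1 x               && \text{(regression equation)} \\
\hat{y} &= b_0 + b_1 x                       && \text{(estimated regression equation)}
\end{align*}
```

Here \beta_0 and \beta_1 are the unknown intercept and slope, \varepsilon is the error term, and b_0 and b_1 are the estimates of \beta_0 and \beta_1 computed from sample data.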
Model Assumptions
Estimation and Prediction
If a significant relationship exists between x and y
and the coefficient of determination shows that
the fit is good, the estimated regression equation
should be useful for estimation and prediction.
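As a rough illustration of checking the fit, the sketch below computes the coefficient of determination as r^2 = 1 - SSE/SST for a least squares line. The data and the function name r_squared are made up for illustration, and the choice of least squares is an assumption rather than something stated on the slide.

```python
def r_squared(xs, ys):
    """Coefficient of determination for the least squares line through (xs, ys)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sst = sum((y - mean_y) ** 2 for y in ys)  # total sum of squares
    if sxx == 0 or sst == 0:
        raise ValueError("r^2 is undefined when one variable is constant")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    # Sum of squared residuals about the fitted line
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - sse / sst


# Illustrative (made-up) data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r2 = r_squared(x, y)
print(f"r^2 = {r2:.2f}")
```

A value close to 1 means the line accounts for most of the variation in y; a value close to 0 means it accounts for very little.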
Point estimators and predictors do not provide any information about the precision associated with the
estimate and/or prediction. We must develop confidence intervals and prediction intervals.
A confidence interval is an interval estimate of the mean value of y for a given value of x.
A prediction interval is used whenever we want to predict an individual value of y for a new observation
corresponding to a given value of x.
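The difference between the two intervals can be sketched in code. The example below assumes the standard formulas for simple linear regression: with s = sqrt(SSE / (n - 2)), the confidence interval margin is t * s * sqrt(1/n + (x* - x_bar)^2 / Sxx) and the prediction interval margin is t * s * sqrt(1 + 1/n + (x* - x_bar)^2 / Sxx). The data, the function name intervals and the parameter names are hypothetical. The t value is passed in by the caller; 3.182 is the t-table value for 95% confidence with n - 2 = 3 degrees of freedom.

```python
import math


def intervals(xs, ys, x_star, t_crit):
    """Return (confidence_interval, prediction_interval) for y at x_star.

    t_crit is the t value for the chosen confidence level with n - 2
    degrees of freedom, supplied by the caller (for example from a t table).
    """
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two equal-length samples with at least 3 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are equal, so the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))  # standard error of the estimate
    y_hat = b0 + b1 * x_star      # point estimate at x_star
    h = 1 / n + (x_star - mean_x) ** 2 / sxx
    ci_margin = t_crit * s * math.sqrt(h)      # for the mean value of y
    pi_margin = t_crit * s * math.sqrt(1 + h)  # for an individual value of y
    ci = (y_hat - ci_margin, y_hat + ci_margin)
    pi = (y_hat - pi_margin, y_hat + pi_margin)
    return ci, pi


# Illustrative (made-up) data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
T_CRIT = 3.182  # assumed: 95% confidence, 3 degrees of freedom, from a t table

ci, pi = intervals(x, y, x_star=3, t_crit=T_CRIT)
print("confidence interval:", ci)
print("prediction interval:", pi)
```

The prediction interval is always wider than the confidence interval at the same x, because an individual observation varies about the mean in addition to the uncertainty in the estimated line.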