Chapter 12
Simple Regression
Chapter Goals
Hypothesis Test for Correlation
Decision Rules
[Figure: rejection regions for the lower-tail test (α), the upper-tail test (α), and the two-tail test (α/2 in each tail)]
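As a rough illustration of the two-tail decision rule, the sketch below tests H0: ρ = 0 with the standard statistic t = r√(n − 2)/√(1 − r²), which has n − 2 degrees of freedom. The inputs r = 0.76211 and n = 10 are borrowed from the house price output later in this chapter. The critical value 2.306 (8 d.f., α/2 = 0.025) is an assumption hardcoded from a t table, and the function name is hypothetical.

```python
import math

def correlation_t_stat(r: float, n: int) -> float:
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Assumed inputs: r and n from the house price example in this chapter.
r, n = 0.76211, 10
t_stat = correlation_t_stat(r, n)

# Assumed critical value: t with 8 d.f. and alpha/2 = 0.025, read from a t table.
t_crit = 2.306

# Two-tail decision rule: reject H0 if |t| exceeds the critical value.
reject_h0 = abs(t_stat) > t_crit
print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```

For a one-tail test the same statistic is compared against the α critical value in the relevant tail instead of α/2 in both.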
Linear Regression Model
Simple Linear Regression Model
The population regression model:
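$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$

where:
$Y_i$ = Dependent Variable
$\beta_0$ = Population Y intercept
$\beta_1$ = Population Slope Coefficient
$X_i$ = Independent Variable
$\varepsilon_i$ = Random Error term
$\beta_0 + \beta_1 X_i$ = Linear component
$\varepsilon_i$ = Random Error component

Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.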
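Simple Linear Regression Model (continued)
[Figure: Y plotted against X with the population regression line (Intercept = β0, Slope = β1). For a given Xi, the Observed Value of Y differs from the Predicted Value of Y on the line by the Random Error εi for this Xi value.]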
Least Squares Estimators
Finding the Least Squares Equation
▪ The random error terms, εi, are not correlated with one another, so that $E[\varepsilon_i \varepsilon_j] = 0$ for all $i \ne j$
Interpretation of the Slope and the Intercept
Simple Linear Regression Example
Sample Data for House Price Model
House Price in $1000s (Y)    Square Feet (X)
245                          1400
312                          1600
279                          1700
308                          1875
199                          1100
219                          1550
405                          2350
324                          2450
319                          1425
255                          1700
Graphical Presentation
Multiple R 0.76211
R Square 0.58082
Observations 10
ANOVA
df SS MS F Significance F
Total 9 32600.5000

The regression equation is:
house price = 98.248 + 0.10977 (square feet)
Intercept = 98.248
Slope = 0.10977
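The slope and intercept reported above can be reproduced from the sample data with the usual least squares formulas, b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² and b0 = ȳ − b1x̄. The following is a minimal sketch in plain Python; the function and variable names are illustrative and do not come from the slides.

```python
# Sample data from the house price table (price in $1000s, size in square feet).
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]

def least_squares(x, y):
    """Return (b0, b1) minimizing the sum of squared errors for y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx           # slope
    b0 = y_bar - b1 * x_bar    # intercept
    return b0, b1

b0, b1 = least_squares(square_feet, price)
print(f"intercept b0 = {b0:.3f}, slope b1 = {b1:.5f}")
```

Read in the units of the data, the slope says that each additional square foot is associated with an average increase of about 0.10977 × $1000 = $109.77 in price.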
where:
$\bar{y}$ = Average value of the dependent variable
$y_i$ = Observed values of the dependent variable
$\hat{y}_i$ = Predicted value of y for the given $x_i$ value
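These three quantities are what the sums of squares are built from. The sketch below assumes the standard definitions SST = Σ(yi − ȳ)², SSR = Σ(ŷi − ȳ)², and SSE = Σ(yi − ŷi)², and checks numerically on the house price data that SST = SSR + SSE. Variable names are illustrative.

```python
# Sample data from the house price table (price in $1000s, size in square feet).
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]

n = len(price)
x_bar = sum(square_feet) / n
y_bar = sum(price) / n

# Least squares fit (slope b1, intercept b0).
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, price))
      / sum((x - x_bar) ** 2 for x in square_feet))
b0 = y_bar - b1 * x_bar
predicted = [b0 + b1 * x for x in square_feet]

# Total, regression, and error sums of squares (assumed standard definitions).
sst = sum((y - y_bar) ** 2 for y in price)
ssr = sum((y_hat - y_bar) ** 2 for y_hat in predicted)
sse = sum((y - y_hat) ** 2 for y, y_hat in zip(price, predicted))

print(f"SST = {sst:.4f}, SSR = {ssr:.4f}, SSE = {sse:.4f}")
print(f"SSR + SSE = {ssr + sse:.4f}")
print(f"r2 = SSR / SST = {ssr / sst:.5f}")
```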
Measures of Variation (continued)
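Examples of Approximate r2 Values
[Figure: example scatter plots of Y against X for r2 = 1 and r2 = 0]
r2 = 1: perfect linear relationship between X and Y
r2 = 0: no linear relationship between X and Y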
Multiple R 0.76211
R Square 0.58082
Adjusted R Square 0.52842
Standard Error 41.33032
Observations 10

58.08% of the variation in house prices is explained by variation in square feet

ANOVA
df SS MS F Significance F
Total 9 32600.5000
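The Adjusted R Square and Standard Error lines follow from R Square, the total sum of squares, and the sample size. The sketch below assumes the usual formulas, adjusted r² = 1 − (1 − r²)(n − 1)/(n − k − 1) and standard error = √(SSE/(n − k − 1)), with k = 1 independent variable. Because it starts from the rounded values printed in the output, the results agree only to about four significant digits.

```python
import math

# Values as reported in the Excel output.
n = 10              # observations
r_square = 0.58082
sst = 32600.5       # total sum of squares

# Assumption: one independent variable (square feet).
k = 1

adj_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)
sse = (1 - r_square) * sst               # unexplained variation
std_error = math.sqrt(sse / (n - k - 1))

print(f"Adjusted R Square = {adj_r_square:.5f}")
print(f"Standard Error = {std_error:.3f}")
```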
Excel Output
Regression Statistics
Multiple R 0.76211
R Square 0.58082
Observations 10
ANOVA
df SS MS F Significance F
Total 9 32600.5000
where:
$s_{b_1}$ = Estimate of the standard error of the least squares slope
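One common way to write this estimate is s_b1 = s_e / √Σ(xi − x̄)², where s_e is the standard error of the estimate. The sketch below assumes that formula, takes s_e = 41.33032 from the Excel output, and uses the square feet column of the sample data. Variable names are illustrative.

```python
import math

# Square feet column of the house price sample data.
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]

# Standard error of the estimate, as reported in the Excel output.
std_error = 41.33032

x_bar = sum(square_feet) / len(square_feet)
s_xx = sum((x - x_bar) ** 2 for x in square_feet)  # sum of squared deviations of x

# Estimated standard error of the least squares slope.
s_b1 = std_error / math.sqrt(s_xx)
print(f"s_b1 = {s_b1:.5f}")
```

The more the x values are spread out, the larger Σ(xi − x̄)² becomes and the more precisely the slope is estimated.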
d.f. = n - 2
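▪ F Test statistic: $F = \frac{MSR}{MSE}$
where $MSR = \frac{SSR}{k}$ and $MSE = \frac{SSE}{n - k - 1}$ (k = number of independent variables; k = 1 in simple regression)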
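For the house price example, the F statistic can be rebuilt from the values in the Excel output. The sketch below assumes F = MSR/MSE with MSR = SSR/k and MSE = SSE/(n − k − 1), derives SSR and SSE from R Square and the total sum of squares, and compares F with a critical value of 5.32 (1 and 8 d.f., α = 0.05), which is an assumption hardcoded from an F table.

```python
# Values as reported in the Excel output.
n = 10
r_square = 0.58082
sst = 32600.5

k = 1  # one independent variable in simple regression

ssr = r_square * sst          # explained variation
sse = sst - ssr               # unexplained variation
msr = ssr / k
mse = sse / (n - k - 1)
f_stat = msr / mse

# Assumed critical value: F with (1, 8) d.f. at alpha = 0.05, read from an F table.
f_crit = 5.32
reject_h0 = f_stat > f_crit
print(f"F = {f_stat:.3f}, reject H0 (slope = 0): {reject_h0}")
```

With a single independent variable, F equals the square of the t statistic for the slope, so the two tests lead to the same conclusion.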
Multiple R 0.76211
R Square 0.58082
ANOVA
df SS MS F Significance F
Total 9 32600.5000

              Coefficients  Standard Error  t Stat   P-value  Lower 95%  Upper 95%
Square Feet   0.10977       0.03297         3.32938  0.01039  0.03374    0.18580
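The t statistic and the 95% interval for the slope can be checked directly from the coefficient and its standard error. The sketch below assumes t = b1 / s_b1 (testing H0: β1 = 0) and the interval b1 ± t(n − 2, α/2) · s_b1. The critical value 2.306 (8 d.f.) is hardcoded from a t table.

```python
# Values as reported in the Excel output for Square Feet.
b1 = 0.10977      # slope coefficient
s_b1 = 0.03297    # standard error of the slope

# Assumed critical value: t with 8 d.f. and alpha/2 = 0.025, read from a t table.
t_crit = 2.306

# Test statistic for H0: beta1 = 0.
t_stat = b1 / s_b1

# 95% confidence interval for the slope.
lower = b1 - t_crit * s_b1
upper = b1 + t_crit * s_b1

print(f"t = {t_stat:.3f}")
print(f"95% CI for slope: ({lower:.5f}, {upper:.5f})")
```

The interval excludes 0, which agrees with rejecting H0: β1 = 0 at the 5% level.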
F-Test for Significance (continued)
Risky to try to extrapolate far beyond the range of observed X’s
Estimating Mean Values and Predicting Individual Values
Goal: Form intervals around y to express
uncertainty about the value of y for a given xi
[Figure: the fitted line ŷ = b0 + b1xi plotted against X, with two intervals marked around ŷ at a given xi]
Confidence Interval for the expected value of y, given xi
Prediction Interval for a single observed y, given xi
Confidence Interval for the Average Y, Given X
Confidence interval estimate for the
expected value of y given a particular xi
The size of the interval varies with the distance of $x_{n+1}$ from the mean, $\bar{x}$
Prediction Interval for an Individual Y, Given X
Confidence interval estimate for an actual
observed value of y given a particular xi
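Both intervals can be sketched for the house price data. The code below assumes the standard forms ŷ ± t · s_e · √(1/n + (x − x̄)²/Σ(xi − x̄)²) for the expected value and ŷ ± t · s_e · √(1 + 1/n + (x − x̄)²/Σ(xi − x̄)²) for a single observation. The x value of 2000 square feet is a hypothetical input chosen for illustration, and the critical value 2.306 (8 d.f., 95% level) is hardcoded from a t table.

```python
import math

# Sample data from the house price table (price in $1000s, size in square feet).
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]

n = len(price)
x_bar = sum(square_feet) / n
y_bar = sum(price) / n
s_xx = sum((x - x_bar) ** 2 for x in square_feet)

# Least squares fit and standard error of the estimate.
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, price)) / s_xx
b0 = y_bar - b1 * x_bar
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(square_feet, price))
s_e = math.sqrt(sse / (n - 2))

x_new = 2000     # hypothetical house size, not taken from the slides
t_crit = 2.306   # assumed: t with 8 d.f. and alpha/2 = 0.025, from a t table

y_hat = b0 + b1 * x_new
distance_term = 1 / n + (x_new - x_bar) ** 2 / s_xx

ci_margin = t_crit * s_e * math.sqrt(distance_term)       # expected value of y
pi_margin = t_crit * s_e * math.sqrt(1 + distance_term)   # single observed y

print(f"y_hat = {y_hat:.2f}")
print(f"confidence interval: {y_hat:.2f} +/- {ci_margin:.2f}")
print(f"prediction interval: {y_hat:.2f} +/- {pi_margin:.2f}")
```

The prediction interval is always the wider of the two, because it adds the variability of an individual observation to the uncertainty in the estimated mean.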
▪ In Excel, use PHStat | regression | simple linear regression …
▪ Check the “confidence and prediction interval for x=” box and enter the x-value and confidence level desired
[Figure: PHStat output showing the input values and ŷ]