Simple Linear Reg Ex 1
SimpleLinearRegEx1-1
Sample Data for House Price Model
House Price in $1000s (Y)   Square Feet (X)
245                         1400
312                         1600
279                         1700
308                         1875
199                         1100
219                         1550
405                         2350
324                         2450
319                         1425
255                         1700
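As a rough sketch, the least squares line for this sample can be computed with the Python standard library alone. The names `prices`, `sqft` and `fit_simple_regression` are illustrative and not part of the original slides; the formulas are the usual least squares estimates for one independent variable.

```python
# House price sample from the table above: price in $1000s (y), square feet (x).
prices = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]


def fit_simple_regression(x, y):
    """Return (b0, b1) for the least squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares about the means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx          # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1


b0, b1 = fit_simple_regression(sqft, prices)
print(f"intercept b0 = {b0:.5f}, slope b1 = {b1:.5f}")
```

The printed coefficients should agree with the Intercept and Square Feet coefficients in the Excel output to the displayed precision.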
Graphical Presentation
• House price model: scatter plot
Regression Using Excel
• Tools / Data Analysis / Regression
Excel Output
Regression Statistics
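Multiple R 0.76211
R Square 0.58082
Adjusted R Square 0.52842
Standard Error 41.33032
Observations 10
ANOVA
df SS MS F Significance F
Regression 1 18934.9348 18934.9348 11.0848 0.01039
Residual 8 13665.5652 1708.1957
Total 9 32600.5000
The regression equation is:
house price = 98.24833 + 0.10977 (square feet)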
Graphical Presentation
• House price model: scatter plot and regression line
Slope = 0.10977
Intercept = 98.248
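Interpretation of the Intercept, b0
• b0 is the estimated average value of Y when the value of X is zero
• No house in the sample has 0 square feet (sizes range from 1100 to 2450), so b0 = 98.248 has no direct interpretation as a price here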
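Interpretation of the Slope Coefficient, b1
• b1 is the estimated change in the average value of Y for a one-unit increase in X
• Here, b1 = 0.10977 means the average house price rises by 0.10977 × $1000 = $109.77 for each additional square foot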
Measures of Variation
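• Total variation is made up of two parts: SST = SSR + SSE
SST = Σ(yi − ȳ)² (total sum of squares)
SSR = Σ(ŷi − ȳ)² (regression sum of squares)
SSE = Σ(yi − ŷi)² (error sum of squares)
where:
ȳ = Average value of the dependent variable
yi = Observed values of the dependent variable
ŷi = Predicted value of y for the given xi value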
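As an illustration, the total, regression and error sums of squares (SST, SSR, SSE) can be computed from the house price sample. This is a minimal sketch with hypothetical variable names; it refits the least squares line so that the block is self-contained.

```python
# Sample data: house price in $1000s (y) and square feet (x).
prices = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]

n = len(prices)
x_bar = sum(sqft) / n
y_bar = sum(prices) / n

# Least squares slope and intercept.
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, prices))
      / sum((x - x_bar) ** 2 for x in sqft))
b0 = y_bar - b1 * x_bar

# Predicted value of y for each observed x.
predicted = [b0 + b1 * x for x in sqft]

sst = sum((y - y_bar) ** 2 for y in prices)                    # total variation
ssr = sum((y_hat - y_bar) ** 2 for y_hat in predicted)         # explained by the line
sse = sum((y - y_hat) ** 2 for y, y_hat in zip(prices, predicted))  # unexplained

print(f"SST = {sst:.4f}, SSR = {ssr:.4f}, SSE = {sse:.4f}")
```

The three values should match the Total, Regression and Residual rows of the SS column in the Excel ANOVA table, and SSR + SSE should add up to SST apart from floating-point rounding.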
Measures of Variation
(continued)
Coefficient of Determination, R²
• The coefficient of determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable
• The coefficient of determination is also called R-squared and is denoted as R²
R² = SSR / SST = regression sum of squares / total sum of squares
note: 0 ≤ R² ≤ 1
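A short sketch of the arithmetic, using the sums of squares reported in the Excel ANOVA table. The adjusted R-squared line uses the standard formula 1 − (1 − R²)(n − 1)/(n − k − 1) with k = 1 independent variable; that formula is an assumption added here, since the slides only report its value.

```python
# Sums of squares taken from the Excel ANOVA table.
ssr = 18934.9348   # regression sum of squares
sst = 32600.5000   # total sum of squares
n = 10             # observations
k = 1              # independent variables (assumed: square feet only)

# Share of the total variation explained by the regression.
r_squared = ssr / sst

# Adjusted R-squared (standard formula, not shown in the slides).
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"R Square = {r_squared:.5f}")
print(f"Adjusted R Square = {adj_r_squared:.5f}")
```

Both values should agree with the R Square and Adjusted R Square rows of the Regression Statistics block.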
Examples of Approximate r² Values
[Figure: scatter plots of Y against X with r² = 1]
Examples of Approximate r² Values
[Figure: scatter plot of Y against X with 0 < r² < 1]
Examples of Approximate r² Values
r² = 0
No linear relationship between X and Y
[Figure: scatter plot of Y against X with r² = 0]
Excel Output
Regression Statistics
Multiple R 0.76211
R Square 0.58082
Adjusted R Square 0.52842
Standard Error 41.33032
Observations 10
58.08% of the variation in house prices is explained by variation in square feet
ANOVA
df SS MS F Significance F
Regression 1 18934.9348 18934.9348 11.0848 0.01039
Residual 8 13665.5652 1708.1957
Total 9 32600.5000
Correlation and R²
• The coefficient of determination, R², for a simple regression is equal to the simple correlation squared
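This can be checked numerically on the house price sample. The sketch below computes the sample correlation r between square feet and price and squares it; the variable names are illustrative only.

```python
import math

# Sample data: house price in $1000s (y) and square feet (x).
prices = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]

n = len(prices)
x_bar = sum(sqft) / n
y_bar = sum(prices) / n

# Sums of squares and cross-products about the means.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, prices))
s_xx = sum((x - x_bar) ** 2 for x in sqft)
s_yy = sum((y - y_bar) ** 2 for y in prices)

# Simple (Pearson) correlation between X and Y.
r = s_xy / math.sqrt(s_xx * s_yy)

print(f"r = {r:.5f}, r squared = {r ** 2:.5f}")
```

Here r should match the Multiple R value in the Excel output and its square should match R Square.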
Estimation of Model Error Variance
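• An estimator for the variance of the population model error is
se² = SSE / (n – 2)
• Division by n – 2 instead of n – 1 is because the simple regression model uses two estimated parameters, b0 and b1, instead of one
• se = √(se²) is the standard error of the estimate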
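As a worked example, the error variance estimate SSE / (n – 2) and its square root can be reproduced from the Residual row of the Excel ANOVA table. This is only a sketch of the arithmetic; the variable names are illustrative.

```python
import math

sse = 13665.5652   # residual (error) sum of squares from the ANOVA table
n = 10             # number of observations

# Two parameters (b0 and b1) are estimated, so divide by n - 2.
error_variance = sse / (n - 2)
standard_error = math.sqrt(error_variance)

print(f"error variance = {error_variance:.4f}")
print(f"standard error of the estimate = {standard_error:.5f}")
```

The variance should match the Residual MS entry and the square root should match the Standard Error row of the Regression Statistics block.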
Excel Output
Regression Statistics
Multiple R 0.76211
R Square 0.58082
Adjusted R Square 0.52842
Standard Error 41.33032
Observations 10
ANOVA
df SS MS F Significance F
Regression 1 18934.9348 18934.9348 11.0848 0.01039
Residual 8 13665.5652 1708.1957
Total 9 32600.5000
Comparing Standard Errors
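se is a measure of the variation of observed y values from the regression line
[Figure: two scatter plots of Y against X, each with a fitted regression line]
The magnitude of se should be judged relative to the size of the y values in the sample data, e.g., se = $41.33K is moderately small relative to house prices in the $200K - $300K range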
Inferences About the Regression Model
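• The standard error of the regression slope coefficient (b1) is estimated by
sb1 = se / √( Σ(xi − x̄)² )
where:
sb1 = Estimate of the standard error of the least squares slope
se = √( SSE / (n − 2) ) = Standard error of the estimate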
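A minimal sketch of this calculation, assuming the usual formula sb1 = se / √(Σ(xi − x̄)²) and taking se = 41.33032 from the Excel Regression Statistics. The variable names are hypothetical.

```python
import math

# Square feet (x) for the ten houses in the sample.
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
se = 41.33032  # standard error of the estimate, from the Excel output

x_bar = sum(sqft) / len(sqft)

# Sum of squared deviations of x about its mean.
s_xx = sum((x - x_bar) ** 2 for x in sqft)

# Estimated standard error of the least squares slope.
sb1 = se / math.sqrt(s_xx)

print(f"sb1 = {sb1:.5f}")
```

The result should agree with the Standard Error reported for the Square Feet coefficient in the Excel coefficient table.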
ANOVA
df SS MS F Significance F
Regression 1 18934.9348 18934.9348 11.0848 0.01039
Residual 8 13665.5652 1708.1957
Total 9 32600.5000
Inference about the Slope: t Test
• t test for a population slope
• Is there a linear relationship between X and Y?
• Null and alternative hypotheses
H0 : β1 = 0 (no linear relationship)
H1 : β1 ≠ 0 (linear relationship does exist)
• Under H0, the test statistic is
t = (b1 − β1) / sb1, with d.f. = n − 2
where:
b1 = regression slope coefficient
β1 = hypothesized slope
sb1 = standard error of the slope
Inference about the Slope: t Test
(continued)
Inferences about the Slope: t Test Example
Inferences about the Slope: t Test Example
(continued)
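H0 : β1 = 0
H1 : β1 ≠ 0
From Excel output:
             Coefficients   Standard Error   t Stat    P-value
Intercept    98.24833       58.03348         1.69296   0.12892
Square Feet  0.10977        0.03297          3.32938   0.01039
(b1 = 0.10977, sb1 = 0.03297, t = 3.32938)
d.f. = n − 2 = 10 − 2 = 8
α/2 = .025 in each tail, t8,.025 = 2.3060
Test Statistic: t = 3.329
Decision: Reject H0, since t = 3.329 > 2.3060
Conclusion: There is sufficient evidence of a linear relationship between square feet and house price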
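The decision rule can be sketched in a few lines. The slope, its standard error and the critical value t8,.025 = 2.3060 are taken from the slide as given; the critical value is not computed here, since the Python standard library has no t distribution. Variable names are illustrative.

```python
# Values from the Excel coefficient table for Square Feet.
b1 = 0.10977          # estimated slope
sb1 = 0.03297         # standard error of the slope
beta1_h0 = 0.0        # hypothesized slope under H0

# Two-tailed critical value for alpha = .05 with d.f. = 10 - 2 = 8 (from the slide).
t_critical = 2.3060

# Test statistic for H0: beta1 = 0.
t_stat = (b1 - beta1_h0) / sb1

# Reject H0 when the statistic falls in either tail.
reject_h0 = abs(t_stat) > t_critical

print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```

Because the inputs are rounded to five decimals, the computed statistic can differ from Excel's t Stat in the last digit or so.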
F-Test for Significance
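• F Test statistic:
F = MSR / MSE
where
MSR = SSR / k
MSE = SSE / (n − k − 1)
F follows an F distribution with k numerator and (n − k − 1) denominator degrees of freedom (k = the number of independent variables in the regression model)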
F-Test for Significance
(continued)
H0 : β1 = 0
H1 : β1 ≠ 0
α = .05
df1 = 1, df2 = 8
Critical Value: F.05 = 5.32
Test Statistic: F = MSR / MSE = 18934.9348 / 1708.1957 = 11.08
Decision: Reject H0 at α = 0.05
Conclusion: There is sufficient evidence that house size affects selling price
[Figure: F distribution; do not reject H0 for F below F.05 = 5.32, reject H0 in the upper tail of area α = .05]
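As a sketch of the F test arithmetic, the statistic is the regression mean square divided by the residual mean square, using the ANOVA sums of squares. The critical value 5.32 is taken from the slide rather than computed, and the variable names are hypothetical. The last lines check the standard fact that, with a single independent variable, F equals the square of the slope t statistic.

```python
# Sums of squares from the Excel ANOVA table.
ssr = 18934.9348   # regression
sse = 13665.5652   # residual
n = 10             # observations
k = 1              # independent variables

f_critical = 5.32  # F.05 with df1 = 1, df2 = 8 (from the slide)

msr = ssr / k                # mean square regression
mse = sse / (n - k - 1)      # mean square error
f_stat = msr / mse

reject_h0 = f_stat > f_critical

# With one independent variable, F is the square of the slope t statistic.
t_stat = 3.32938             # t Stat for Square Feet from the Excel output
f_from_t = t_stat ** 2

print(f"F = {f_stat:.4f}, reject H0: {reject_h0}, t squared = {f_from_t:.4f}")
```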
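Predictions Using Regression Analysis
Predict the price for a house with 2000 square feet:
house price = 98.24833 + 0.10977 (2000) ≈ 317.8
The predicted price for a house with 2000 square feet is about 317.8 ($1000s), i.e. roughly $317,800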
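A small sketch of the prediction step, plugging the Excel coefficients into the fitted line. The function name `predict_price` is hypothetical, and the range check reflects the smallest and largest house sizes in the sample (1100 and 2450 square feet).

```python
# Coefficients from the Excel output (price in $1000s).
B0 = 98.24833   # intercept
B1 = 0.10977    # slope per square foot

# Range of house sizes observed in the sample.
MIN_SQFT, MAX_SQFT = 1100, 2450


def predict_price(square_feet):
    """Predicted house price in $1000s for a given size in square feet."""
    return B0 + B1 * square_feet


size = 2000
predicted = predict_price(size)
in_sample_range = MIN_SQFT <= size <= MAX_SQFT

print(f"predicted price: {predicted:.2f} ($1000s), within sample range: {in_sample_range}")
```

The range flag is a reminder that the fitted line is only supported by data for sizes between the observed minimum and maximum.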
Summary
• Introduced the linear regression model
• Reviewed correlation and the assumptions of linear regression
• Discussed estimating the simple linear regression coefficients
• Described measures of variation
• Described inference about the slope
• Addressed estimation of mean values and prediction of individual values