Chapter 9: Correlation & Simple Linear Regression
• Coefficient of Determination R²
• % of the variability in Y that can be explained by a linear relationship with X (the same value results if X and Y are swapped)
• Regression Line:
• Does the data support a linear relationship between Waist & Fat?
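As a rough illustration of the points above, the following Python sketch computes the sample correlation r and R² = r² for a pair of variables. The waist and fat values are made up for illustration only (they are not the course data), and the helper name pearson_r is hypothetical.

```python
import math

# Hypothetical (made-up) measurements, for illustration only.
waist = [32.0, 36.0, 38.0, 33.0, 39.0, 40.0, 41.0, 35.0]  # waist size
fat = [6.0, 21.0, 15.0, 6.0, 22.0, 31.0, 32.0, 21.0]      # body fat (%)


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(waist, fat)
r_squared = r ** 2  # in simple linear regression, R^2 equals r^2

print(f"r   = {r:.3f}")
print(f"R^2 = {r_squared:.3f}")
```

Because r is symmetric in its two arguments, regressing Fat on Waist or Waist on Fat gives the same R².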
Linear Reg.: Model Strength
• ANOVA in linear regression model
• SST: total variation in Y; = SSR + SSE
• SSR: variation explained by linear regression
• SSE: unexplained/error/residual variation
• Least-squares estimates minimize SSE
• Given x=x0, I have 95% confidence that the confidence interval will cover the mean
of y|x=x0
• I have 95% confidence that the next y corresponding to x0 will fall in the prediction
interval
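The decomposition and the two kinds of interval can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the x and y values are made up, the function name intervals is hypothetical, and the t critical value 2.306 (0.975 quantile, 8 degrees of freedom) is hard-coded from a standard t table to keep the example free of external libraries.

```python
import math

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares estimates (these minimize SSE).
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * xi for xi in x]

# ANOVA decomposition: SST = SSR + SSE.
sst = sum((yi - y_bar) ** 2 for yi in y)
ssr = sum((fi - y_bar) ** 2 for fi in fitted)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

df_error = n - 2
mse = sse / df_error
f_stat = (ssr / 1) / mse  # regression has 1 df in simple linear regression

# Assumption: t critical value for 95% intervals with n - 2 = 8 df.
T_CRIT = 2.306


def intervals(x0):
    """Return (y_hat, confidence interval for the mean, prediction interval)."""
    y_hat = b0 + b1 * x0
    s = math.sqrt(mse)
    se_mean = s * math.sqrt(1 / n + (x0 - x_bar) ** 2 / sxx)
    se_pred = s * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)
    ci = (y_hat - T_CRIT * se_mean, y_hat + T_CRIT * se_mean)
    pi = (y_hat - T_CRIT * se_pred, y_hat + T_CRIT * se_pred)
    return y_hat, ci, pi


y_hat, ci, pi = intervals(5.5)
print(f"SST = {sst:.4f}, SSR = {ssr:.4f}, SSE = {sse:.4f}, F = {f_stat:.1f}")
print(f"95% CI for mean of y at x0=5.5: ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"95% PI for next y at x0=5.5:    ({pi[0]:.3f}, {pi[1]:.3f})")
```

The prediction interval is always wider than the confidence interval at the same x0, because it adds the variability of a single new observation (the extra "1 +" under the square root) to the uncertainty in the estimated mean.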
Summary
• A regression model describes the conditional distribution of Y|X=x, or
certain characteristics of it, as a function of the explanatory variables x
• We estimate such models on the basis of samples of pairs of random
variables (Y,X)
• It is convenient to assume that a regression model consists of signal and
noise, i.e. a deterministic part and an error term
Extra: Dummy Coding
• Use (k-1) dummy variables for a k-level categorical predictor
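One way to picture the (k-1) rule is the small Python sketch below. The function dummy_code and the example levels are hypothetical, chosen only for illustration; the reference level is the one that gets all zeros.

```python
def dummy_code(values, reference=None):
    """Encode a categorical variable with k levels as k-1 indicator columns.

    The reference level is represented by a row of all zeros.
    """
    levels = sorted(set(values))
    if reference is None:
        reference = levels[0]
    if reference not in levels:
        raise ValueError(f"reference level {reference!r} not found in data")
    columns = [level for level in levels if level != reference]
    rows = [[1 if value == level else 0 for level in columns] for value in values]
    return columns, rows


# Hypothetical 3-level predictor: 3 levels, so 2 dummy variables.
colors = ["red", "green", "blue", "green"]
columns, rows = dummy_code(colors, reference="blue")

print(columns)  # ['green', 'red']
for value, row in zip(colors, rows):
    print(value, row)
```

Using k dummy variables together with an intercept would make the columns perfectly collinear, which is why one level is dropped and serves as the baseline.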
ANOVA
Source of Variation   SS            df   MS            F             P-value       F crit
Rows                  8.20068125    15   0.546712083   27.54621796   1.82574E-13   2.014803691
Columns               0.053654167    2   0.026827083   1.351688955   0.274112996   3.315829501
Error                 0.5954125     30   0.019847083
Total                 8.849747917   47
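The entries of the ANOVA table above are tied together by MS = SS/df and F = MS/MS(Error). The short Python sketch below re-derives them from the SS and df values read off the table; the variable names are illustrative only.

```python
# SS and df values as read from the ANOVA table.
ss = {"Rows": 8.20068125, "Columns": 0.053654167, "Error": 0.5954125}
df = {"Rows": 15, "Columns": 2, "Error": 30}

# Mean squares: MS = SS / df.
ms = {source: ss[source] / df[source] for source in ss}

# F statistics: each effect's MS divided by the error MS.
f_rows = ms["Rows"] / ms["Error"]
f_cols = ms["Columns"] / ms["Error"]

# Totals: SS and df add up across the sources of variation.
ss_total = sum(ss.values())
df_total = sum(df.values())

print(f"MS Rows    = {ms['Rows']:.9f}, F = {f_rows:.5f}")
print(f"MS Columns = {ms['Columns']:.9f}, F = {f_cols:.5f}")
print(f"MS Error   = {ms['Error']:.9f}")
print(f"SS Total   = {ss_total:.9f}, df Total = {df_total}")
```

Reading the table this way, the F statistic for Rows is far above its F crit and its P-value is tiny, while the F statistic for Columns is below its F crit and its P-value is about 0.27. Assuming F crit was computed at the usual 5% level, only the row effect is statistically significant.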