QBM101 Chapter10
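\[ SSE = \sum e^2 = \sum (y - \hat{y})^2 \]
\[ SS_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} \]
\[ SS_{yy} = \sum y^2 - \frac{(\sum y)^2}{n} \]
\[ SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n} \]
\[ b = \frac{SS_{xy}}{SS_{xx}}, \qquad a = \bar{y} - b\bar{x} \]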
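Least squares/best-fit line:
\[ \bar{x} = \frac{\sum x}{n} = \frac{386}{7} = 55.1429, \qquad \bar{y} = \frac{\sum y}{n} = \frac{108}{7} = 15.4286 \]
\[ SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n} = 6403 - \frac{(386)(108)}{7} = 447.5714 \]
\[ SS_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 23058 - \frac{(386)^2}{7} = 1772.8571 \]
\[ b = \frac{SS_{xy}}{SS_{xx}} = \frac{447.5714}{1772.8571} = 0.2525 \]
\[ a = \bar{y} - b\bar{x} = 15.4286 - (0.2525)(55.1429) = 1.5050 \]
\[ \hat{y} = a + bx = 1.5050 + 0.2525x \]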
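The arithmetic above can be checked with a short script. This is a minimal sketch that works only from the summary sums quoted in the example; the function name `fit_from_sums` is illustrative and not part of the course material. At full precision the intercept comes out slightly above 1.5050, because the slides round b to 0.2525 before computing a, so the sketch also reproduces the slide value with the rounded slope.

```python
# Least-squares fit from summary sums (values taken from the worked example).
n = 7
sum_x, sum_y = 386, 108
sum_xy, sum_x2 = 6403, 23058


def fit_from_sums(n, sum_x, sum_y, sum_xy, sum_x2):
    """Return (SS_xy, SS_xx, b, a) for the line y_hat = a + b*x."""
    ss_xy = sum_xy - sum_x * sum_y / n
    ss_xx = sum_x2 - sum_x ** 2 / n
    b = ss_xy / ss_xx                  # slope
    a = sum_y / n - b * sum_x / n      # intercept: y_bar - b * x_bar
    return ss_xy, ss_xx, b, a


ss_xy, ss_xx, b, a = fit_from_sums(n, sum_x, sum_y, sum_xy, sum_x2)

# Intercept computed the way the slides do it: with b rounded to 4 decimals.
a_slide = sum_y / n - round(b, 4) * sum_x / n

print(f"SS_xy = {ss_xy:.4f}, SS_xx = {ss_xx:.4f}")
print(f"b = {b:.4f}, a (full precision) = {a:.4f}, a (rounded b) = {a_slide:.4f}")
```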
Least squares/best-fit line (estimation and its reliability):
\[ b = \frac{SS_{xy}}{SS_{xx}} = \frac{447.5714}{1772.8571} = 0.2525 \]
\[ a = \bar{y} - b\bar{x} = 15.4286 - (0.2525)(55.1429) = 1.5050 \]
\[ \hat{y} = a + bx = 1.5050 + 0.2525x \]
Estimate the amount of food expenditure when the income is $6100 (x = 61).
\[ \hat{y} = a + bx = 1.5050 + 0.2525(61) = 16.9075 \text{ hundred} = \$1690.75 \]
\[ \text{Error: } e = y - \hat{y} = 16 - 16.9075 = -0.9075 \text{ hundred} = -\$90.75 \]
Estimate the amount of food expenditure when the income is $6000 (x = 60).
\[ \hat{y} = a + bx = 1.5050 + 0.2525(60) = 16.655 \text{ hundred} = \$1665.50 \]
The estimate is reliable because $60 \in (33, 83)$, the range of incomes in the sample.
Estimate the amount of food expenditure when the income is $2000 (x = 20).
\[ \hat{y} = a + bx = 1.5050 + 0.2525(20) = 6.555 \text{ hundred} = \$655.50 \]
The estimate is not reliable because $20 \notin (33, 83)$. *Extrapolation
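The reliability check can be sketched as a prediction helper that flags extrapolation. The function name `predict` is hypothetical, and treating the endpoints 33 and 83 as inside the range is an assumption; the slides only quote the interval (33, 83).

```python
# Fitted line from the worked example; x and y_hat are in hundreds of dollars.
A, B = 1.5050, 0.2525
X_MIN, X_MAX = 33, 83  # income range quoted in the text (endpoints assumed inclusive)


def predict(x):
    """Return (y_hat, reliable); reliable is False when x lies outside the range."""
    y_hat = A + B * x
    reliable = X_MIN <= x <= X_MAX
    return y_hat, reliable


for income in (61, 60, 20):
    y_hat, ok = predict(income)
    note = "within sample range" if ok else "extrapolation"
    print(f"x = {income}: y_hat = {y_hat:.4f} hundred ({note})")
```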
ERROR OF PREDICTION
Least squares/best-fit line (interpretation of regression coefficients):
\[ \hat{y} = a + bx = 1.5050 + 0.2525x \]
y-intercept, a = 1.5050
A family with RM0 income is predicted to spend RM1.5050 hundred = RM150.50 on food.
Slope coefficient, b = 0.2525
For every one-unit (RM100) increase in income, the expenditure on food increases, on average, by RM0.2525 hundred = RM25.25.
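The slope interpretation follows from comparing the predictions at two incomes that are one unit (RM100) apart. The short step below is added for illustration and is not part of the original slides:

```latex
\begin{align*}
\hat{y}(x+1) - \hat{y}(x) &= [a + b(x+1)] - [a + bx] \\
                          &= b = 0.2525 \text{ hundred} = \text{RM}25.25
\end{align*}
```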
Degrees of Freedom for a Simple Linear
Regression Model
df = n – 2
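Standard deviation of errors:
$\sigma_\epsilon$ is estimated by $s_e$:
\[ s_e = \sqrt{\frac{SSE}{n-2}}, \quad \text{where } SSE = \sum (y - \hat{y})^2 \text{ and } df = n - 2 \]
\[ s_e = \sqrt{\frac{SS_{yy} - b\,SS_{xy}}{n-2}} \]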
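The two expressions for $s_e$ agree because SSE can be rewritten in terms of the sums of squares. A brief derivation, assuming the standard identities $SS_{xx} = \sum (x - \bar{x})^2$, $SS_{yy} = \sum (y - \bar{y})^2$, $SS_{xy} = \sum (x - \bar{x})(y - \bar{y})$, and using $a = \bar{y} - b\bar{x}$:

```latex
\begin{align*}
SSE &= \sum (y - \hat{y})^2
     = \sum \left[ (y - \bar{y}) - b(x - \bar{x}) \right]^2 \\
    &= SS_{yy} - 2b\,SS_{xy} + b^2\,SS_{xx} \\
    &= SS_{yy} - 2b\,SS_{xy} + b\,SS_{xy}
       \qquad (\text{since } b\,SS_{xx} = SS_{xy}) \\
    &= SS_{yy} - b\,SS_{xy}
\end{align*}
```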
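Standard deviation of errors:
\[ b = \frac{SS_{xy}}{SS_{xx}} = \frac{447.5714}{1772.8571} = 0.2525 \]
\[ SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n} = 6403 - \frac{(386)(108)}{7} = 447.5714 \]
\[ SS_{yy} = \sum y^2 - \frac{(\sum y)^2}{n} = 1792 - \frac{(108)^2}{7} = 125.7143 \]
\[ s_e = \sqrt{\frac{SS_{yy} - b\,SS_{xy}}{n-2}} = \sqrt{\frac{125.7143 - (0.2525)(447.5714)}{7-2}} = 1.5939 \]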
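As a check on the numbers, the sketch below recomputes $SS_{yy}$ and $s_e$ from the sums given in the example. Evaluating $1792 - 108^2/7$ gives 125.7143, the value used in the coefficient of determination and correlation calculations. The slope is rounded to four decimals to mirror the hand calculation.

```python
import math

n = 7
sum_y, sum_y2 = 108, 1792
ss_xy, ss_xx = 447.5714, 1772.8571   # from the earlier steps of the example

b = round(ss_xy / ss_xx, 4)          # 0.2525, rounded as in the hand calculation
ss_yy = sum_y2 - sum_y ** 2 / n
sse = ss_yy - b * ss_xy              # shortcut for sum((y - y_hat) ** 2)
s_e = math.sqrt(sse / (n - 2))       # n - 2 degrees of freedom

print(f"SS_yy = {ss_yy:.4f}, SSE = {sse:.4f}, s_e = {s_e:.4f}")
```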
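Coefficient of determination (COD)
\[ r^2 = \frac{b\,SS_{xy}}{SS_{yy}}, \qquad 0 \le r^2 \le 1 \]
\[ b = 0.2525, \quad SS_{xy} = 447.5714, \quad SS_{yy} = 125.7143 \]
\[ r^2 = \frac{b\,SS_{xy}}{SS_{yy}} = \frac{0.2525(447.5714)}{125.7143} = 0.899 = 89.9\% \]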
Interpretation: 89.9% of the total variation in the food expenditures
of households can be explained by the variation in incomes, and
the remaining 10.1% is due to randomness and other variables.
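The split between explained and unexplained variation can be reproduced directly from the three quantities listed. This is a minimal sketch using the rounded values from the example:

```python
b = 0.2525
ss_xy = 447.5714
ss_yy = 125.7143

r_squared = b * ss_xy / ss_yy        # share of variation in y explained by x
unexplained = 1 - r_squared          # share left to randomness and other variables

print(f"r^2 = {r_squared:.3f}")
print(f"explained = {r_squared:.1%}, unexplained = {unexplained:.1%}")
```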
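Coefficient of correlation (COC)
\[ r = \frac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}}, \qquad -1 \le r \le 1 \]
\[ SS_{xx} = 1772.8571, \quad SS_{xy} = 447.5714, \quad SS_{yy} = 125.7143 \]
\[ r = \frac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}} = \frac{447.5714}{\sqrt{(1772.8571)(125.7143)}} = 0.9481 \]
Interpretation: the sign of r gives the direction of the relationship (positively or negatively correlated); its magnitude gives the strength (very weak, average/moderate, strong, or very strong).
r = 0.9481: very strong and positively correlated
Other example:
r = -0.1111: very weak and negatively correlated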
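The correlation coefficient for the food expenditure example can be computed as follows. This sketch reports only the direction of the relationship; no strength cut-offs are coded, since the slides do not state numeric thresholds. Squaring r gives a value close to the COD of 0.899, with a small gap caused by the rounded slope used there.

```python
import math

ss_xy, ss_xx, ss_yy = 447.5714, 1772.8571, 125.7143

r = ss_xy / math.sqrt(ss_xx * ss_yy)

# The sign of r (the same as the sign of SS_xy and of b) gives the direction.
direction = "positive" if r > 0 else "negative" if r < 0 else "none"

print(f"r = {r:.4f} ({direction}), r^2 = {r * r:.4f}")
```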
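HT about the slope coefficient, B
\[ H_0: B = 0 \]
\[ H_1: B \neq 0 \text{ (two-tailed test)}; \quad B > 0 \text{ (positive) or } B < 0 \text{ (negative) (one-tailed test)} \]
\[ \text{Test statistic: } t_{calc} = \frac{b - B}{s_b}, \quad df = n - 2, \quad s_b = \frac{s_e}{\sqrt{SS_{xx}}} \]
$\sigma_\epsilon$ is unknown, so use the t distribution.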
Test at the 1% significance level whether the slope of the regression line is positive.
\[ H_0: B = 0, \quad H_1: B > 0 \text{ (one-tailed test)} \]
\[ \alpha = 0.01, \quad df = n - 2 = 7 - 2 = 5 \]
\[ t_{calc} = \frac{b - B}{s_b} = \frac{0.2525 - 0}{0.0379} = 6.662 \]
\[ t_{critical} = t_{\alpha,\,n-2} = t_{0.01,\,5} = 3.365 \]
\[ t_{critical} = 3.365 < t_{calc} = 6.662 \]
Reject $H_0$. There is sufficient evidence to conclude that the slope is positive; that is, income has a positive effect on food expenditure.
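The test can be traced numerically with the sketch below. The critical value 3.365 is taken from the text (a t table lookup) and is not computed here. At full precision $s_b$ is about 0.03786, so $t_{calc}$ comes out near 6.67 and not 6.662; the gap is only the rounding of $s_b$ to 0.0379, and the decision is unchanged.

```python
import math

n = 7
b, B0 = 0.2525, 0.0              # sample slope and the slope assumed under H0
s_e, ss_xx = 1.5939, 1772.8571
t_critical = 3.365               # t(alpha=0.01, df=5), from a t table as in the text

df = n - 2
s_b = s_e / math.sqrt(ss_xx)     # standard deviation (standard error) of b
t_calc = (b - B0) / s_b
reject_h0 = t_calc > t_critical  # right-tailed test, since H1 is B > 0

print(f"df = {df}, s_b = {s_b:.4f}, t_calc = {t_calc:.3f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```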
Regression Analysis: A Complete Example
A random sample of eight drivers was selected from those in a small city
who are insured with one company and have similar minimum
required auto insurance policies. The following
table lists their driving experience (in years) and monthly
auto insurance premiums (in dollars).
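(b)
\[ \bar{x} = \frac{\sum x}{n} = \frac{90}{8} = 11.25, \qquad \bar{y} = \frac{\sum y}{n} = \frac{474}{8} = 59.25 \]
\[ SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n} = 4739 - \frac{(90)(474)}{8} = -593.5 \]
\[ SS_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 1396 - \frac{(90)^2}{8} = 383.5 \]
\[ SS_{yy} = \sum y^2 - \frac{(\sum y)^2}{n} = 29{,}642 - \frac{(474)^2}{8} = 1557.5 \]
(c)
\[ b = \frac{SS_{xy}}{SS_{xx}} = \frac{-593.5}{383.5} = -1.5476 \]
\[ a = \bar{y} - b\bar{x} = 59.25 - (-1.5476)(11.25) = 76.6605 \]
\[ \hat{y} = a + bx = 76.6605 - 1.5476x \]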
(d)
\[ \hat{y} = a + bx = 76.6605 - 1.5476x \]
y-intercept, a = 76.6605
A driver with 0 years of driving experience is expected to pay
a monthly premium of $76.66.
Slope coefficient, b = -1.5476
For every extra year of driving experience, the monthly
premium decreases, on average, by $1.55.
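(e)
\[ \text{COC: } r = \frac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}} = \frac{-593.5}{\sqrt{(383.5)(1557.5)}} = -0.7679 \]
A moderately strong negative correlation.
\[ \text{COD: } r^2 = \frac{b\,SS_{xy}}{SS_{yy}} = \frac{(-1.5476)(-593.5)}{1557.5} = 0.5897 \]
\[ \text{Alternative: COD, } r^2 = (-0.7679)^2 = 0.5897 \]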
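Parts (b) to (e) of the insurance example can be reproduced from the summary sums alone. The sketch below assumes only the sums quoted in the worked solution; the raw data table is not used. Because b is not rounded here, the intercept may differ from 76.6605 in the fourth decimal place.

```python
import math

# Summary sums for the eight drivers, as given in the worked solution.
n = 8
sum_x, sum_y = 90, 474
sum_xy, sum_x2, sum_y2 = 4739, 1396, 29642

# (b) sums of squares and cross products
ss_xy = sum_xy - sum_x * sum_y / n
ss_xx = sum_x2 - sum_x ** 2 / n
ss_yy = sum_y2 - sum_y ** 2 / n

# (c) least-squares slope and intercept
b = ss_xy / ss_xx
a = sum_y / n - b * sum_x / n

# (e) coefficient of correlation and coefficient of determination
r = ss_xy / math.sqrt(ss_xx * ss_yy)
r_squared = b * ss_xy / ss_yy

print(f"SS_xy = {ss_xy}, SS_xx = {ss_xx}, SS_yy = {ss_yy}")
print(f"y_hat = {a:.4f} + ({b:.4f})x")
print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
```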
Source: http://www.excel-easy.com/examples/regression.html
EXCEL
SUMMARY
Identify the independent variable, IV (x), and the dependent variable, DV (y)
Calculate $SS_{xx}$, $SS_{yy}$, and $SS_{xy}$