Regression2024 MBA
CHAPTER OUTLINE
12.1 SIMPLE LINEAR REGRESSION MODEL (SLR)
12.2 SIMPLE LINEAR REGRESSION ANALYSIS
y = A + Bx + ε
ŷ = a + bx, with residual e = y − ŷ
• Assumptions of the Regression Model
Like any other theory, linear regression analysis (LRA) is based on certain assumptions.
Assumption 2: The random errors ε for different observations are independent of one another.
• A Note on the Use of Simple Linear Regression
• When we use SLR, we assume that the relationship between the two variables
is described by a straight line. In the real world, this assumption might not
hold, so first construct a scatter diagram and inspect the plot of the data
points. If the relationship looks linear, use SLR; otherwise, try a more
suitable regression model.
12.3 Standard deviation of errors
• The standard deviation of errors tells us how widely the errors and, hence,
the values of y are spread for a given x.
Note that the standard deviation of errors for the population is usually unknown. In
such cases, it is estimated by se, the standard deviation of errors for the
sample data. The following is the basic formula to calculate se:
se = √(SSE / (n − 2)), where SSE = Σ(y − ŷ)² = SSyy − b·SSxy
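As a rough illustration of the standard deviation of errors, the sketch below fits the least-squares line and computes se = √(SSE / (n − 2)). It borrows the distance and blood sugar data from the comprehensive example later in this chapter; the function name `std_dev_of_errors` is only a label chosen for this sketch.

```python
import math

# Data from the comprehensive example: x = distance run (miles), y = blood sugar (mg/dL).
x = [4.5, 4.5, 4, 4, 3.5, 3.5, 3, 3, 2.5, 2.5, 2, 2]
y = [75, 83, 94, 85, 95, 104, 116, 120, 125, 131, 146, 136]

def std_dev_of_errors(x, y):
    """Return s_e = sqrt(SSE / (n - 2)) for the least-squares line through (x, y)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = ss_xy / ss_xx            # slope of the fitted line
    a = y_bar - b * x_bar        # intercept of the fitted line
    # SSE: sum of squared differences between observed and predicted y.
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(sse / (n - 2))

s_e = std_dev_of_errors(x, y)
print(round(s_e, 2))
```

For this data set, se comes out at roughly 4.8 mg/dL, which is the typical size of the gap between an observed blood sugar reading and the value predicted by the line.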
12.4 The coefficient of determination
• The coefficient of determination (COD) is a single measure that tells you how
well your regression model fits the data.
• The ratio of SSR to SST gives the coefficient of determination (R²). The R²
calculated for population data is denoted by ρ² (ρ is the Greek letter
“rho”), and the one calculated for sample data is denoted by r². You can also
get the COD (r²) by squaring the value of the correlation coefficient (r).
0 ≤ R² ≤ 1
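The two routes to the coefficient of determination can be checked against each other. The sketch below is a minimal illustration using the data of the comprehensive example in this chapter; it takes SST = SSyy and SSR = SST − SSE, which are the usual textbook identities rather than formulas shown on these slides.

```python
import math

# Data from the comprehensive example: x = distance run (miles), y = blood sugar (mg/dL).
x = [4.5, 4.5, 4, 4, 3.5, 3.5, 3, 3, 2.5, 2.5, 2, 2]
y = [75, 83, 94, 85, 95, 104, 116, 120, 125, 131, 146, 136]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b = ss_xy / ss_xx
sst = ss_yy                      # total sum of squares
sse = ss_yy - b * ss_xy          # error (unexplained) sum of squares
ssr = sst - sse                  # regression (explained) sum of squares

r_squared = ssr / sst                        # route 1: SSR / SST
r = ss_xy / math.sqrt(ss_xx * ss_yy)         # linear correlation coefficient
r_squared_from_r = r ** 2                    # route 2: square of r

print(round(r_squared, 2), round(r, 2))
```

Both routes agree, and the value lies between 0 and 1 as required. For this data set about 96% of the variation in blood sugar is explained by distance run.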
12.5 Inferences About B
• Estimation of B
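A minimal sketch of interval estimation and a two-tailed test for B, applied to the data of the comprehensive example in this chapter. It assumes the usual textbook interval b ± t·s_b with s_b = se / √SSxx, neither of which survives on these slides, and it hard-codes the critical value 1.812 as read from a t table for df = 10 with 0.05 in each tail.

```python
import math

# Data from the comprehensive example: x = distance run (miles), y = blood sugar (mg/dL).
x = [4.5, 4.5, 4, 4, 3.5, 3.5, 3, 3, 2.5, 2.5, 2, 2]
y = [75, 83, 94, 85, 95, 104, 116, 120, 125, 131, 146, 136]

# Assumption: critical value taken from a t table (df = n - 2 = 10, 0.05 in each tail).
T_CRIT = 1.812

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b = ss_xy / ss_xx                                  # point estimate of B
s_e = math.sqrt((ss_yy - b * ss_xy) / (n - 2))     # standard deviation of errors
s_b = s_e / math.sqrt(ss_xx)                       # standard deviation of b

lower, upper = b - T_CRIT * s_b, b + T_CRIT * s_b  # 90% confidence interval for B
t_b = b / s_b                                      # test statistic for H0: B = 0
reject_h0 = abs(t_b) > T_CRIT                      # two-tailed test at the 10% level

print(f"b = {b:.2f}, 90% CI = ({lower:.2f}, {upper:.2f}), t = {t_b:.2f}, reject H0: {reject_h0}")
```

The interval runs from about −28.3 to −22.4 and excludes zero, which agrees with rejecting H0: B = 0 at the 10% level.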
12.6 Linear Correlation
Comprehensive example
A diabetic is interested in determining how the amount of aerobic exercise
impacts his blood sugar. When his blood sugar reaches 170 mg/dL, he goes out for
a run at a pace of 10 minutes per mile. On different days, he runs different
distances and measures his blood sugar after completing his run. Note: The
preferred blood sugar level is in the range of 80 to 120 mg/dL. Levels that are too
low or too high are extremely dangerous. The data generated are given in the
following table.
Distance (miles)      4.5  4.5  4    4    3.5  3.5  3    3    2.5  2.5  2    2
Blood sugar (mg/dL)   75   83   94   85   95   104  116  120  125  131  146  136
a. Identify the dependent and independent variables.
b. Construct a scatter diagram for the data.
c. Find the predictive regression equation for the data, and briefly interpret the values of a and b.
d. Plot the predictive regression line on the scatter diagram.
e. Calculate the standard deviation of errors.
f. Calculate the predicted blood sugar level after runs of 3.1 and 10 miles. Comment on this finding.
g. Calculate the values of r and r², then comment on them.
h. At the 5% significance level, test whether r is negative.
i. At the 10% significance level, test whether b is different from zero.
j. Construct a 90% confidence interval for b.
S       y      x      xy       x²      y²
1       75     4.5    337.5    20.25   5625
2       83     4.5    373.5    20.25   6889
3       94     4      376      16      8836
4       85     4      340      16      7225
5       95     3.5    332.5    12.25   9025
6       104    3.5    364      12.25   10816
7       116    3      348      9       13456
8       120    3      360      9       14400
9       125    2.5    312.5    6.25    15625
10      131    2.5    327.5    6.25    17161
11      146    2      292      4       21316
12      136    2      272      4       18496
Total   1310   39     4035.5   135.5   148870
ȳ = 1310/12 = 109.1667, x̄ = 39/12 = 3.25
SSxy = Σxy − (Σx)(Σy)/n = 4035.5 − (39 × 1310)/12 = 4035.5 − 4257.5 = −222
SSxx = Σx² − (Σx)²/n = 135.5 − 1521/12 = 135.5 − 126.75 = 8.75
SSyy = Σy² − (Σy)²/n = 148870 − (1310)²/12 = 148870 − 143008 = 5862
SSxy = −222   SSxx = 8.75   SSyy = 5862
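The three sums of squares can be reproduced from the column totals with the computational formulas. The sketch below is one way to do it; the variable names are illustrative only.

```python
# Data from the work table: x = distance run (miles), y = blood sugar (mg/dL).
x = [4.5, 4.5, 4, 4, 3.5, 3.5, 3, 3, 2.5, 2.5, 2, 2]
y = [75, 83, 94, 85, 95, 104, 116, 120, 125, 131, 146, 136]

n = len(x)

# Column totals, matching the "Total" row of the work table.
sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)
sum_y2 = sum(yi ** 2 for yi in y)

# Computational formulas for the sums of squares.
ss_xy = sum_xy - sum_x * sum_y / n
ss_xx = sum_x2 - sum_x ** 2 / n
ss_yy = sum_y2 - sum_y ** 2 / n

print(ss_xy, ss_xx, round(ss_yy, 2))
```

The unrounded SSyy is about 5861.67, which the worked solution rounds to 5862.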
b = SSxy / SSxx = −222 / 8.75 = −25.4
a = ȳ − b·x̄ = 109.2 + 25.4 × 3.3 = 193 (with the unrounded values x̄ = 3.25 and b = −25.37, a ≈ 191.6)
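To see how a and b feed into part f of the example, the sketch below fits the line without intermediate rounding and predicts blood sugar after runs of 3.1 and 10 miles. The helper `predict` is a hypothetical name used only here.

```python
# Data from the comprehensive example: x = distance run (miles), y = blood sugar (mg/dL).
x = [4.5, 4.5, 4, 4, 3.5, 3.5, 3, 3, 2.5, 2.5, 2, 2]
y = [75, 83, 94, 85, 95, 104, 116, 120, 125, 131, 146, 136]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b = ss_xy / ss_xx          # slope: change in blood sugar per extra mile
a = y_bar - b * x_bar      # intercept: predicted blood sugar for a 0-mile run

def predict(distance_miles):
    """Predicted blood sugar (mg/dL) from the fitted line y-hat = a + b*x."""
    return a + b * distance_miles

pred_3_1 = predict(3.1)    # inside the observed range of 2 to 4.5 miles
pred_10 = predict(10)      # far outside the observed range (extrapolation)

print(f"a = {a:.1f}, b = {b:.2f}")
print(f"3.1 miles: {pred_3_1:.1f} mg/dL, 10 miles: {pred_10:.1f} mg/dL")
```

Without rounding, a is about 191.6 rather than 193. The 3.1-mile prediction of about 113 mg/dL falls in the preferred range, while the 10-mile prediction is negative, which is impossible. Ten miles lies far outside the observed distances of 2 to 4.5 miles, so the line should not be used there.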
t = r √((n − 2)/(1 − r²)) = −0.98 √(10/(1 − 0.96)) = −0.98 √(10/0.04) = −15.5
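The test statistic for r can be recomputed without rounding r first. The sketch below does this for the example data and compares the result with a left-tailed critical value; the value −1.812 is an assumption read from a t table for df = 10 with 0.05 in the left tail, matching the 5% test in part h.

```python
import math

# Data from the comprehensive example: x = distance run (miles), y = blood sugar (mg/dL).
x = [4.5, 4.5, 4, 4, 3.5, 3.5, 3, 3, 2.5, 2.5, 2, 2]
y = [75, 83, 94, 85, 95, 104, 116, 120, 125, 131, 146, 136]

# Assumption: left-tailed critical value from a t table (df = 10, area 0.05 in the tail).
T_CRIT_LEFT = -1.812

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

r = ss_xy / math.sqrt(ss_xx * ss_yy)           # linear correlation coefficient
t = r * math.sqrt((n - 2) / (1 - r ** 2))      # test statistic with df = n - 2

# H0: rho = 0 against H1: rho < 0; reject H0 when t falls below the critical value.
reject_h0 = t < T_CRIT_LEFT

print(f"r = {r:.4f}, t = {t:.2f}, reject H0: {reject_h0}")
```

With unrounded r the statistic is about −15.7; rounding r to −0.98 and r² to 0.96 first gives the −15.5 shown in the worked solution. Either way it is far below −1.812, so the data support a negative linear correlation.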