Regression2024 MBA

Uploaded by

alsalahia005

CHAPTER 13

CHAPTER OUTLINE

12.1 SIMPLE LINEAR REGRESSION MODEL (SLR)

12.2 SIMPLE LINEAR REGRESSION ANALYSIS

Deterministic (mathematical model)

Probabilistic (statistical model)

y = A + Bx + ε

• A and B are the population parameters (the y-intercept and the slope).

ŷ = a + bx    (equivalently, y = a + bx + e, where e is the residual)

• a and b are the sample estimates of A and B.

• Assumptions of the Regression Model
Like any other theory, linear regression analysis (LRA) is based on certain assumptions.

Assumption 1: The error term (ε) has a mean of zero for each x.

Assumption 2: The errors (ε) are independent of one another.

Assumption 3: For any given x, ε is normally distributed.

Assumption 4: ε has a constant standard deviation (σ) for every x.

• A Note on the Use of Simple Linear Regression
• When we use SLR, we assume that the relationship between the two variables
is described by a straight line. In the real world, this assumption might not
hold, so first construct a scatter diagram and inspect the plot of the data
points. If the relationship looks linear, use SLR; otherwise, try a more
suitable regression model.

[Figure: scatter diagrams illustrating a linear correlation and a nonlinear correlation]

12.3 Standard deviation of errors

• The standard deviation of errors tells us how widely the errors and, hence,
the values of y are spread for a given x.

Note that the standard deviation of errors for the population is usually unknown.
In such cases, it is estimated by se, the standard deviation of errors for the
sample data. The basic formula to calculate se is:

se = √[(SSyy − b·SSxy) / (n − 2)]
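
As an illustration, the calculation of se can be sketched in Python using the data from the comprehensive example at the end of this chapter. The function name std_dev_of_errors is only illustrative. The sketch keeps unrounded intermediate values, so it gives roughly 4.79 rather than the 4.7 obtained in the worked solution, where b is rounded to −25.4.

```python
import math

# Example data from this chapter: x = distance run (miles), y = blood sugar (mg/dL).
x = [4.5, 4.5, 4, 4, 3.5, 3.5, 3, 3, 2.5, 2.5, 2, 2]
y = [75, 83, 94, 85, 95, 104, 116, 120, 125, 131, 146, 136]


def std_dev_of_errors(x, y):
    """Return se = sqrt((SSyy - b*SSxy) / (n - 2)) for a simple linear regression."""
    n = len(x)
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
    ss_yy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n
    b = ss_xy / ss_xx  # least-squares slope
    return math.sqrt((ss_yy - b * ss_xy) / (n - 2))


s_e = std_dev_of_errors(x, y)
print(round(s_e, 2))
```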

12.4 The coefficient of determination

• The coefficient of determination (COD) tells you how well your regression
model fits the data.
• The ratio of SSR to SST gives the coefficient of determination (R²). The R²
calculated for population data is denoted by ρ² (ρ is the Greek letter
“rho”), and the one calculated for sample data is denoted by r². You can
also obtain the COD (r²) by squaring the correlation coefficient (r).

0 ≤ R² ≤ 1
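
A minimal sketch of the SSR/SST ratio, assuming the data from the comprehensive example at the end of this chapter. The function name coefficient_of_determination is hypothetical; the line is fitted by ordinary least squares with unrounded coefficients, which gives a value of about 0.96.

```python
# Example data from this chapter: x = distance run (miles), y = blood sugar (mg/dL).
x = [4.5, 4.5, 4, 4, 3.5, 3.5, 3, 3, 2.5, 2.5, 2, 2]
y = [75, 83, 94, 85, 95, 104, 116, 120, 125, 131, 146, 136]


def coefficient_of_determination(x, y):
    """Return r^2 = SSR / SST for the least-squares line fitted to (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Least-squares slope and intercept.
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = ss_xy / ss_xx
    a = y_bar - b * x_bar
    y_hat = [a + b * xi for xi in x]
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)  # variation explained by the line
    sst = sum((yi - y_bar) ** 2 for yi in y)      # total variation in y
    return ssr / sst


r_squared = coefficient_of_determination(x, y)
print(round(r_squared, 3))
```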
12.5 Inferences About B

• By inferences we mean constructing confidence intervals and testing
hypotheses about the slope (B) of the population regression line.
Inferences about the y-intercept (A) are beyond the scope of this text.
• The slope b of a sample regression line is a point estimator of the slope B
of the population regression line. The sample regression lines
estimated for different samples taken from the same population will give
different values of b. If only one sample is taken and the regression line for
that sample is estimated, the value of b will depend on which elements are
included in the sample. Thus, b is a random variable, and it possesses a
probability distribution, more commonly called its sampling
distribution.


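• Estimation of B: the confidence interval for B is b ± t·sb, where sb = se / √SSxx and t is the t-distribution value for the chosen confidence level with df = n − 2.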
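
The interval b ± t·sb can be sketched as follows, assuming the data from the comprehensive example at the end of this chapter. The critical value 1.812 (90% confidence, df = 10) is the one used in the worked solution and is hard-coded here rather than looked up; the function name slope_confidence_interval is illustrative only.

```python
import math

# Example data from this chapter: x = distance run (miles), y = blood sugar (mg/dL).
x = [4.5, 4.5, 4, 4, 3.5, 3.5, 3, 3, 2.5, 2.5, 2, 2]
y = [75, 83, 94, 85, 95, 104, 116, 120, 125, 131, 146, 136]

# Assumption: t value for a 90% interval with df = n - 2 = 10, taken from the worked solution.
T_CRIT = 1.812


def slope_confidence_interval(x, y, t_crit):
    """Return (lower, upper) limits of b +/- t * sb for the regression slope."""
    n = len(x)
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
    ss_yy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n
    b = ss_xy / ss_xx
    s_e = math.sqrt((ss_yy - b * ss_xy) / (n - 2))  # standard deviation of errors
    s_b = s_e / math.sqrt(ss_xx)                    # standard deviation of b
    return b - t_crit * s_b, b + t_crit * s_b


lower, upper = slope_confidence_interval(x, y, T_CRIT)
print(round(lower, 2), round(upper, 2))
```

Because the sketch does not round b, se, or sb along the way, its limits (about −28.30 and −22.44) differ slightly from the hand-calculated −28.281 and −22.519.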


• Hypothesis Testing About B

• Testing a hypothesis about B when the null hypothesis is B = 0 (that is, the
slope of the regression line is zero) is equivalent to testing that x does not
determine y and that the regression line is of no use in predicting y for a
given x.
• It is possible that x determines y nonlinearly, so a nonlinear
relationship may exist between x and y. To test the hypothesis that x does
not determine y linearly, we test the null hypothesis that the slope of
the regression line is zero; that is:
• H0: B = 0
• H1: B ≠ 0 OR H1: B > 0 OR H1: B < 0
• The test statistic is t = (b − B) / sb.
• The procedure used to make a hypothesis test about B is similar to the one
used in earlier chapters. It involves the same five steps.
• Note that the degrees of freedom for regression inferences are df = n − 2.
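
A hedged sketch of the two-tailed test of H0: B = 0 at the 10% significance level, using the data from the comprehensive example at the end of this chapter. The critical value 1.812 (df = 10, 0.05 in each tail) is assumed from a t table, and slope_t_statistic is a hypothetical helper name.

```python
import math

# Example data from this chapter: x = distance run (miles), y = blood sugar (mg/dL).
x = [4.5, 4.5, 4, 4, 3.5, 3.5, 3, 3, 2.5, 2.5, 2, 2]
y = [75, 83, 94, 85, 95, 104, 116, 120, 125, 131, 146, 136]

# Assumption: two-tailed critical t value for alpha = 0.10 and df = 10, from a t table.
T_CRIT = 1.812


def slope_t_statistic(x, y, b_null=0.0):
    """Return t = (b - B) / sb, where B is the slope stated in the null hypothesis."""
    n = len(x)
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
    ss_yy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n
    b = ss_xy / ss_xx
    s_e = math.sqrt((ss_yy - b * ss_xy) / (n - 2))
    s_b = s_e / math.sqrt(ss_xx)
    return (b - b_null) / s_b


t_stat = slope_t_statistic(x, y)
reject_h0 = abs(t_stat) > T_CRIT  # two-tailed rejection rule
print(round(t_stat, 2), reject_h0)
```

The observed t is far beyond the critical value, so H0: B = 0 is rejected for these data.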

13.6 Linear Correlation

• Another measure of the relationship between two variables is the
correlation coefficient. This section describes simple linear correlation
(linear correlation, for short), which measures the strength of the linear
association between two variables.
• The correlation coefficient calculated for the population data is denoted
by ρ (the Greek letter rho), and the one calculated for sample data is denoted
by r. (Note that the square of the correlation coefficient is equal to the
coefficient of determination.)
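
For illustration, the sample correlation coefficient can be computed as r = SSxy / √(SSxx·SSyy), the formula used in the worked solution of the comprehensive example. The sketch below assumes that example's data; the function name linear_correlation is illustrative.

```python
import math

# Example data from this chapter: x = distance run (miles), y = blood sugar (mg/dL).
x = [4.5, 4.5, 4, 4, 3.5, 3.5, 3, 3, 2.5, 2.5, 2, 2]
y = [75, 83, 94, 85, 95, 104, 116, 120, 125, 131, 146, 136]


def linear_correlation(x, y):
    """Return r = SSxy / sqrt(SSxx * SSyy)."""
    n = len(x)
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
    ss_yy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n
    return ss_xy / math.sqrt(ss_xx * ss_yy)


r = linear_correlation(x, y)
print(round(r, 2), round(r ** 2, 2))
```

A value of r close to −1 indicates a strong negative linear association: longer runs go with lower blood sugar readings.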


• Hypothesis Testing About the Linear Correlation Coefficient

• This section describes how to perform a test of hypothesis about the
population correlation coefficient ρ using the sample correlation
coefficient r. We can use the t distribution to make this test. However, to
use the t distribution, both variables should be normally distributed.
• Usually (although not always), the null hypothesis is that the linear
correlation coefficient between the two variables is zero, that is:
• H0: ρ = 0
• H1: ρ ≠ 0 OR H1: ρ > 0 OR H1: ρ < 0
• The test statistic is t = r·√[(n − 2) / (1 − r²)], with df = n − 2.
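
A minimal sketch of a left-tailed test (H1: ρ < 0) at the 5% significance level, using the statistic t = r·√[(n − 2) / (1 − r²)] from the worked solution of the comprehensive example and that example's data. The critical value −1.812 (df = 10) is assumed from a t table, and correlation_t_statistic is a hypothetical name.

```python
import math

# Example data from this chapter: x = distance run (miles), y = blood sugar (mg/dL).
x = [4.5, 4.5, 4, 4, 3.5, 3.5, 3, 3, 2.5, 2.5, 2, 2]
y = [75, 83, 94, 85, 95, 104, 116, 120, 125, 131, 146, 136]

# Assumption: left-tailed critical t value for alpha = 0.05 and df = 10, from a t table.
T_CRIT_LEFT = -1.812


def correlation_t_statistic(x, y):
    """Return t = r * sqrt((n - 2) / (1 - r^2)) for testing H0: rho = 0."""
    n = len(x)
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
    ss_yy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n
    r = ss_xy / math.sqrt(ss_xx * ss_yy)
    return r * math.sqrt((n - 2) / (1 - r ** 2))


t_r = correlation_t_statistic(x, y)
reject_h0 = t_r < T_CRIT_LEFT  # left-tailed rejection rule
print(round(t_r, 2), reject_h0)
```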

Comprehensive example
A diabetic is interested in determining how the amount of aerobic exercise
impacts his blood sugar. When his blood sugar reaches 170 mg/dL, he goes out for
a run at a pace of 10 minutes per mile. On different days, he runs different
distances and measures his blood sugar after completing his run. Note: The
preferred blood sugar level is in the range of 80 to 120 mg/dL. Levels that are too
low or too high are extremely dangerous. The data generated are given in the
following table.
Distance (miles)      4.5   4.5   4     4     3.5   3.5   3     3     2.5   2.5   2     2
Blood sugar (mg/dL)   75    83    94    85    95    104   116   120   125   131   146   136

a. Identify the dependent and independent variables.
b. Construct a scatter diagram for the data.
c. Find the predictive regression equation for the data, and give a brief interpretation of the values of a and b.
d. Plot the predictive regression line on the scatter diagram.
e. Calculate the standard deviation of errors.
f. Calculate the predicted blood sugar level after runs of 3.1 and 10 miles. Comment on this finding.
g. Calculate the values of r and r², then comment on these values.
h. At the 5% significance level, test whether the linear correlation coefficient is negative.
i. At the 10% significance level, test whether B is different from zero.
j. Construct a 90% confidence interval for B.

S              y          x      xy       x²      y²
1              75         4.5    337.5    20.25   5625
2              83         4.5    373.5    20.25   6889
3              94         4      376      16      8836
4              85         4      340      16      7225
5              95         3.5    332.5    12.25   9025
6              104        3.5    364      12.25   10816
7              116        3      348      9       13456
8              120        3      360      9       14400
9              125        2.5    312.5    6.25    15625
10             131        2.5    327.5    6.25    17161
11             146        2      292      4       21316
12             136        2      272      4       18496
Total          1310       39     4035.5   135.5   148870
Mean (ȳ, x̄)    109.1667   3.25
SSxy = Σxy − (Σx)(Σy)/n = 4035.5 − (39 × 1310)/12 = −222
SSxx = Σx² − (Σx)²/n = 135.5 − 1521/12 = 8.75
SSyy = Σy² − (Σy)²/n = 148870 − 143008 = 5862

SSxy = −222     SSxx = 8.75     SSyy = 5862

b = SSxy / SSxx = −222 / 8.75 = −25.4
a = ȳ − b·x̄ = 109.2 + 25.4 × 3.3 = 193   (x̄ = 3.25 is rounded to 3.3 here; with unrounded values, a ≈ 191.6)

se = √[(SSyy − b·SSxy) / (n − 2)] = √[(5862 − (−25.4 × −222)) / 10] = √22.3 = 4.7

r = SSxy / √(SSxx·SSyy) = −222 / √(8.75 × 5862) = −0.98
r² = b·SSxy / SSyy = (−25.4 × −222) / 5862 = 0.96

t = r·√[(n − 2) / (1 − r²)] = −0.98 × √[10 / (1 − 0.96)] = −0.98 × √(10 / 0.04) = −15.5

sb = se / √SSxx = 4.7 / √8.75 = 1.59
t = (b − B) / sb = (−25.4 − 0) / 1.59 = −15.97

b ± t·sb:   −25.4 − 1.812 × 1.59 = −28.281
            −25.4 + 1.812 × 1.59 = −22.519
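
Part f of the example asks for predictions at 3.1 and 10 miles. The sketch below is one way to obtain them, assuming an ordinary least-squares fit with unrounded coefficients (so a ≈ 191.6 rather than 193); the helper names fit_line and predict are illustrative.

```python
# Example data from this chapter: x = distance run (miles), y = blood sugar (mg/dL).
x = [4.5, 4.5, 4, 4, 3.5, 3.5, 3, 3, 2.5, 2.5, 2, 2]
y = [75, 83, 94, 85, 95, 104, 116, 120, 125, 131, 146, 136]


def fit_line(x, y):
    """Return (a, b) of the least-squares line y_hat = a + b*x."""
    n = len(x)
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
    b = ss_xy / ss_xx
    a = sum(y) / n - b * sum(x) / n
    return a, b


def predict(a, b, x_new):
    """Return the predicted y for a given x on the fitted line."""
    return a + b * x_new


a, b = fit_line(x, y)
pred_3_1 = predict(a, b, 3.1)  # 3.1 miles lies inside the observed range (2 to 4.5 miles)
pred_10 = predict(a, b, 10)    # 10 miles lies far outside the observed range
print(round(pred_3_1, 1), round(pred_10, 1))
```

The prediction at 3.1 miles (roughly 113 mg/dL) falls within the preferred range. The prediction at 10 miles is negative, which is physically impossible: 10 miles is well beyond the distances observed, so the fitted line should not be extrapolated that far.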
x      y      ȳ      ŷ       y − ŷ    ŷ − ȳ    (ŷ − ȳ)²   (y − ŷ)²
2      136    109    142.2   −6.2     33.2     1102.24    38.44
2      146    109    142.2   3.8      33.2     1102.24    14.44
2.5    131    109    129.5   1.5      20.5     420.25     2.25
2.5    125    109    129.5   −4.5     20.5     420.25     20.25
3      120    109    116.8   3.2      7.8      60.84      10.24
3      116    109    116.8   −0.8     7.8      60.84      0.64
3.5    104    109    104.1   −0.1     −4.9     24.01      0.01
3.5    95     109    104.1   −9.1     −4.9     24.01      82.81
4      85     109    91.4    −6.4     −17.6    309.76     40.96
4      94     109    91.4    2.6      −17.6    309.76     6.76
4.5    83     109    78.7    4.3      −30.3    918.09     18.49
4.5    75     109    78.7    −3.7     −30.3    918.09     13.69
Total                                          5670.38    248.98
                                               (SSR)      (SSE)
SST = SSR + SSE = 5919.36
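
The table above uses rounded values (ȳ = 109 and ŷ = 193 − 25.4x), so its totals differ somewhat from SSyy = 5862. As a cross-check, the sketch below rebuilds the same columns with unrounded least-squares coefficients; under that assumption SSE comes out near 229.2 and SSR + SSE reproduces SST. The variable names are illustrative.

```python
# Example data from this chapter: x = distance run (miles), y = blood sugar (mg/dL).
x = [2, 2, 2.5, 2.5, 3, 3, 3.5, 3.5, 4, 4, 4.5, 4.5]
y = [136, 146, 131, 125, 120, 116, 104, 95, 85, 94, 83, 75]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Unrounded least-squares slope and intercept.
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum((xi - x_bar) ** 2 for xi in x)
a = y_bar - b * x_bar

rows = []
for xi, yi in zip(x, y):
    y_hat = a + b * xi
    # One table row: x, y, y_hat, residual (y - y_hat), explained deviation (y_hat - y_bar).
    rows.append((xi, yi, y_hat, yi - y_hat, y_hat - y_bar))

sse = sum(resid ** 2 for _, _, _, resid, _ in rows)  # sum of squared residuals
ssr = sum(expl ** 2 for _, _, _, _, expl in rows)    # sum of squared explained deviations
sst = sum((yi - y_bar) ** 2 for yi in y)             # total sum of squares

for xi, yi, y_hat, resid, expl in rows:
    print(f"{xi:<5}{yi:<6}{y_hat:<9.1f}{resid:<8.1f}{expl:<8.1f}")
print(round(ssr, 2), round(sse, 2), round(sst, 2))
```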
