11 SimpleRegression

This document provides an introduction to simple linear regression. It presents a model for simple linear regression using an intercept (β0) and slope (β1) coefficient. It provides an example using advertising expenditures and sales data from a teddy bear company. Scatterplots and least squares estimators are used to estimate the intercept and slope coefficients. The coefficients are then interpreted. Measures of variation like R-squared, standard error, and inferences about the slope coefficient using t-tests are also discussed. Finally, the document touches on prediction using the regression model.

Uploaded by Jawwad Ahmed

DISC 203 – PROBABILITY & STATISTICS

SIMPLE LINEAR REGRESSION

Lecturer: Muhammad Asim


SIMPLE LINEAR REGRESSION MODEL

Yi = β0 + β1Xi + εi

where Yi is the dependent variable, Xi the independent variable, β0 the intercept, β1 the slope coefficient, and εi the random error term. β0 + β1Xi is the deterministic component; εi is the random error component.
EXAMPLE
 You are a marketing analyst for Teddy Bears. You gather the
following data and want to find a simple relationship between
advertising and sales.

Advertising – Sales Data

Month   Advertising Expenditure x ($100)   Sales Revenue y ($1,000)
1       1                                  1
2       2                                  1
3       3                                  2
4       4                                  2
5       5                                  4
SCATTERGRAM: SALES VS. ADVERTISING

[Scatterplot of Sales Revenue ($1,000) against Advertising Expenditure ($100) for the five data points; the points suggest a positive, roughly linear trend.]
LEAST SQUARES ESTIMATORS
 Prediction equation: ŷi = β̂0 + β̂1xi

 Sample slope: β̂1 = SSxy / SSxx

 Sample y-intercept: β̂0 = ȳ − β̂1x̄

where SSxy = Σ(xi − x̄)(yi − ȳ), SSxx = Σ(xi − x̄)², and SSyy = Σ(yi − ȳ)².
COMPUTATIONS – LEAST SQUARES LINE
xi (adv)   yi (sales)   (xi − 3)²    (xi − 3)(yi − 2)
1          1            4            2
2          1            1            1
3          2            0            0
4          2            1            0
5          4            4            4
∑xi = 15   ∑yi = 10     SSxx = 10    SSxy = 7
x̄ = 3      ȳ = 2
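The hand computation above can be checked with a short script (a sketch; the function and variable names are mine, not from the slides):

```python
# Least-squares estimates for the advertising-sales data above.
def least_squares(x, y):
    """Return (b0_hat, b1_hat) for the fitted line y-hat = b0 + b1*x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ss_xy / ss_xx           # sample slope: SSxy / SSxx = 7/10
    b0 = y_bar - b1 * x_bar      # sample y-intercept
    return b0, b1

x = [1, 2, 3, 4, 5]   # advertising expenditure ($100)
y = [1, 1, 2, 2, 4]   # sales revenue ($1,000)
b0, b1 = least_squares(x, y)
print(b0, b1)  # approximately -0.1 and 0.7
```

So the fitted line is ŷ = −0.1 + 0.7x, matching the sums SSxx = 10 and SSxy = 7 in the table.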
COEFFICIENT INTERPRETATIONS

1. Slope (β̂1 = 0.7)
• Sales revenue (y) is expected to increase by $700 for each $100 increase in advertising (x), over the sampled range of advertising expenditures from $100 to $500.

2. y-Intercept (β̂0 = −0.1)
• Since x = 0 is outside the range of the sampled values of x, the y-intercept has no meaningful interpretation.
MEASURES OF VARIATION

 SST = total sum of squares
 Measures the variation of the yi values around their mean, ȳ
 SSR = regression sum of squares
 Explained variation attributable to the linear relationship between x and y
 SSE = error sum of squares
 Variation attributable to factors other than the linear relationship between x and y
MEASURES OF VARIATION

 Total variation is made up of two parts:

SST = SSR + SSE

Total Sum of Squares: SST = Σ(yi − ȳ)²
Regression Sum of Squares: SSR = Σ(ŷi − ȳ)²
Error Sum of Squares: SSE = Σ(yi − ŷi)²

where:
ȳ = average value of the dependent variable
yi = observed values of the dependent variable
ŷi = predicted value of y for the given xi value
Advertising Expenditure x ($100), Sales Revenue y ($1,000):

x    y    ŷ = b0 + b1x   y − ŷ    (y − ŷ)²
1    1    0.6             0.4     0.16
2    1    1.3            −0.3     0.09
3    2    2.0             0.0     0.00
4    2    2.7            −0.7     0.49
5    4    3.4             0.6     0.36

n = 5, SSE = Σ(y − ŷ)² = 1.10
COEFFICIENT OF DETERMINATION, R2

 The coefficient of determination is the portion of


the total variation in the dependent variable that is
explained by variation in the independent variable
 The coefficient of determination is also called R-
squared and is denoted as R2

R² = SSR / SST = regression sum of squares / total sum of squares

note: 0 ≤ R² ≤ 1
R2 INTERPRETATION

R2 = SSR/SST = 0.82

Interpretation: About 82% of the sample variation in


Sales can be explained by Advertising Expenditures,
using the linear regression model.

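The decomposition and the R² value can be verified numerically (a sketch, assuming the fitted line ŷ = −0.1 + 0.7x from the earlier slides):

```python
# Variation decomposition for the advertising-sales example.
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
b0, b1 = -0.1, 0.7                 # least-squares estimates from earlier

y_bar = sum(y) / len(y)
y_hat = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained
r2 = ssr / sst
print(sst, ssr, sse, r2)  # approximately 6.0, 4.9, 1.1, 0.817
```

Note that SST = SSR + SSE (6.0 = 4.9 + 1.1), and R² ≈ 0.82 as stated above.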
STANDARD ERROR OF THE
REGRESSION MODEL

s² = SSE / (n − 2)

where n − 2 is the degrees of freedom for error.

 We refer to s as the standard error of the regression model
 s measures the spread of the distribution of y values about the least squares line
 We expect most of the observed y-values to lie within 2s of their respective least squares predicted values
CALCULATING S2 AND S

s² = SSE / (n − 2) = 1.1 / (5 − 2) = .36667

s = √.36667 = .6055
We would expect most of the observed revenues to fall within 2s, about $1,211, of the least squares line.
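The arithmetic on this slide can be reproduced directly (a sketch; SSE = 1.1 and n = 5 are taken from the example):

```python
import math

# Standard error of the regression model for the example.
sse, n = 1.1, 5
s2 = sse / (n - 2)      # estimated error variance
s = math.sqrt(s2)       # standard error of the regression
print(s2, s)            # approximately 0.3667 and 0.6055
```

Doubling s gives roughly 1.21 (in $1,000 units), the "within 2s" band mentioned above.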
Yi = β0 + β1Xi + εi
MAKING INFERENCES ABOUT SLOPE
 E(y) = β0 + β1x
 H0: β1 = 0
 Ha: β1 ≠ 0
 If β1 = 0, then x has no influence on y.
 If we reject H0, we say that x has a statistically significant effect on y.
 To test the null, we need to know the sampling distribution of β̂1
MAKING INFERENCES ABOUT THE SLOPE β1

 Sampling distribution of β̂1, for large n:

β̂1 ~ N(β1, σβ̂1), where σβ̂1 = σ / √SSxx

 Typically we approximate σβ̂1 = σ / √SSxx by sβ̂1 = s / √SSxx

 So, when n is large, we use a z-statistic ~ N(0, 1)
 When n is small, we typically use a t-statistic ~ t(n − 2)
 For large n, the distributions of the z and t statistics are almost the same
MAKING INFERENCES ABOUT THE SLOPE β1

A Test of Model Utility: Simple Linear Regression

One-Tailed Test: H0: β1 = 0; Ha: β1 < 0 (or Ha: β1 > 0)
Two-Tailed Test: H0: β1 = 0; Ha: β1 ≠ 0

Test statistic: t = β̂1 / sβ̂1 = β̂1 / (s / √SSxx), where s² = SSE / (n − 2)

Rejection region (one-tailed): t < −tα (or t > tα when Ha: β1 > 0)
Rejection region (two-tailed): |t| > tα/2

where tα and tα/2 are based on (n − 2) degrees of freedom
EXAMPLE
 We estimated a simple relationship between advertising
and sales based on a sample of 5 observations. Is the true
relationship statistically significant at the .05 level of
significance?

TEST OF SLOPE COEFFICIENT – SOLUTION

 H0: β1 = 0
 Ha: β1 ≠ 0
 α = .05
 df = 5 − 2 = 3
 Critical values: ±3.182 (rejection regions t < −3.182 and t > 3.182, with .025 in each tail)
 Decision: Reject H0 at α = .05
 Conclusion: There is evidence of a relationship
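The solution slide does not print the test statistic itself; it can be filled in numerically (a sketch using b1, s, and SSxx from the earlier slides):

```python
import math

# t statistic for H0: beta1 = 0 in the advertising-sales example.
b1, s, ss_xx = 0.7, 0.6055, 10
se_b1 = s / math.sqrt(ss_xx)   # estimated standard error of the slope
t = b1 / se_b1
print(t)  # approximately 3.66
```

Since 3.66 > 3.182, the observed t falls in the rejection region, consistent with the decision above.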
MAKING INFERENCES ABOUT THE SLOPE β1

 Confidence interval for β1: β̂1 ± tα/2 · sβ̂1

 [0.090, 1.309]

 We can be 95% confident that the true mean increase in monthly sales revenue per additional $100 of advertising expenditure is between $90 and $1,310.
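The interval [0.090, 1.309] can be reproduced as follows (a sketch; the inputs are the estimates from earlier slides):

```python
import math

# 95% confidence interval for beta1 in the advertising-sales example.
b1, s, ss_xx = 0.7, 0.6055, 10
t_crit = 3.182                      # t_{.025} with n - 2 = 3 df
se_b1 = s / math.sqrt(ss_xx)
lo, hi = b1 - t_crit * se_b1, b1 + t_crit * se_b1
print(lo, hi)  # approximately 0.091 and 1.309
```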
PREDICTION WITH REGRESSION
MODELS
 Types of predictions
 Point estimates
 Interval estimates
 What is predicted
 Population mean response E(y) for given x
 Point on population regression line
 Individual response (yi) for given x

WHAT IS PREDICTED

[Graph: the fitted line ŷ = β̂0 + β̂1x with, at x = xp, both the predicted mean response E(y) = β0 + β1x (a point on the population regression line) and a predicted individual response y.]
USING THE MODEL FOR ESTIMATION AND PREDICTION

 100(1 − α)% confidence interval for the mean value of y at x = xp:

ŷ ± tα/2 · s · √(1/n + (xp − x̄)² / SSxx)

 100(1 − α)% prediction interval for an individual new value of y at x = xp:

ŷ ± tα/2 · s · √(1 + 1/n + (xp − x̄)² / SSxx)

 where tα/2 is based on (n − 2) degrees of freedom
EXAMPLE
 Find a 95% confidence interval for the mean monthly
sales when the store spends $400 on advertising.

EXAMPLE
 Predict the monthly sales for next month if $400 is spent
on advertising. Use a 95% prediction interval.
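Both questions can be answered with the interval formulas above (a sketch; all inputs are the estimates from earlier slides):

```python
import math

# 95% CI for the mean response and 95% PI for an individual response
# at xp = 4 ($400 of advertising).
b0, b1 = -0.1, 0.7
s, n, x_bar, ss_xx = 0.6055, 5, 3.0, 10.0
t_crit = 3.182                  # t_{.025}, 3 df
xp = 4.0

y_hat = b0 + b1 * xp            # point prediction: 2.7, i.e. $2,700
d2 = (xp - x_bar) ** 2 / ss_xx
half_ci = t_crit * s * math.sqrt(1 / n + d2)
half_pi = t_crit * s * math.sqrt(1 + 1 / n + d2)
print(y_hat - half_ci, y_hat + half_ci)  # CI: approximately (1.64, 3.76)
print(y_hat - half_pi, y_hat + half_pi)  # PI: approximately (0.50, 4.90)
```

The prediction interval for an individual month is wider than the confidence interval for the mean, since it must also absorb the error variance of a single new observation.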

CONFIDENCE INTERVALS VS. PREDICTION INTERVALS

[Graph: the fitted line ŷ = β̂0 + β̂1x with confidence bands for the mean of y and wider prediction bands for individual y values; both bands are narrowest at x = x̄.]
REGRESSION RESULTS IN R

[Slide shows R regression output for the example; not reproduced here.]
ESTIMATOR: UNBIASEDNESS (ACCURACY) AND
MINIMUM VARIANCE (PRECISION OR EFFICIENCY)

Unbiasedness (accuracy) and minimum variance


(precision/efficiency) are desirable properties of the probability
distribution of the sample statistic.
MODEL ASSUMPTIONS

So far we have only estimated the deterministic component. Now we turn our attention to the random error ε. We first need some modeling assumptions…

Assumption 1: E(ε|x) = E(ε) = 0
The mean of the probability distribution of ε is 0. This implies that the mean value of y for a given value of x is β0 + β1x:
y = β0 + β1x + ε
Since E(ε|x) = E(ε) = 0,
E(y|x) = β0 + β1x
Sometimes this is just written as E(y) = β0 + β1x.
MODEL ASSUMPTIONS

Assumption 2: Homoskedasticity
• The variance of the probability distribution of ε is constant for all settings of the independent variable x. For our straight-line model, this assumption means that the variance of ε is equal to a constant, say σ², for all values of x.
• When this assumption does not hold, we say we have a problem of heteroskedasticity.
MODEL ASSUMPTIONS

Assumption 3: Normality
The probability distribution of ε is normal.

Assumption 4: No Autocorrelation
The values of ε associated with any two observed values of y are independent; that is, the value of ε associated with one value of y has no effect on the values of ε associated with other y values.
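The four assumptions can be made concrete with a tiny simulation (illustrative only, not from the slides; the parameter values are arbitrary choices of mine):

```python
import random

# Simulated data satisfying all four assumptions by construction:
# errors are independent draws from a normal distribution with mean 0
# and constant variance, regardless of x.
random.seed(1)
beta0, beta1, sigma = -0.1, 0.7, 0.6
x = [1, 2, 3, 4, 5] * 40                    # 200 observations
eps = [random.gauss(0, sigma) for _ in x]   # normal, mean 0, constant var
y = [beta0 + beta1 * xi + e for xi, e in zip(x, eps)]
print(sum(eps) / len(eps))  # sample mean of the errors: close to 0
```

Violating any one ingredient (a nonzero error mean, variance that grows with x, skewed errors, or errors carried over between observations) breaks the corresponding assumption.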
DISC 203 – PROBABILITY & STATISTICS

Practice Set – Simple Regression
EXERCISE 1

 For five popular cars, the following information on engine size and mileage ratings was recorded:

Engine Size, x (cubic inches)   Mileage Rating, y
144                             28
232                             21
306                             23
388                             17
414                             15

 Test the null hypothesis that x contributes no information for the prediction of y against the alternative hypothesis that these variables are linearly related with a slope significantly different from zero. Use a significance level of 5%. Interpret results.
EXERCISE 1

 H0: β1 = 0
 Ha: β1 ≠ 0
 α = .05
 df = 5 − 2 = 3
 Critical values: ±3.182 (.025 in each tail)
 Test statistic: t = −4.337
 Decision: Reject H0 at α = .05
 Conclusion: There is evidence of a relationship

Since size is statistically significant, we can interpret its value: if size increases by 10 cubic inches, the mileage rating decreases by 0.43 points on average. As the p-value = 0.0226 < 0.05, the effect of size on mileage is statistically significant. We are 95% confident that an increase in size of 10 cubic inches decreases the mileage rating by between 0.11 and 0.74 points.
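The test statistic can be recomputed from the raw data (a sketch; the helper function is mine, built from the formulas on the earlier slides):

```python
import math

# t statistic for H0: beta1 = 0 from raw (x, y) data.
def slope_t(x, y):
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    b1 = ss_xy / ss_xx
    sse = ss_yy - b1 * ss_xy          # shortcut: SSE = SSyy - b1*SSxy
    s = math.sqrt(sse / (n - 2))      # standard error of the regression
    return b1 / (s / math.sqrt(ss_xx))

t = slope_t([144, 232, 306, 388, 414], [28, 21, 23, 17, 15])
print(t)  # approximately -4.34
```

Since |−4.34| > 3.182, H0 is rejected at α = .05, matching the solution above.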
EXERCISE 2

 A real estate broker has been collecting data on home sales so that he can investigate the relationship between the dependent variable, y = value of the purchased home, and the independent variable, x = annual family income of buyer. Data from six recent sales are shown in the following table:

Annual family income, x (thousands of dollars)   Value of home, y (thousands of dollars)
15.2                                             33.8
17.4                                             48.9
22.0                                             49.5
24.6                                             61.0
29.8                                             63.8
38.0                                             92.5

Test the null hypothesis that x contributes no information (at α = 0.01) for the prediction of y against the alternative that home value y tends to increase as the annual family income x increases.
EXERCISE 2

 H0: β1 = 0
 Ha: β1 > 0
 α = .01
 df = 6 − 2 = 4
 Critical value: 3.747 (.01 in the upper tail)
 Test statistic: t = 7.6
 Decision: Reject H0 at α = .01
 Conclusion: There is evidence of a positive relationship

For a one-tailed test, use half the (two-tailed) p-value. Since 0.000805 < 0.01, we reject H0 at the 1% significance level.
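As with Exercise 1, the test statistic can be checked from the raw data (a sketch; the helper is repeated so the snippet is self-contained):

```python
import math

# t statistic for H0: beta1 = 0 from raw (x, y) data.
def slope_t(x, y):
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    b1 = ss_xy / ss_xx
    sse = ss_yy - b1 * ss_xy          # shortcut: SSE = SSyy - b1*SSxy
    s = math.sqrt(sse / (n - 2))
    return b1 / (s / math.sqrt(ss_xx))

x = [15.2, 17.4, 22.0, 24.6, 29.8, 38.0]   # income ($1,000s)
y = [33.8, 48.9, 49.5, 61.0, 63.8, 92.5]   # home value ($1,000s)
t = slope_t(x, y)
print(t)  # approximately 7.60
```

Since 7.60 > 3.747, H0 is rejected in favor of a positive slope, matching the solution above.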
