
Correlation Regression 1

The document discusses correlation and linear regression analysis. It provides examples of how sales volume may correlate with advertising expenditure. A scatter plot and correlation coefficient can show the strength and direction of correlation between two variables. Linear regression finds the linear relationship between a dependent variable (like sales) and one or more independent variables (like advertising, price). The regression equation can be used to predict future values and understand the impact of changes in independent variables.

Uploaded by

Anand

Correlation

Sales and Advertising


• Consider the following data on sales volume and advertising expenditure for 5 weeks.
Advertising ($100) : 5 3 7 2 8
Sales ($1,000)     : 11 8 10 6 12
• Observe that sales volume tends to increase with advertising expenditure.
[Figure: scatter plot of Sales ($1,000) against Advertising ($100)]
• Sometimes two variables are related to each
other.
• The values of both of the variables are paired.
• A change in the value of one variable is reflected in a change in the value of the other.
• Usually these two variables are two attributes of each member of the population.
• For Example:
  Height and Weight
  Advertising Expenditure and Sales Volume
  Unemployment and Crime Rate
  Rainfall and Food Production
  Expenditure and Savings
Scatter Plots

[Figure: scatter plot panels labelled Positively Correlated, Negatively Correlated, Loosely Correlated, Strongly Correlated, and Not Correlated]


Properties of Correlation Coefficient

• Values close to 0 indicate little or no correlation.
• Values close to +1 indicate very strong positive correlation.
• Values close to -1 indicate very strong negative correlation.
• The sign (- or +) indicates the direction of the relationship.
• Correlation Coefficient measures the strength of
linear relationship.

• r = 0 does not necessarily imply that the two variables are unrelated.
• A relationship may exist, but it is not a linear one.

• The correlation between two random variables X and Y is a measure of the degree of linear association between the two variables.
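The properties above can be checked numerically. The following is a minimal sketch, assuming the usual Pearson formula for r (the slides do not state the formula); the helper name `pearson_r` is illustrative. It uses the 5-week advertising and sales data, and then a perfectly curved relationship (y = x squared) for which r is still zero.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

advertising = [5, 3, 7, 2, 8]   # advertising expenditure ($100)
sales = [11, 8, 10, 6, 12]      # sales volume ($1,000)
r_sales = pearson_r(advertising, sales)
print(round(r_sales, 3))        # close to +1: strong positive correlation

# A non-linear relationship: y depends exactly on x, yet r is 0.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]
r_curve = pearson_r(xs, ys)
print(r_curve)
```

The second result illustrates why r = 0 only rules out a linear association, not every kind of relationship.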
Caselet:
The operations manager of a musical instrument distributor
feels that demand for a particular type of guitar may be
related to the number of YouTube views for a music video
by the popular rock group Marble Pumpkins during the
preceding month. The manager has collected the data
shown in the following table:
YouTube Views (1,000s)   Guitar Sales
30                       8
40                       11
70                       12
60                       10
80                       15
50                       13
[Figure: scatter plot of Guitar Sales against YouTube Views (1,000s)]
r = 0.77, moderately positively correlated
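The reported value of r can be reproduced from the six data pairs. This is a small sketch assuming the standard Pearson formula; the variable names are illustrative only.

```python
import math

views = [30, 40, 70, 60, 80, 50]    # YouTube views (1,000s)
guitars = [8, 11, 12, 10, 15, 13]   # guitar sales

n = len(views)
mean_v = sum(views) / n
mean_g = sum(guitars) / n

# Deviation sums used by the Pearson formula.
s_vg = sum((v - mean_v) * (g - mean_g) for v, g in zip(views, guitars))
s_vv = sum((v - mean_v) ** 2 for v in views)
s_gg = sum((g - mean_g) ** 2 for g in guitars)

r = s_vg / math.sqrt(s_vv * s_gg)
print(f"r = {r:.2f}")  # r = 0.77
```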
Regression
Regression Analysis
• Having determined the correlation between X and Y, we
wish to determine a mathematical relationship between
them.
• Dependent variable: the variable you wish to explain
• Independent variables: the variables used to explain the
dependent variable
• Regression analysis is used to:
  - Predict the value of a dependent variable based on the value of an independent variable
  - Explain the impact of changes in an independent variable on the dependent variable
Types of Relationships
[Figure: scatter plot panels of Y against X illustrating linear relationships, curvilinear relationships, strong relationships, weak relationships, and no relationship]
Simple Linear Regression Analysis
• The simplest mathematical relationship is
• Y = a + bX + error (linear)
• Changes in Y are related to the changes in X
• What are the most suitable values of a (intercept) and b (slope)?

[Figure: the line y = a + b.x plotted as Y against X, with intercept a and slope b per unit increase in X]
Method of Least Squares
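[Figure: a data point (xi, yi) and the fitted line plotted as Y against X; ERROR marks the gap between the point and the line at xi]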
The best-fitting line is the one for which the sum of the squared ERRORS over all data points is minimum.
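For reference, the least-squares criterion and its standard solution for a straight line can be written as follows. This is the textbook result rather than something stated on the slides; a and b are the intercept and slope as above, and the sample means (written with bars) are notation added here.

```latex
\min_{a,\,b}\ \sum_{i=1}^{n} \left( y_i - a - b\,x_i \right)^2
\quad\Longrightarrow\quad
b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
a = \bar{y} - b\,\bar{x}
```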
Least Squares Procedure
YouTube Views Vs Guitar Sales
YouTube Views (1,000s)   Guitar Sales
30                       8
40                       11
70                       12
60                       10
80                       15
50                       13

Regression line: Y=a + b X


Y: Sales; X: Views
[Figure: scatter plot of Guitar Sales against YouTube Views (1,000s) with the fitted line y = 0.1x + 6, R² = 0.5932]

• Predict Sales for 90,000 YouTube Views


Put X = 90 in the regression equation and solve:
Y = 0.1(90) + 6 = 15
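The fitted line and the prediction can be reproduced with a few lines of code. This is a minimal sketch using the standard least-squares formulas for slope and intercept; the function name `fit_line` is a hypothetical helper, not part of the original material.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx            # slope
    a = mean_y - b * mean_x    # intercept
    return a, b

views = [30, 40, 70, 60, 80, 50]    # YouTube views (1,000s)
guitars = [8, 11, 12, 10, 15, 13]   # guitar sales

a, b = fit_line(views, guitars)
print(f"Y = {a:.1f} + {b:.1f} X")

# Prediction for 90,000 views, i.e. X = 90 in units of 1,000 views.
predicted = a + b * 90
print(f"Predicted sales at X = 90: {predicted:.1f}")
```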
Coefficient of Determination: R²
• R² is used to examine the adequacy of the fitted linear model to the given data.
• The fraction of the total variation in the dependent variable that is explained by the regression is given by R².
• In simple linear regression, it can be computed as the square of the correlation coefficient.
Note: If r = 0.9, then r² = 0.81, and so R² = 0.81.
Thus, 81% of the total variation in y is explained by the regression line developed using x, that is, by the variation in x.
The remaining 19% (= 100 - 81) is due to other factors not captured by the model.
• 0 ≤ R² ≤ 1
• R² close to 1 means that the regression explains most of the variability in Y. (Fit is good)
• R² close to 0 means that the regression does not explain much of the variability in Y. (Fit is not good)
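As a check on the "fraction of variation explained" reading, the sketch below computes the coefficient of determination for the guitar data in two ways: as 1 minus the ratio of unexplained to total variation, and as the square of r. The names SSE and SST for the two sums of squares are conventional labels assumed here, not taken from the slides.

```python
import math

views = [30, 40, 70, 60, 80, 50]    # YouTube views (1,000s)
guitars = [8, 11, 12, 10, 15, 13]   # guitar sales

n = len(views)
mean_x = sum(views) / n
mean_y = sum(guitars) / n
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(views, guitars))
s_xx = sum((x - mean_x) ** 2 for x in views)

# Least-squares line y = a + b*x.
b = s_xy / s_xx
a = mean_y - b * mean_x

# SST: total variation in y; SSE: variation left unexplained by the line.
sst = sum((y - mean_y) ** 2 for y in guitars)
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(views, guitars))
r_squared = 1 - sse / sst

# Square of the correlation coefficient, for comparison.
r = s_xy / math.sqrt(s_xx * sst)
print(round(r_squared, 4), round(r ** 2, 4))
```

Both values agree with the R² = 0.5932 shown on the fitted-line chart, so roughly 59% of the variation in guitar sales is explained by YouTube views.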
Multiple Linear Regression
• In simple linear regression analysis, we fit the linear
relation between one independent variable (X) and
one dependent variable (Y).
• We regress Y on only one regressor variable X.
• In some situations, Y may depend on more than one regressor variable.
• Y is then regressed on more than one regressor variable.
• For Example:
• Cost -> Labor cost, Electricity cost, Raw material
cost
• Salary -> Education, Experience
• Sales -> Cost, Advertising Expenditure
• The multiple linear regression model is
• $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki}, \quad i = 1, 2, \ldots, n$

• Yi is the dependent variable.
• X1, X2, …, Xk are independent variables.
• β0, β1, β2, …, βk are regression coefficients.
• We estimate the unknown regression coefficients β0, β1, β2, …, βk using the observed data.
• Example:
• A distributor of frozen dessert pies wants to
evaluate factors which influence the demand
• Dependent variable:
• Pie sales (units per week)
• Independent variables:
• Price (in $)
• Advertising Expenditure ($100’s)
• Data collected for 15 weeks
Week   Pie Sales   Price ($)   Advertising ($100s)
1      350         5.50        3.3
2      460         7.50        3.3
3      350         8.00        3.0
4      430         8.00        4.5
5      350         6.80        3.0
6      380         7.50        4.0
7      430         4.50        3.0
8      470         6.40        3.7
9      450         7.00        3.5
10     490         5.00        4.0
11     340         7.20        3.5
12     300         7.90        3.2
13     440         5.90        4.0
14     450         5.00        3.5
15     300         7.00        2.7

               Coefficients
Intercept      306.5261933
Price ($)      -24.97508952
Adv ($100s)    74.13095749

Multiple regression equation:
Sales = b0 + b1X1 + b2X2
where X1 = Price and X2 = Advertising

Sales = 306.53 – 24.98 X1 + 74.13 X2

b1 = -24.98: sales will decrease, on average, by 24.98 pies per week for each $1 increase in selling price, while advertising expenses are kept fixed.

b2 = 74.13: sales will increase, on average, by 74.13 pies per week for each $100 increase in advertising, while selling price is kept fixed.
• Predict sales for a week in which the selling price
is $5.50 and advertising is $350:
• Sales = 306.53 – 24.98 X1 + 74.13 X2
• = 306.53 – 24.98 (5.50) + 74.13 (3.5)
• = 428.62 (computed with the unrounded coefficients; the rounded values shown give 428.60)

Predicted sales are 428.62 pies.

Note that Advertising is in $100's, so $350 means that X2 = 3.5.
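The coefficients and the prediction can be reproduced from the 15 weeks of data. The sketch below solves the least-squares normal equations for two regressors directly in plain Python; this is one standard way to obtain the estimates and is not necessarily how the original output was produced. The helper `centered_sum` is a hypothetical name.

```python
def centered_sum(us, vs):
    """Sum of (u - mean_u) * (v - mean_v) over paired observations."""
    mean_u = sum(us) / len(us)
    mean_v = sum(vs) / len(vs)
    return sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))

sales = [350, 460, 350, 430, 350, 380, 430, 470,
         450, 490, 340, 300, 440, 450, 300]            # pies per week
price = [5.50, 7.50, 8.00, 8.00, 6.80, 7.50, 4.50, 6.40,
         7.00, 5.00, 7.20, 7.90, 5.90, 5.00, 7.00]     # $
advertising = [3.3, 3.3, 3.0, 4.5, 3.0, 4.0, 3.0, 3.7,
               3.5, 4.0, 3.5, 3.2, 4.0, 3.5, 2.7]      # $100s

# Normal equations in centered form for Sales = b0 + b1*X1 + b2*X2.
s11 = centered_sum(price, price)
s22 = centered_sum(advertising, advertising)
s12 = centered_sum(price, advertising)
s1y = centered_sum(price, sales)
s2y = centered_sum(advertising, sales)

# Solve the 2x2 system by Cramer's rule.
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det

n = len(sales)
b0 = sum(sales) / n - b1 * sum(price) / n - b2 * sum(advertising) / n
print(f"Sales = {b0:.2f} + ({b1:.2f}) X1 + {b2:.2f} X2")

# Price $5.50 and advertising $350, i.e. X2 = 3.5 in $100s.
predicted = b0 + b1 * 5.50 + b2 * 3.5
print(f"Predicted sales: {predicted:.2f} pies")
```

For more than two regressors a linear-algebra routine would be the more practical choice; the hand-solved 2x2 system is used here only to keep the sketch dependency-free.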
