Ch9- Correlation Regression

class notes of Quantitative Business Methods

Uploaded by

zainab.jh88
Copyright
© All Rights Reserved

Quantitative Business Methods

QBM
Chapter 9

Correlation and Regression

Two continuous variables
• Correlation and regression are concerned
with the investigation of two continuous
variables.

• Previously we have only considered a single variable; now we look at two associated variables.
We might wish to know:
• If a relationship exists between those variables
• if so, how strong that relationship is
• what form that relationship takes.

• Can we make use of that relationship for predictive purposes, i.e. forecasting?

• Correlation describes the strength of the relationship. It is not concerned with 'cause' and 'effect'.

• Regression describes the relationship itself in the form of a straight line equation which best fits the data.
• Some initial insight into the relationship between two
continuous variables can be obtained by plotting a scatter
diagram and simply looking at the resulting graph.

• Does the relationship seem to be linear or curved?


• If there appears to be a linear relationship, can it be quantified?
• A correlation coefficient is calculated as the
measure of the strength of this relationship.
• Its symbol is 'r' and its value lies between
-1 and +1.

• Is the association between the two variables strong
enough to be useful?

• If the relationship is found to be significantly strong, its nature can be found using linear regression.
• This defines the equation of the straight line of
'best fit' through the bi-variate data, y = a + bx .
• For example, £x spent on Advertising is expected
to increase Sales by £y.

• The 'goodness of fit' can be calculated
to see how well the line fits the data.
• Once defined by an equation, the
relationship can be used for
predictive purposes.

Example
'Ice cream Sales' for a particular firm of manufacturers and 'Average Monthly Temperature' are:

Month        Av. Temp (°C)   Sales (£'000)
January            4               73
February           4               57
March              7               81
April              8               94
May               12              110
June              15              124
July              16              134
August            17              139
September         14              124
October           11              103
November           7               81
December           5               80

From this data we need:
• Scatter diagram
• Correlation coefficient
• Regression line
• Goodness of fit
• Prediction

Scatter diagrams
We look for a linear relationship with the
bivariate points plotted being reasonably
close to the, yet unknown, 'line of best fit'.
• Plot the independent variable, x, on the horizontal axis.
• Plot the potentially dependent variable on the vertical, y, axis.
• (Minitab output shown)
• Looks promising: a straight line relationship, with all points fairly close to a 'line of best fit'.

[Figure: scatter diagram, Sales against Average Monthly Temperature; x-axis: Av. Temp., y-axis: Sales]

• Pearson's Correlation Coefficient (r)
• (for quantitative data only)

• This quantifies the strength of a linear relationship

• Calculation of Correlation coefficient
• Input data to calculator
• Best to use a calculator in 'A+BX type' or 'LR mode', as will be demonstrated in tutorials.
• (Method in specific calculator manual)
• (Without 'A+BX type' or 'LR mode', more complex formulae and methods are needed; these are also in the textbook or handout.)
• Correlation coefficient, r (output from calculator): r = 0.9833
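
For anyone without an 'LR mode' calculator, the same value can be checked in a few lines of Python. This is only a sketch using the standard product-moment formula, r = Sxy / √(Sxx × Syy); the function name pearson_r and the variable names are illustrative and not taken from the course materials.

```python
from math import sqrt

# Average monthly temperature (°C) and ice cream sales (£'000), January to December
temps = [4, 4, 7, 8, 12, 15, 16, 17, 14, 11, 7, 5]
sales = [73, 57, 81, 94, 110, 124, 134, 139, 124, 103, 81, 80]


def pearson_r(x, y):
    """Pearson's correlation coefficient for paired quantitative data."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_yy = sum((yi - mean_y) ** 2 for yi in y)
    return s_xy / sqrt(s_xx * s_yy)


r = pearson_r(temps, sales)
print(f"r = {r:.4f}")  # prints "r = 0.9833"
```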

Is this correlation coefficient, 0.9833, significant?
Hypothesis test for a Pearson’s correlation coefficient
• H0: There is no association between ice-cream sales and
average monthly temperature.
• H1: There is an association between them.
• Critical Value:
• Pearson's correlation coefficient tables, 5%, 10 degrees of freedom (n - 2) = 0.576
• Test statistic: 0.983
• Conclusion: The test statistic exceeds the critical value
so we reject the Null Hypothesis, H0, and conclude that
there is a significant association between ice-cream sales
and average monthly temperature.
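
The decision rule can be written out as a short Python sketch. The comparison of |r| with 0.576 follows the slide. The t statistic and the critical t value of 2.228 (10 degrees of freedom, 5%, two-tailed) are not part of the lecture; they are included only as an assumed cross-check, since a critical r can be recovered from a critical t as t / √(t² + df).

```python
from math import sqrt

r = 0.9833          # sample correlation coefficient (calculator output)
n = 12              # number of (temperature, sales) pairs
df = n - 2          # degrees of freedom for the test
r_critical = 0.576  # tabulated critical value: 5%, 10 degrees of freedom

# Decision rule from the slide: reject H0 when |r| exceeds the critical value
reject_h0 = abs(r) > r_critical

# Cross-check (assumption, not from the lecture): equivalent t statistic
t_stat = r * sqrt(df) / sqrt(1 - r ** 2)

# Critical r recovered from the critical t for 10 df, 5% two-tailed
t_critical = 2.228
r_from_t = t_critical / sqrt(t_critical ** 2 + df)

print("Reject H0:", reject_h0)
print("t statistic:", round(t_stat, 1))
print("critical r from t:", round(r_from_t, 3))
```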

Regression equation (y = a + bx)
• There is a significant relationship between the two
variables, so the next step is to define it as a
regression equation.

• This can be produced directly from a calculator in 'A+BX type' or 'LR mode' as shown in your manual.
• (If without 'A+BX type' or 'LR mode' complex
formulae and methods are needed, also in
textbook or handout.)

• The regression line is described, in general, as
the straight line of ‘best fit’ with the equation:
• y = a + bx
• where x and y are the independent and dependent
variables, a the intercept on the y-axis, and b the
slope of the line.
• For this data the values are: a = 45.5, b = 5.45
• Giving the regression equation:
• y = 45.5 + 5.45x
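
As a check on the calculator output, the intercept and slope can be reproduced with the usual least-squares formulae, b = Sxy / Sxx and a = ȳ - b x̄. The sketch below assumes nothing beyond the table of data; the function name fit_line is illustrative.

```python
# Average monthly temperature (°C) and ice cream sales (£'000), January to December
temps = [4, 4, 7, 8, 12, 15, 16, 17, 14, 11, 7, 5]
sales = [73, 57, 81, 94, 110, 124, 134, 139, 124, 103, 81, 80]


def fit_line(x, y):
    """Least-squares intercept a and slope b for the line y = a + bx."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b = s_xy / s_xx          # slope
    a = mean_y - b * mean_x  # intercept: the line passes through the centroid
    return a, b


a, b = fit_line(temps, sales)
print(f"y = {a:.1f} + {b:.2f}x")  # prints "y = 45.5 + 5.45x"
```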
Draw this line on the scatter diagram:

Plot any three points and join them up.
Useful points: (0, a); the centroid (x̄, ȳ); any other points calculated from the regression equation:

E.g. If x = 15; y = 45.5 + 5.45 × 15 = 127.2

For any value of x the corresponding value of y can be found directly from the calculator [ŷ].

[Figure: Scatter diagram with Regression line. Sales against Average Monthly Temperature; x-axis: Av. Temp., y-axis: Sales]

Goodness of Fit
• How well does this line fit the data?

• Goodness of fit is measured by (r² × 100)%.

• The correlation coefficient 'r' was 0.983 so Goodness of Fit = (0.983)² × 100 = 96.6% fit.

• This indicates the percentage of the variation in Ice-cream Sales accounted for by the variation in Average monthly temperature.
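
The 'percentage of variation accounted for' reading can be illustrated by splitting the variation in Sales into a total part and an unexplained part left over after fitting the line. This is a sketch with illustrative variable names (sst, sse); it computes r² as 1 - SSE/SST, which for a least-squares straight line equals the square of the correlation coefficient. At full precision the result is about 96.7%; the 96.6% on the slide comes from rounding r to 0.983 before squaring.

```python
# Average monthly temperature (°C) and ice cream sales (£'000), January to December
temps = [4, 4, 7, 8, 12, 15, 16, 17, 14, 11, 7, 5]
sales = [73, 57, 81, 94, 110, 124, 134, 139, 124, 103, 81, 80]

n = len(temps)
mean_x = sum(temps) / n
mean_y = sum(sales) / n

# Least-squares line y = a + bx
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(temps, sales))
s_xx = sum((x - mean_x) ** 2 for x in temps)
b = s_xy / s_xx
a = mean_y - b * mean_x

fitted = [a + b * x for x in temps]

# Total variation in sales, and the part the line does not account for
sst = sum((y - mean_y) ** 2 for y in sales)
sse = sum((y - f) ** 2 for y, f in zip(sales, fitted))

r_squared = 1 - sse / sst
print(f"{100 * r_squared:.1f}%")  # prints "96.7%"
```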

Prediction of Sales

• Suppose that the Ice-cream manufacturer knows that the estimated average temperature for the following month is 14°C. What would be his expected Sales?

• Substitute 14 for the independent variable, x, and calculate the corresponding value of the Sales, y. This can more easily be produced directly from your calculator: type in 14, find [ŷ].

• Estimated Sales: 45.5 + 5.45 × av. temp. = 45.5 + 5.45 × 14 = 121.8

• Expected sales would be £122 000
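
The substitution can be wrapped in a small function, which plays the same role as the calculator's [ŷ] key. A minimal sketch, assuming the rounded coefficients a = 45.5 and b = 5.45 from the regression equation; the name predict_sales is illustrative.

```python
def predict_sales(temp_c, a=45.5, b=5.45):
    """Expected sales (£'000) for an average monthly temperature in °C."""
    return a + b * temp_c


estimate = predict_sales(14)
print(f"{estimate:.1f}")  # prints "121.8", i.e. about £122 000
```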


Further regression modelling

• This lecture has concentrated on the production of a regression model but has not gone on to decide how good this model is.

• At the moment we have only one model, but further exploration by residual analysis is essential for comparing models.

• For residual analysis to see if this is a good model see Section 9.8 of Business Statistics for Non-Mathematicians and the computer worksheets.
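
As a first step towards residual analysis, the residuals (observed Sales minus the Sales given by the line) can be listed month by month. This is only a sketch of that first step, assuming the rounded coefficients a = 45.5 and b = 5.45; how the residuals should be judged is covered in the textbook section and worksheets, not here.

```python
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
temps = [4, 4, 7, 8, 12, 15, 16, 17, 14, 11, 7, 5]
sales = [73, 57, 81, 94, 110, 124, 134, 139, 124, 103, 81, 80]

a, b = 45.5, 5.45  # rounded regression coefficients

# Residual = observed sales minus sales given by the regression line
residuals = [y - (a + b * x) for x, y in zip(temps, sales)]

for month, res in zip(months, residuals):
    print(f"{month}: {res:+.2f}")

# Month whose sales lie furthest from the line
worst = max(zip(months, residuals), key=lambda pair: abs(pair[1]))
print("Largest residual:", worst[0], round(worst[1], 1))
```

A pattern in the signs or sizes of the residuals, rather than a random scatter about zero, would suggest that a straight line is not the best model.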
• Your computer worksheet enables you to produce two different models and compare their merits. Further regression methods, applicable if your data is not linear, are discussed in Section 9.11.

• All the formulae and methods needed to carry out correlation and regression follow in the Appendices but it is hoped that you see the merits of investing in a calculator that does it all for you!

In this lecture we have concentrated
Next lecture:
Summary

In this Chapter we have looked at


Questions
