Chapter 10 Regression Slides
for
Chapter 10
Correlation and Regression: Explaining Association and Causation
Marketing Research: Text and Cases
by Rajendra Nargundkar
Methods
There are basically two approaches to regression:
A hit-and-trial approach.
A pre-conceived approach.
Hit-and-Trial Approach
In the hit-and-trial approach, we collect data on a
large number of independent variables and then
try to fit a regression model using stepwise
regression, entering one variable into the
regression equation at a time.
The general (linear) regression model is of the
type
Y = a + b1x1 + b2x2 + ... + bnxn
Data
1. Input data on y and each of the x variables is
required to do a regression analysis. This data
is input into a computer package to perform the
regression analysis.
2. The output consists of the b coefficients for
all the independent variables in the model. The
output also gives the results of a t test for
the significance of each variable in the model,
and the results of the F test for the model as
a whole.
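The output described above (b coefficients, a t statistic per variable, and an F statistic for the model) can be sketched with plain least squares. This is a minimal illustration, not the computer package used for these slides: the function name `ols_summary` and the small data set are assumptions, and the data are synthetic numbers generated only to exercise the code.

```python
import numpy as np

def ols_summary(X, y):
    """Least-squares fit of y = a + b1*x1 + ... + bk*xk with t and F statistics."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])          # column of 1s carries the intercept a
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # [a, b1, ..., bk]
    resid = y - A @ coef
    sse = float(resid @ resid)                    # residual sum of squares
    sst = float(((y - y.mean()) ** 2).sum())      # total sum of squares
    df_resid = n - k - 1
    mse = sse / df_resid
    se = np.sqrt(mse * np.diag(np.linalg.inv(A.T @ A)))  # std. error of each coefficient
    r2 = 1.0 - sse / sst
    return {
        "coef": coef,
        "t": coef / se,                           # t statistic for each coefficient
        "r2": r2,
        "F": (r2 / k) / ((1.0 - r2) / df_resid),  # F statistic for the whole model
        "df": (k, df_resid),
    }

# Synthetic illustration only: 8 made-up observations on 2 predictors.
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 10, size=(8, 2))
y_demo = 1.5 + 2.0 * X_demo[:, 0] - 0.5 * X_demo[:, 1] + rng.normal(0, 0.1, size=8)
fit = ols_summary(X_demo, y_demo)
print(fit["coef"], fit["t"], fit["F"], fit["df"])
```

A statistics package would additionally report the p-level for each t and for F; those are left out here to keep the sketch dependent on NumPy alone.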
Recommended usage
1. The author recommends that the hit-and-trial approach
may be used for exploratory research. For serious
decision-making, however, there has to be a priori
knowledge of the variables which are likely to affect y,
and only such variables should be used in the regression
analysis.
2. It is also recommended that unless the model is itself
significant at the desired confidence level (as evidenced
by the F test results printed out for the model), the R
value should not be interpreted.
Dependent Variable
Y = sales in Rs. lakhs in the territory
Independent Variables
X1 = market potential in the territory (in Rs. lakhs).
X2 = No. of dealers of the company in the territory.
X3 = No. of salespeople in the territory.
X4 = Index of competitor activity in the territory on
a 5-point scale (1 = low, 5 = high level of
activity by competitors).
X5 = No. of service people in the territory.
X6 = No. of existing customers in the territory.
Columns: 2 POTENTIAL, 3 DEALERS, 4 PEOPLE, 5 COMPET, 6 SERVICE, 7 CUSTOM
[Data table: only part of the values survive in this copy, listed in reading order]
25 20 60 150 12 30 50 20 45 15 25 11 30 10 20 45 75 12 20 30
10 16 15 29 18 30 22 43 16 40 29 70 15 39 10 40 11 16 40 11
17 12 25 10 13 18 32 14 31 14 23 73 10 10 43 15 81 150 15 35 70
Correlation
First, let us look at the correlations of all the variables with each
other. The correlation table (output from the computer for the
Pearson Correlation procedure) is shown in Fig. 2. The values in
the correlation table are standardised, and range from -1 to +1.
STAT. MULTIPLE REGRESS.
Correlations (regdata1.sta)
Variable    POTENTIAL  DEALERS  PEOPLE  COMPET  SERVICE  CUSTOM  SALES
POTENTIAL     1.00       .84     .88     .14      .61     .83     .94
DEALERS        .84      1.00     .85    -.08      .68     .86     .91
PEOPLE         .88       .85    1.00    -.04      .79     .85     .95
COMPET         .14      -.08    -.04    1.00     -.18    -.01    -.05
SERVICE        .61       .68     .79    -.18     1.00     .82     .73
CUSTOM         .83       .86     .85    -.01      .82    1.00     .88
SALES          .94       .91     .95    -.05      .73     .88    1.00
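A correlation matrix like this one is enough to sketch how entering one variable at a time can work. The block below is a simplified illustration, not the procedure the statistics package actually follows: at each step it adds the predictor that raises R2 the most, using the standard result that the standardised coefficients solve R_xx * beta = r_xy. The function name `betas_and_r2` is an assumption, the loop is stopped after two steps, and the inputs are the two-decimal correlations from the table, so the results are approximate.

```python
import numpy as np

names = ["POTENTIAL", "DEALERS", "PEOPLE", "COMPET", "SERVICE", "CUSTOM", "SALES"]
# Rounded correlations copied from the table in these slides.
R = np.array([
    [1.00,  .84,  .88,  .14,  .61,  .83,  .94],
    [ .84, 1.00,  .85, -.08,  .68,  .86,  .91],
    [ .88,  .85, 1.00, -.04,  .79,  .85,  .95],
    [ .14, -.08, -.04, 1.00, -.18, -.01, -.05],
    [ .61,  .68,  .79, -.18, 1.00,  .82,  .73],
    [ .83,  .86,  .85, -.01,  .82, 1.00,  .88],
    [ .94,  .91,  .95, -.05,  .73,  .88, 1.00],
])
Y = names.index("SALES")

def betas_and_r2(predictors):
    """Standardised coefficients and R^2 for SALES regressed on the given predictors."""
    idx = [names.index(p) for p in predictors]
    r_xx = R[np.ix_(idx, idx)]          # correlations among the predictors
    r_xy = R[idx, Y]                    # correlation of each predictor with SALES
    beta = np.linalg.solve(r_xx, r_xy)  # solves r_xx @ beta = r_xy
    return beta, float(r_xy @ beta)

# Greedy forward selection, stopped after two steps for illustration.
chosen = []
for _ in range(2):
    candidates = [v for v in names if v != "SALES" and v not in chosen]
    best = max(candidates, key=lambda v: betas_and_r2(chosen + [v])[1])
    chosen.append(best)

beta, r2 = betas_and_r2(chosen)
print(chosen, beta.round(3), round(r2, 3))
```

With these rounded inputs the sketch enters PEOPLE first and POTENTIAL second, with standardised coefficients of roughly 0.54 and 0.46 and R2 of about 0.95. That is close to the BETA values in the two-variable regression output later in these slides; the small differences come from rounding the correlations to two decimals. A real stepwise routine would also apply an entry criterion (such as a minimum F) before adding each variable.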
Regression
We will first run a regression model of the
following form, entering all six x variables in
the model:
Y = a + b1x1 + b2x2 + b3x3 + b4x4 + b5x5 + b6x6 ... (Equation 1)
and determine the values of a, b1, b2, b3, b4, b5 and b6.
Regression Output:
The results (output) of this regression model are in
Fig. 4 in table form.
Column 4 of the table, titled B, lists all the
coefficients for the model. According to this,
a (intercept) = -3.17298
b1 = .22685
b2 = .81938
b3 = 1.09104
b4 = -1.89270
b5 = -0.54925
b6 = 0.06594
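Substituting these coefficients into Equation 1 gives a prediction formula for sales. The sketch below only shows the arithmetic; the function name `predict_sales`, the keyword names, and the example territory are all hypothetical and are not taken from the data set.

```python
# Coefficients as listed above for Equation 1 (column B of the output).
INTERCEPT = -3.17298
COEFFS = {
    "potential": 0.22685,   # b1, market potential (Rs. lakhs)
    "dealers": 0.81938,     # b2, number of dealers
    "people": 1.09104,      # b3, number of salespeople
    "compet": -1.89270,     # b4, competitor activity index (1 to 5)
    "service": -0.54925,    # b5, number of service people
    "custom": 0.06594,      # b6, number of existing customers
}

def predict_sales(**x):
    """Predicted sales (Rs. lakhs) from the fitted six-variable equation."""
    missing = set(COEFFS) - set(x)
    if missing:
        raise ValueError(f"missing inputs: {sorted(missing)}")
    return INTERCEPT + sum(COEFFS[name] * x[name] for name in COEFFS)

# Hypothetical territory, invented purely to show the arithmetic.
example = dict(potential=50, dealers=5, people=10, compet=3, service=4, custom=30)
print(round(predict_sales(**example), 2))  # 17.28
```

For this made-up territory the equation predicts sales of about Rs. 17.28 lakhs.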
STAT. MULTIPLE REGRESS.
Regression summary for Dependent Variable: SALES
R = .98831786   R2 = .97677220   Adjusted R2 = .96748108
F(4,10) = 105.13   p < .00000   Std. Error of estimate: 3.9637
N = 15
            BETA      St.Err.   B          St.Err.    t(10)     p-level
                      of BETA              of B
Intercept                       -3.74194   4.8477683  -.77190   .458025
People      .390134   .115138   1.02822    .303453    3.38841   .006904
Potential   .462686   .117988   .23905     .060959    3.92147   .002860
Dealers     .180700   .102687   .90109     .512065    1.75971   .108955
Compet      -.081195  .053434   -1.81074   1.191624   -1.51955  .159589
STAT. MULTIPLE REGRESS.
Regression summary for Dependent Variable: SALES
R = .97975624   R2 = .95992229   Adjusted R2 = .95324267
F(2,12) = 143.71   p < .00000   Std. Error of estimate: 4.7528
N = 15
            BETA      St.Err.   B          St.Err.    t(12)     p-level
                      of BETA              of B
Intercept                       -10.6164   2.659532   -3.99183  .001788
POTENTL     .470825   .120127   .2433      .062065    3.91939   .000728
PEOPLE      .540454   .120127   1.4244     .316602    4.49902   .000728
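The summary figures in these two outputs are linked by standard formulas, which makes them easy to cross-check: F = (R2 / k) / ((1 - R2) / (N - k - 1)), Adjusted R2 = 1 - (1 - R2)(N - 1) / (N - k - 1), and t = B / St.Err. of B, where k is the number of predictors. The sketch below applies them to the printed values; the function names are assumptions made for illustration.

```python
def f_statistic(r2, n, k):
    """F for the whole model, with (k, n - k - 1) degrees of freedom."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

def adjusted_r2(r2, n, k):
    """R^2 penalised for the number of predictors k."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

n = 15  # territories
models = {
    "four predictors": (0.97677220, 4),
    "two predictors": (0.95992229, 2),
}
for label, (r2, k) in models.items():
    print(label, round(f_statistic(r2, n, k), 2), round(adjusted_r2(r2, n, k), 8))

# t for a coefficient is its estimate divided by its standard error,
# here for PEOPLE in the two-predictor model.
t_people = 1.4244 / 0.316602
print(round(t_people, 3))
```

The results agree with the printed F values (105.13 and 143.71), the printed Adjusted R2 values, and the t of about 4.499 for PEOPLE. Dropping two predictors lowers R2 only slightly while the F statistic rises.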
Additional comments
1. As we can see from the example discussed,
regression analysis is a very simple (particularly on a
computer) and useful technique for predicting one metric
dependent variable from a set of metric
independent variables. Its use, however, gets more
complex if, for instance, the independent variables are
nominally scaled into two (dichotomous) or more
(polytomous) categories.
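One common way of handling a nominally scaled independent variable is dummy (indicator) coding, in which each category except a chosen baseline becomes a 0/1 column that can enter the regression like any other x variable. The slides do not prescribe a method, so the following is only a sketch of that general technique; the function name `dummy_code` and the region variable are hypothetical.

```python
def dummy_code(values, baseline=None):
    """Turn a nominal variable into 0/1 indicator columns, dropping one baseline category."""
    categories = sorted(set(values))
    if baseline is None:
        baseline = categories[0]
    kept = [c for c in categories if c != baseline]
    rows = [[1 if v == c else 0 for c in kept] for v in values]
    return kept, rows

# Hypothetical nominal predictor: the region a territory belongs to.
regions = ["north", "south", "west", "south", "north"]
columns, coded = dummy_code(regions, baseline="north")
print(columns)  # ['south', 'west']
print(coded)    # [[0, 0], [1, 0], [0, 1], [1, 0], [0, 0]]
```

A dichotomous variable needs one such column and a polytomous variable with m categories needs m - 1; the coefficient on each column is then read relative to the baseline category.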