PESU Unit 4 MA
Brijesh Singh
Department of Management Studies
MARKETING ANALYTICS
Factor Analysis for Data Reduction
Introduction
2. The worked-out example in this chapter will help clarify the
use of Factor Analysis in Marketing Research.
The input data, containing the responses of twenty respondents to the 10 statements, are in Appendix 1,
in the form of a 20-row by 10-column matrix (reproduced below).
Example – Input Data
                      QUESTION NO.
S. No.   1   2   3   4   5   6   7   8   9  10
  1      1   4   1   6   5   6   5   2   3   2
  2      2   3   2   4   3   3   3   5   5   2
  3      2   2   2   1   2   1   1   7   6   2
  4      5   1   4   2   2   2   2   3   2   3
  5      1   2   2   5   4   4   4   1   1   2
  6      3   2   3   3   3   3   3   6   5   3
  7      2   2   5   1   2   1   2   4   4   5
  8      4   4   3   4   4   5   3   2   3   3
  9      2   3   2   6   5   6   5   1   4   1
 10      1   4   2   2   1   2   1   4   4   1
 11      1   5   1   3   2   3   2   2   2   1
 12      1   6   1   1   1   1   1   1   2   2
 13      3   1   4   4   4   3   3   6   5   3
 14      2   2   2   2   2   2   2   1   3   2
 15      2   5   1   3   2   3   2   2   1   6
 16      5   6   3   2   1   3   2   5   5   4
 17      1   4   2   2   1   2   1   1   1   3
 18      2   3   1   1   2   2   2   3   2   2
 19      3   3   2   3   4   3   4   3   3   3
 20      4   3   2   7   6   6   6   2   3   6
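Since the correlation matrix is the first computational step in factor analysis, a minimal sketch of how a single pairwise correlation could be computed from the input matrix may be useful. The helper names `column` and `pearson`, and the choice of questions 4 and 6, are illustrative assumptions and not part of the original example.

```python
from math import sqrt

# Responses of the 20 respondents (rows) to the 10 statements (columns),
# transcribed from the input data table.
DATA = [
    [1, 4, 1, 6, 5, 6, 5, 2, 3, 2],
    [2, 3, 2, 4, 3, 3, 3, 5, 5, 2],
    [2, 2, 2, 1, 2, 1, 1, 7, 6, 2],
    [5, 1, 4, 2, 2, 2, 2, 3, 2, 3],
    [1, 2, 2, 5, 4, 4, 4, 1, 1, 2],
    [3, 2, 3, 3, 3, 3, 3, 6, 5, 3],
    [2, 2, 5, 1, 2, 1, 2, 4, 4, 5],
    [4, 4, 3, 4, 4, 5, 3, 2, 3, 3],
    [2, 3, 2, 6, 5, 6, 5, 1, 4, 1],
    [1, 4, 2, 2, 1, 2, 1, 4, 4, 1],
    [1, 5, 1, 3, 2, 3, 2, 2, 2, 1],
    [1, 6, 1, 1, 1, 1, 1, 1, 2, 2],
    [3, 1, 4, 4, 4, 3, 3, 6, 5, 3],
    [2, 2, 2, 2, 2, 2, 2, 1, 3, 2],
    [2, 5, 1, 3, 2, 3, 2, 2, 1, 6],
    [5, 6, 3, 2, 1, 3, 2, 5, 5, 4],
    [1, 4, 2, 2, 1, 2, 1, 1, 1, 3],
    [2, 3, 1, 1, 2, 2, 2, 3, 2, 2],
    [3, 3, 2, 3, 4, 3, 4, 3, 3, 3],
    [4, 3, 2, 7, 6, 6, 6, 2, 3, 6],
]


def column(data, question):
    """Return all responses to one question (1-based question number)."""
    return [row[question - 1] for row in data]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Questions 4 and 6 move together across respondents, so they are
# candidates for loading on the same underlying factor.
r_46 = pearson(column(DATA, 4), column(DATA, 6))
print(f"r(Q4, Q6) = {r_46:.3f}")
```

Repeating the call for every pair of questions would give the full 10 by 10 correlation matrix that factor extraction starts from.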
Steps in Factor Analysis
Steps in Factor Analysis: The Correlation Matrix
Steps in Factor Analysis: Factor Extraction
Interpretation of the Output
1. The first step in interpreting the output is to look at the factors extracted, their eigenvalues and
the cumulative percentage of variance (Fig. 3, reproduced below).
2. We note that three factors have been extracted, based on our criterion
that only factors with eigenvalues of 1 or more should be extracted.
We see from the Cum. Pct. (Cumulative Percentage of Variance
Explained) column in Fig. 3 that the three factors extracted together
account for 80.3 percent of the total variance (the information contained
in the original ten variables). This is a good bargain: we economise
on the number of variables (the 10 original variables are reduced to
3 underlying factors) while losing only about 20 percent of the
information content.
3. This represents a reasonably good solution for our problem.
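The extraction criterion described above can be sketched in a few lines, assuming NumPy is available and that factors are extracted as principal components of the correlation matrix (the slides do not state the extraction method, so this is an assumption). The data are the 20 by 10 input matrix, packed one digit per response. Whether the sketch reproduces the three factors and 80.3 percent reported in Fig. 3 depends on how the original output was produced, so no specific result is claimed here.

```python
import numpy as np

# The 20 x 10 input matrix, one string per respondent, one digit per answer.
ROWS = """
1416565232 2324333552 2221211762 5142222323 1225444112
3233333653 2251212445 4434453233 2326565141 1422121441
1513232221 1611111122 3144433653 2222222132 2513232216
5632132554 1422121113 2311222322 3323434333 4327666236
""".split()
data = np.array([[int(ch) for ch in row] for row in ROWS], dtype=float)

# Correlation matrix between the 10 questions (columns are variables).
corr = np.corrcoef(data, rowvar=False)

# Eigenvalues of the symmetric correlation matrix, largest first.
eigenvalues = np.linalg.eigvalsh(corr)[::-1]

# Percentage of total variance per factor and the running total.
pct = 100.0 * eigenvalues / eigenvalues.sum()
cum_pct = np.cumsum(pct)

# Criterion from the text: keep factors with eigenvalue of 1 or more.
n_factors = int((eigenvalues >= 1.0).sum())

for i, (ev, p, c) in enumerate(zip(eigenvalues, pct, cum_pct), start=1):
    flag = "retained" if ev >= 1.0 else ""
    print(f"Factor {i:2d}  eigenvalue {ev:6.3f}  pct {p:5.1f}  cum. pct {c:5.1f}  {flag}")
print("Factors retained:", n_factors)
```

Because the eigenvalues of a correlation matrix sum to the number of variables, an eigenvalue of 1 marks a factor that explains as much variance as one original question, which is the rationale for the cut-off.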
Statistics Associated with Factor Analysis
Steps in Factor Analysis: Factor Rotation
Output – Factor Matrix (Unrotated)
Fig. 2: Factor Matrix (Unrotated)
Cluster Analysis
3. If clusters are formed of customers similar to one another, then cluster analysis
can help marketers identify segments (clusters).
4. If clusters of brands are formed, this can be used to gain insights into brands
that are perceived as similar to each other on a set of attributes.
5. This chapter explains the use of cluster analysis for customer segmentation.
6. Cluster analysis is best performed when the variables are interval- or ratio-scaled.
Ideal Clustering vs Practical Clustering
Cluster Analysis vs Factor Analysis
Cluster Analysis vs Discriminant Analysis
Application of Cluster Analysis
How Does Cluster Analysis Work?
Measuring Similarity
Similarity Measure
Distance Measures
How do we form clusters?
Dendrogram
Deriving Clusters
Hierarchical Cluster Analysis
Cluster Analysis
4. In stage 2, again the two closest objects form another cluster. Now,
we have two clusters, and 6 unclustered objects. This means a total of
eight clusters, two with two objects each, and six with one object each.
2. This can be done if we have a hypothesis that the objects will group
into a certain number of clusters. Alternatively, we can first do a
hierarchical clustering on the data, find the approximate number of
clusters, and then perform a k-means clustering.
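The stage-by-stage merging in hierarchical clustering can be illustrated with a small sketch. The ten 2-D points below are hypothetical (they are not the data behind the slides), and single linkage is assumed as the merging rule purely for illustration. Each stage merges the two closest clusters, so ten objects leave nine clusters after stage 1, eight after stage 2, and so on.

```python
from math import dist

# Hypothetical 2-D coordinates for ten objects (not from the original data).
POINTS = [(1.0, 1.0), (1.2, 1.1), (5.0, 5.0), (5.1, 5.3), (9.0, 1.0),
          (8.5, 1.5), (3.0, 8.0), (3.3, 7.6), (6.0, 9.0), (0.5, 4.0)]


def single_linkage(points, a, b):
    """Distance between clusters a and b: closest pair of members."""
    return min(dist(points[i], points[j]) for i in a for j in b)


def agglomerate(points):
    """Merge the two closest clusters at every stage until one remains.

    Returns a list of (stage, merge distance, clusters remaining).
    """
    clusters = [[i] for i in range(len(points))]
    history = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        d, p, q = min(
            (single_linkage(points, clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        merged = clusters[p] + clusters[q]
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)]
        clusters.append(merged)
        history.append((len(history) + 1, round(d, 3), len(clusters)))
    return history


HISTORY = agglomerate(POINTS)
for stage, distance, n_clusters in HISTORY:
    print(f"stage {stage}: merged at distance {distance}, {n_clusters} clusters remain")
```

The merge distances printed here play the same role as the coefficients in an agglomeration schedule; a sharp jump between stages is one hint for the number of clusters to pass on to k-means.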
Problem: A major Indian FMCG company wants to map the profile of its target market in terms
of lifestyle, attitudes and perceptions. The company's managers prepare, with the help of their
marketing research team, a set of 15 statements, which they feel measure many of the
variables of interest. These 15 statements are given below. The respondent had to agree or
disagree (1 = Strongly Agree, 2 = Agree, 3 = Neither Agree nor Disagree, 4 = Disagree, 5 =
Strongly Disagree) with each statement.
1. I prefer to use e-mail rather than write a letter.
2. I feel that quality products are always priced high.
3. I think twice before I buy anything.
4. Television is a major source of entertainment.
5. A car is a necessity rather than a luxury.
6. I prefer fast food and ready to use products.
7. People are more health conscious today.
8. Entry of foreign companies has increased the efficiency of Indian companies.
9. Women are active participants in purchase decisions.
10. I believe politicians can play a positive role.
11. I enjoy watching movies.
12. If I get a chance, I would like to settle abroad.
13. I always buy branded products.
14. I frequently go out on weekends.
15. I prefer to pay by credit card rather than in cash.
Worked Example
Agglomeration Schedule
3. We will look at this figure from the last row upwards, because we would like to have the lowest
possible number of clusters, for reasons of economy and ease of interpretation. We see that there
is a difference of (58.15 – 51.79) in the coefficients between the 1 cluster solution (stage 19) and
the 2 cluster solution (stage 18). This is a difference of 6.36. The next difference is (51.79 –
47.00), which is equal to 4.79 (between stage 18, the 2 cluster solution, and stage 17, the 3 cluster
solution). The next one after that is (47.00 – 46.34), only 0.66, between stage 17 and stage 16. After
this, there is again a large difference between the 4 cluster and 5 cluster solutions, of (46.34 –
41.66) or 4.68. Thereafter, the differences between subsequent rows of coefficients are smaller.
4. A large difference in the coefficient values between any two rows indicates a solution pertaining
to the number of clusters which the lower row represents. Ignoring the first difference of 6.36,
which would indicate only 1 cluster in the data, we look at the next largest differences. 4.79 is the
difference between row 2 from the bottom and row 3 from the bottom, indicating a 2 cluster
solution. But the difference between stage 16 and stage 15 is almost the same, indicating a 4 cluster
solution. At this point, the researcher's judgement should decide whether to go for
a 2 cluster or a 4 cluster solution. Just for illustration, we will choose the 4 cluster solution.
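The arithmetic on the agglomeration schedule can be sketched as follows, using only the five coefficients quoted in the text. The count of 20 objects is inferred from stage 19 being the 1 cluster solution, and the function name `coefficient_jumps` is an illustrative assumption.

```python
# Coefficients quoted in the text for the last five stages of the schedule.
COEFFICIENTS = {19: 58.15, 18: 51.79, 17: 47.00, 16: 46.34, 15: 41.66}

# Inferred: stage 19 yields one cluster, so there are 20 objects.
N_OBJECTS = 20


def coefficient_jumps(coefficients, n_objects):
    """Map each cluster count to the jump in coefficient at its stage.

    The jump between stage s and stage s - 1 is attributed to the solution
    of stage s (the lower row), which leaves n_objects - s clusters.
    """
    jumps = {}
    for stage in sorted(coefficients, reverse=True):
        if stage - 1 in coefficients:
            n_clusters = n_objects - stage
            jumps[n_clusters] = round(coefficients[stage] - coefficients[stage - 1], 2)
    return jumps


jumps = coefficient_jumps(COEFFICIENTS, N_OBJECTS)

# Ignore the 1 cluster solution and rank the rest by size of jump.
candidates = sorted((k for k in jumps if k > 1), key=lambda k: jumps[k], reverse=True)

for n_clusters, jump in jumps.items():
    print(f"{n_clusters} cluster solution: jump of {jump:.2f}")
print("Candidate solutions, largest jump first:", candidates)
```

The two leading candidates differ by only 0.11 in their jumps, which is why the choice between them is left to the researcher's judgement.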
Interpretation of the Output
1. The final cluster centres (above) describe the mean value of each variable for each of the 4
clusters. For example, cluster 1 is described by the mean values of variable 1 = 2.2, variable 2 =
2.2, variable 3 = 3.8, variable 4 = 3.2 and so on. Similarly, cluster 3 is described by variable 1 =
1.75, variable 2 = 2.0, variable 3 = 2.25, and variable 4 = 3.0, and so on.
2. We now go back to the original variables (in this case the 15 statements in our questionnaire),
and interpret the clusters in terms of the 15 variables. For example, cluster 3 consists of people
who prefer e-mail to writing conventional letters (variable 1 value = 1.75, which is
equivalent to “agree” on the scale of 1 to 5). Similarly, they are also people who tend to think
twice before buying anything (variable 3 value = 2.25); in other words, careful spenders. They also
agree (variable 2 value = 2.00) that quality products are always priced high; that is, they
associate a product's quality with its price.
3. On these same variables, cluster 2 shows people who prefer conventional mail to e-mail
(variable 1 value = 3.5, or close to “disagree”), who do not necessarily associate high price
with good quality (variable 2 value = 3.67), and who tend to be neutral about care in spending
(variable 3 value = 2.67). In this way, when we compare final cluster centre values on each of the
15 variables, for 1 cluster at a time, a complete picture of the clusters emerges.
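Reading cluster centres against the 5-point scale can be sketched as a simple lookup. The rule used here (round each mean to the nearest scale point, halves rounding up) is an assumption made for illustration; the text itself reads the values by judgement. Only the centre values quoted above for clusters 2 and 3 are used.

```python
# The agreement scale used in the questionnaire.
SCALE = {
    1: "Strongly Agree",
    2: "Agree",
    3: "Neither Agree nor Disagree",
    4: "Disagree",
    5: "Strongly Disagree",
}

# Final cluster centres quoted in the text (variable number -> mean value).
CENTRES = {
    "cluster 3": {1: 1.75, 2: 2.00, 3: 2.25},
    "cluster 2": {1: 3.50, 2: 3.67, 3: 2.67},
}


def nearest_label(mean_score):
    """Return the scale label closest to a cluster mean (halves round up)."""
    point = int(mean_score + 0.5)
    point = max(1, min(5, point))  # keep within the 1..5 scale
    return SCALE[point]


for cluster, centre in CENTRES.items():
    for variable, value in centre.items():
        print(f"{cluster}, variable {variable}: {value:.2f} -> {nearest_label(value)}")
```

Under this rule cluster 3 comes out as agreeing with all three statements, while cluster 2 disagrees with the first two and is neutral on the third, in line with the interpretation given above.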
Cluster 1
E-mail users, feel quality comes at a price, not careful spenders, do not like
television much, do not think a car is a necessity, do not like fast food and
ready-to-use products, are not sure whether people are more health-conscious today, think
foreign companies have somewhat increased the efficiency of Indian companies,
disagree that women are active purchasing decision makers, feel that politicians
can play an active role, do not enjoy watching movies, might consider settling
abroad, tend to buy branded products, do not go out much on weekends and like
to pay cash, rather than charging to their credit cards (if they have one).
It is thus a cluster exhibiting many traditional values, except that they have
adapted to e-mail use. They are also beginning to loosen their purse strings, and
are probably in transition on some other factors like acceptance of women as
decision makers and the advent of credit cards.
Cluster 2
Cluster 3
This group is not a free spending one, but health conscious, more
patriarchal, more loyal to branded products, but outgoing
compared to other groups, even willing to go abroad to settle.
Cluster 4
Not too particular about e-mail, measure quality by price, free spending, enjoy
watching TV, think a car is necessary, not fond of fast food, think people are
health conscious, do not think foreign companies have made us efficient,
believe in woman power, somewhat positive about politicians, not movie
watchers, do not want to settle abroad, indifferent to branding, moderately
outgoing and moderately in favour of credit cards rather than cash.
This group is optimistic, free spending and a good target for TV advertising,
particularly consumer durables and entertainment. But they are not
necessarily influenced by brands. They may want value for money, but if they
see value, they may spend a lot.
ANOVA:
Fig. 8: Analysis of Variance
Variable    Cluster MS   DF   Error MS   DF      F         Prob
VAR00001    3.0500       3    1.315      16.0    2.3183    .114
VAR00002    3.0722       3    1.083      16.0    2.8359    .071
VAR00003    2.5722       3    1.630      16.0    1.5778    .234
VAR00004    1.6333       3     .943      16.0    1.7307    .201
VAR00005    2.5056       3    1.605      16.0    1.5609    .238
VAR00006    1.7056       3    1.505      16.0    1.1331    .365
VAR00007    9.6500       3     .390      16.0   24.7040    .000
VAR00008    8.5500       3     .681      16.0   12.5505    .000
VAR00009    1.3000       3    1.865      16.0     .6968    .567
VAR00010    5.5056       3     .730      16.0    7.5397    .002
VAR00011    2.7389       3    1.020      16.0    2.6830    .082
VAR00012    4.0833       3    1.293      16.0    3.1562    .054
VAR00013    7.2556       3     .799      16.0    9.0813    .001
VAR00014    1.6222       3    1.880      16.0     .8628    .480
VAR00015    2.8500       3    1.465      16.0    1.9446    .163
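One way to read the ANOVA table is to list the variables whose cluster means differ most, as in the sketch below. The F and Prob values are taken from Fig. 8; the 0.05 cut-off is an assumed convention, not something the slides specify, and the result is only a descriptive guide because the clusters were formed to maximise exactly these differences.

```python
# (F, Prob) per variable, transcribed from Fig. 8.
ANOVA = {
    1: (2.3183, 0.114), 2: (2.8359, 0.071), 3: (1.5778, 0.234),
    4: (1.7307, 0.201), 5: (1.5609, 0.238), 6: (1.1331, 0.365),
    7: (24.7040, 0.000), 8: (12.5505, 0.000), 9: (0.6968, 0.567),
    10: (7.5397, 0.002), 11: (2.6830, 0.082), 12: (3.1562, 0.054),
    13: (9.0813, 0.001), 14: (0.8628, 0.480), 15: (1.9446, 0.163),
}

ALPHA = 0.05  # assumed significance level

# Variables whose means differ across the 4 clusters at the assumed level.
significant = [var for var, (f, prob) in ANOVA.items() if prob < ALPHA]

# The same variables ordered by how strongly they separate the clusters.
ranked = sorted(significant, key=lambda var: ANOVA[var][0], reverse=True)

print("Variables with Prob <", ALPHA, ":", significant)
print("Ranked by F:", ranked)
```

The flagged variables correspond to statements 7, 8, 10 and 13 in the questionnaire, so these are the statements on which the four clusters differ most clearly.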
Interpretation of the Output
Objects
We have looked at an example of classifying people, with interval-scaled
data. It is also possible to classify objects such as brands, products, cities, etc.
with cluster analysis. For example, we can ask which brands cluster together in
terms of consumer perceptions for a positioning exercise, or which cities
cluster together in terms of the income, education and age profile of their
residents.
Number of Clusters
One of the main decisions of a researcher is to decide how many clusters
are present in the data. In certain cases, for example if we have a prior
hypothesis about how many clusters ought to be present, this decision may
already be made. Otherwise, it tends to be a subjective decision. One of
the criteria that can be used in addition to the ones we have described in the
chapter is that every cluster must have a reasonable or minimum number of
objects. That is, if a cluster comes out with only one or two objects in
it, look for another solution.
It may be useful to experiment with two or three possible solutions before
deciding on the number of clusters.
Additional Comments on Cluster Analysis
Variables
Once the reader is aware of the basics of cluster analysis, they can begin to use it
creatively. For example, a cluster analysis can be done on some of the measured
variables, and then other variables can be checked to see if they also exhibit
differences across clusters. In the worked-out example discussed earlier, only
psychographic or behavioural variables were used to get the 4 clusters. We could
then check whether the members of each cluster came from different places, had
different education levels, or whether one gender figured predominantly in any one of the clusters.
Scale
Cluster analysis is ideally suited to interval-scaled variables, because Euclidean
distance is a commonly used distance measure in the clustering process. But
nominal and ordinal level data can be used after standardisation if appropriate. This
may also necessitate the use of other measures of distance, more appropriate to
the scales of the variables being used. But this should be done with care. In general, it is
a good idea to standardise the variables before clustering, if the units of
measurement are radically different.
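The effect of standardising before computing Euclidean distance can be seen in a small sketch. The four respondents and their income and age values are hypothetical, chosen only because the two units differ radically; z-scores with the sample standard deviation are assumed as the standardisation method.

```python
from math import sqrt

# Hypothetical respondents: (monthly income in rupees, age in years).
RAW = [(25000.0, 23.0), (27000.0, 58.0), (90000.0, 25.0), (88000.0, 61.0)]


def standardise(rows):
    """Convert every column to z-scores (mean 0, sample std. deviation 1)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [sqrt(sum((v - m) ** 2 for v in c) / (len(c) - 1))
           for c, m in zip(cols, means)]
    return [tuple((v - m) / s for v, m, s in zip(row, means, sds)) for row in rows]


def euclidean(a, b):
    """Euclidean distance between two equal-length tuples."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


Z = standardise(RAW)

# Respondent 0 vs 1: similar income, very different age.
# Respondent 0 vs 2: very different income, similar age.
raw_age_gap = euclidean(RAW[0], RAW[1])
raw_income_gap = euclidean(RAW[0], RAW[2])
z_age_gap = euclidean(Z[0], Z[1])
z_income_gap = euclidean(Z[0], Z[2])

print(f"raw:          age gap {raw_age_gap:10.1f}   income gap {raw_income_gap:10.1f}")
print(f"standardised: age gap {z_age_gap:10.3f}   income gap {z_income_gap:10.3f}")
```

On the raw data the income difference swamps the age difference simply because rupees are numerically larger than years; after standardisation the two gaps are of comparable size, so both variables influence the clustering.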
Statistical Tests
As mentioned briefly earlier, some statistical tests for cluster analysis are
available. But since their validity is questionable, caution is recommended
in using either ANOVA or any other tests.
Multi Dimensional Scaling for Brand Positioning
Multi Dimensional Scaling
1. The above method may not capture the consumer’s mind accurately.
Use of MDS is to identify..
Statistics and Terms associated with MDS
Process to conduct MDS
Formulate the Problem
Input Data for MDS
Obtain Input Data
Input Data
Input Data – Direct vs Derived Approach
Input Data – Preference Data
Select an MDS Procedure
Decide on the number of dimensions
Stress vs Dimensionality
Label the dimensions
Label the dimensions - Spatial Maps and Attributes
Assess Reliability and Validity
Interpretation of the Output
1. Fig. 1 takes the example of eight brands of TV available in the Indian market. Both the rows and
columns represent brands of TV, e.g. Var. 1 is brand 1, Var. 2 is brand 2, and so on.
2. Input data were collected from a sample of respondents, each of whom was asked to rate the
dissimilarity between all pairs of TV brands on a numerical scale.
3. We will use multidimensional scaling to determine how these 8 brands are perceived by Indian
consumers, and plot a positioning map of the eight brands.
4. We will also attempt to find out how many dimensions the consumers seem to be using when they
think of TV brands.
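How pairwise ratings turn into the brand-by-brand input matrix can be sketched as below. The ratings are generated by a made-up formula purely so that the example runs, the 1 to 7 scale is assumed, and averaging across respondents is one common way to aggregate; the slides do not say which scale or aggregation was used.

```python
from itertools import combinations

N_BRANDS = 8
N_RESPONDENTS = 5  # hypothetical sample size

# Every unordered pair of brands is rated once: 8 brands give 28 pairs.
PAIRS = list(combinations(range(N_BRANDS), 2))


def hypothetical_rating(respondent, i, j):
    """Stand-in for a survey answer: a dissimilarity rating from 1 to 7."""
    return 1 + (3 * i + 5 * j + respondent) % 7


def mean_dissimilarity_matrix(n_brands, n_respondents):
    """Average the ratings per pair into a symmetric matrix with zero diagonal."""
    matrix = [[0.0] * n_brands for _ in range(n_brands)]
    for i, j in combinations(range(n_brands), 2):
        ratings = [hypothetical_rating(r, i, j) for r in range(n_respondents)]
        matrix[i][j] = matrix[j][i] = sum(ratings) / len(ratings)
    return matrix


MATRIX = mean_dissimilarity_matrix(N_BRANDS, N_RESPONDENTS)

print("Pairs rated by each respondent:", len(PAIRS))
for row in MATRIX:
    print("  ".join(f"{value:4.1f}" for value in row))
```

The resulting matrix has the same shape as the one described for Fig. 1: rows and columns are both brands, and each cell holds the perceived dissimilarity of the pair.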
1. In Figs. 2(a), 2(b), 3(a), 3(b), 4(a) and 4(b), we have the SPSS outputs of
multidimensional scaling on our data.
2. Figs. 2(a) and 2(b) contain the 3-dimensional solution. Figs. 3(a) and 3(b) contain
the 2-dimensional solution. Figs. 4(a) and 4(b) contain the 1-dimensional solution.
3. Our first task is to determine how many dimensions the data seem to indicate (that is,
in how many dimensions the best solution exists). For this, we look at the stress value for
the solutions in different dimensions. From Figs. 2(a), 3(a) and 4(a), we see the
following values of stress.
• 3-dimensional solution : 0.05230
• 2-dimensional solution : 0.24015
• 1-dimensional solution : 0.43159
4. Clearly, the 1-dimensional solution is not a good one. Remember, the stress value
indicates lack of fit, so it should be as close to zero as possible. The 2-dimensional
solution is better, but the 3-dimensional solution looks the best, as the stress value is
a low 0.05.
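To make "stress as lack of fit" concrete, the sketch below computes a stress value for a tiny hypothetical configuration and then picks the lowest of the three stress values reported above. It assumes Kruskal's stress formula 1 applied directly to the raw dissimilarities; the SPSS output behind the slides may use a different variant or normalisation, so the numbers are not meant to match it.

```python
from itertools import combinations
from math import dist, sqrt


def kruskal_stress(dissimilarities, coords):
    """Stress-1: sqrt( sum (d_ij - delta_ij)^2 / sum d_ij^2 ) over all pairs.

    d_ij is the distance in the fitted configuration and delta_ij is the
    input dissimilarity. Zero means the map reproduces the input exactly.
    """
    numerator = denominator = 0.0
    for i, j in combinations(range(len(coords)), 2):
        d = dist(coords[i], coords[j])
        numerator += (d - dissimilarities[i][j]) ** 2
        denominator += d ** 2
    return sqrt(numerator / denominator)


# Hypothetical dissimilarities between three objects.
DISS = [[0, 3, 7],
        [3, 0, 4],
        [7, 4, 0]]

perfect = [(0.0,), (3.0,), (7.0,)]    # 1-D map that fits exactly
distorted = [(0.0,), (5.0,), (7.0,)]  # 1-D map with the middle point misplaced

stress_perfect = kruskal_stress(DISS, perfect)
stress_distorted = kruskal_stress(DISS, distorted)
print(f"perfect map:   stress = {stress_perfect:.4f}")
print(f"distorted map: stress = {stress_distorted:.4f}")

# Stress values reported in the text, keyed by number of dimensions.
REPORTED = {3: 0.05230, 2: 0.24015, 1: 0.43159}
best_dim = min(REPORTED, key=REPORTED.get)
print("Dimensionality with the lowest reported stress:", best_dim)
```

Stress generally falls as dimensions are added, so in practice the lowest value is weighed against how easily the resulting map can be interpreted.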
1. Let us assume we have decided that the 3-dimensional solution is the best,
based on the low stress value.
2. Our next task would then be to name the dimensions. For doing so, our
previous knowledge of the brands may become important. For example, let us
assume that the eight brands of TV were as follows:
1. Aiwa
2. Videocon
3. LG
4. Samsung
5. Sony
6. Onida
7. Thomson
8. BPL
If these had been the eight brands, then we look at the qualities of various attributes
offered by these brands either through our judgment or knowledge of the market or
through a survey of consumers, or a combination of these methods.
Fig. 2(b)
Stimulus Coordinates for 3 dimensional solution
                        Dimension
No.   Stimulus        1          2          3
1     VAR00001     1.9512      .2028      .0664
2     VAR00002     -.1995     1.3140      .7743
3     VAR00003     -.6043    -1.3429      .4680
4     VAR00004     -.9038     -.2969    -1.8497
5     VAR00005      .8931    -1.0092     -.0350
6     VAR00006     1.1045      .1529     -.7070
7     VAR00007    -1.1031     1.6088     -.1289
8     VAR00008    -1.1381     -.6295     1.4121
For example, we could look at the above 3-dimensional solution of multidimensional scaling, and the scores
for the eight brands on the 3 dimensions, and decide on the following names for the 3 dimensions -
We could then look at the brand scores (positions) on the three dimensions and
conclude that some brands like BPL, and Videocon, currently enjoy a good brand
image, but brands like Aiwa, Onida and Thomson are leading in “Value for Money”
perceptions. Also, Videocon and Thomson may be perceived as having the best
after-sales service.
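Looking at brand positions on each dimension can be sketched by ranking the stimulus coordinates from Fig. 2(b). The mapping of VAR00001 to VAR00008 onto the brand names follows the assumed brand list given earlier, and treating a higher coordinate as a stronger position on that dimension is an assumption made for illustration. The dimensions are left unnamed here because naming them is a matter of the researcher's judgement.

```python
# Assumed brand order, matching VAR00001 .. VAR00008.
BRANDS = ["Aiwa", "Videocon", "LG", "Samsung", "Sony", "Onida", "Thomson", "BPL"]

# Stimulus coordinates for the 3-dimensional solution, from Fig. 2(b).
COORDS = [
    (1.9512, 0.2028, 0.0664),
    (-0.1995, 1.3140, 0.7743),
    (-0.6043, -1.3429, 0.4680),
    (-0.9038, -0.2969, -1.8497),
    (0.8931, -1.0092, -0.0350),
    (1.1045, 0.1529, -0.7070),
    (-1.1031, 1.6088, -0.1289),
    (-1.1381, -0.6295, 1.4121),
]


def top_brands(dimension, k=2):
    """Return the k brands with the highest coordinate on a 1-based dimension."""
    ranked = sorted(zip(BRANDS, COORDS),
                    key=lambda item: item[1][dimension - 1],
                    reverse=True)
    return [brand for brand, _ in ranked[:k]]


for dimension in (1, 2, 3):
    print(f"Dimension {dimension}: highest scores for {top_brands(dimension)}")
```

Comparing which brands sit at the top of each dimension with what is known about them in the market is how a label such as brand image or after-sales service gets attached to a dimension.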
2 Dimensional Output
(Plot of the 2-dimensional solution; axis label: DIMENSION)
Additional Comments