PESU Unit 4 MA

The document discusses Factor Analysis as a technique for data reduction in marketing analytics, focusing on grouping variables into factors that capture the variance of the original data. It outlines the process of conducting Factor Analysis, including extraction and rotation of factors, and provides an example involving two-wheeler owners' perceptions. Additionally, it touches on Cluster Analysis for market segmentation, emphasizing its utility in identifying customer segments and brand similarities.

Uploaded by

disha mahesh
Copyright
© All Rights Reserved
MARKETING ANALYTICS

Brijesh Singh
Department of Management Studies

Factor Analysis for Data Reduction

Introduction

1. Factor Analysis is a set of techniques used for understanding variables by grouping them into “factors” consisting of similar variables.

2. It can also be used to confirm whether a hypothesized set of variables groups into a factor or not.

3. It is most useful when a large number of variables needs to be reduced to a smaller set of “factors” that contain most of the variance of the original variables.

4. Generally, Factor Analysis is done in two stages:
• Extraction of Factors, and
• Rotation of the solution obtained in stage 1

5. Factor Analysis is best performed with interval or ratio-scaled variables.
Assumptions
Application Areas/Example

1. In marketing research, a common application area of Factor Analysis is to understand the underlying motives of consumers who buy a product category or a brand.

2. The worked-out example in the chapter will help clarify the use of Factor Analysis in marketing research.

3. In this example, we assume that a two-wheeler manufacturer is interested in determining which variables his potential customers think about when they consider his product.

4. Let us assume that twenty two-wheeler owners were surveyed by this manufacturer (or by a marketing research company on his behalf). They were asked to indicate on a seven-point scale (1 = Completely Agree, 7 = Completely Disagree) their agreement or disagreement with a set of ten statements relating to their perceptions and some attributes of the two-wheelers.

5. The objective of doing Factor Analysis is to find underlying "factors" which would be fewer than 10 in number, but would be linear combinations of some of the original 10 variables.
Example

The research design for data collection can be stated as follows:

Twenty 2-wheeler users were surveyed about their perceptions and image attributes of the vehicles they owned. Ten questions were asked to each of them, all answered on a scale of 1 to 7 (1 = completely agree, 7 = completely disagree).

1. I use a 2-wheeler because it is affordable.
2. It gives me a sense of freedom to own a 2-wheeler.
3. Low maintenance cost makes a 2-wheeler very economical in the long run.
4. A 2-wheeler is essentially a man’s vehicle.
5. I feel very powerful when I am on my 2-wheeler.
6. Some of my friends who don’t have their own vehicle are jealous of me.
7. I feel good whenever I see the ad for 2-wheeler on T.V., in a magazine or on a hoarding.
8. My vehicle gives me a comfortable ride.
9. I think 2-wheelers are a safe way to travel.
10. Three people should be legally allowed to travel on a 2-wheeler.
Example – Input Data

The input data, containing the responses of the twenty respondents to the 10 statements, are in Appendix 1 in the form of a 20-row by 10-column matrix (reproduced below).

          QUESTION NO.
S. No.     1    2    3    4    5    6    7    8    9   10
   1       1    4    1    6    5    6    5    2    3    2
   2       2    3    2    4    3    3    3    5    5    2
   3       2    2    2    1    2    1    1    7    6    2
   4       5    1    4    2    2    2    2    3    2    3
   5       1    2    2    5    4    4    4    1    1    2
   6       3    2    3    3    3    3    3    6    5    3
   7       2    2    5    1    2    1    2    4    4    5
   8       4    4    3    4    4    5    3    2    3    3
   9       2    3    2    6    5    6    5    1    4    1
  10       1    4    2    2    1    2    1    4    4    1
  11       1    5    1    3    2    3    2    2    2    1
  12       1    6    1    1    1    1    1    1    2    2
  13       3    1    4    4    4    3    3    6    5    3
  14       2    2    2    2    2    2    2    1    3    2
  15       2    5    1    3    2    3    2    2    1    6
  16       5    6    3    2    1    3    2    5    5    4
  17       1    4    2    2    1    2    1    1    1    3
  18       2    3    1    1    2    2    2    3    2    2
  19       3    3    2    3    4    3    4    3    3    3
  20       4    3    2    7    6    6    6    2    3    6
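
As a minimal sketch of the first computational step, the input matrix above can be turned into a correlation matrix, which is the usual starting point for factor extraction. The sketch uses plain Python; the names `DATA`, `pearson` and `correlation_matrix` are illustrative assumptions, and the slides do not say which software produced their output, so the numbers printed here are not guaranteed to match it.

```python
from math import sqrt

# Rows = 20 respondents, columns = statements 1..10 (copied from the table above).
DATA = [
    [1, 4, 1, 6, 5, 6, 5, 2, 3, 2], [2, 3, 2, 4, 3, 3, 3, 5, 5, 2],
    [2, 2, 2, 1, 2, 1, 1, 7, 6, 2], [5, 1, 4, 2, 2, 2, 2, 3, 2, 3],
    [1, 2, 2, 5, 4, 4, 4, 1, 1, 2], [3, 2, 3, 3, 3, 3, 3, 6, 5, 3],
    [2, 2, 5, 1, 2, 1, 2, 4, 4, 5], [4, 4, 3, 4, 4, 5, 3, 2, 3, 3],
    [2, 3, 2, 6, 5, 6, 5, 1, 4, 1], [1, 4, 2, 2, 1, 2, 1, 4, 4, 1],
    [1, 5, 1, 3, 2, 3, 2, 2, 2, 1], [1, 6, 1, 1, 1, 1, 1, 1, 2, 2],
    [3, 1, 4, 4, 4, 3, 3, 6, 5, 3], [2, 2, 2, 2, 2, 2, 2, 1, 3, 2],
    [2, 5, 1, 3, 2, 3, 2, 2, 1, 6], [5, 6, 3, 2, 1, 3, 2, 5, 5, 4],
    [1, 4, 2, 2, 1, 2, 1, 1, 1, 3], [2, 3, 1, 1, 2, 2, 2, 3, 2, 2],
    [3, 3, 2, 3, 4, 3, 4, 3, 3, 3], [4, 3, 2, 7, 6, 6, 6, 2, 3, 6],
]


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(rows):
    """Correlation between every pair of columns (statements)."""
    columns = list(zip(*rows))
    return [[pearson(a, b) for b in columns] for a in columns]


R = correlation_matrix(DATA)
# Statements 4 and 5 (index 3 and 4) are expected to move together.
print(f"r(statement 4, statement 5) = {R[3][4]:.3f}")
```

Statements that correlate strongly with one another are the ones that tend to end up loading on the same factor.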
Steps in Factor Analysis
Steps in Factor Analysis: The Correlation Matrix
Steps in Factor Analysis: Factor Extraction
Interpretation of the Output

1. The first step in interpreting the output is to look at the factors extracted, their eigenvalues and the cumulative percentage of variance (fig 3, reproduced below).

Fig. 3: Final Statistics

Variable    Communality   *   Factor   Eigenvalue   Pct of Var   Cum Pct
VAR00001    .72243        *   1        3.88282      38.8         38.8
VAR00002    .45214        *   2        2.77701      27.8         66.6
VAR00003    .73056        *   3        1.37475      13.7         80.3
VAR00004    .94488        *
VAR00005    .95038        *
VAR00006    .91376        *
VAR00007    .95474        *
VAR00008    .79869        *
VAR00009    .77745        *
VAR00010    .78946        *
Interpretation of the Output

1. We note that three factors have been extracted, based on our criterion that only factors with eigenvalues of 1 or more should be extracted. We see from the Cum Pct (Cumulative Percentage of Variance Explained) column in Fig. 3 that the three factors extracted together account for 80.3 percent of the total variance (the information contained in the original ten variables). This is a good bargain: we economise on the number of variables (from 10 variables down to 3 underlying factors) while losing only about 20 percent of the information content (80 percent is retained by the 3 factors extracted out of the 10 original variables).
2. This represents a reasonably good solution for our problem.
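
The percentages in Fig. 3 can be reproduced from the eigenvalues alone. The sketch below assumes the factors were extracted from the correlation matrix of the 10 variables, so that each standardised variable contributes a variance of 1 and the total variance is 10; the variable names are illustrative.

```python
# Eigenvalues of the three retained factors, copied from Fig. 3.
eigenvalues = [3.88282, 2.77701, 1.37475]
n_variables = 10  # assumption: total variance = number of standardised variables

pct_of_variance = [100 * ev / n_variables for ev in eigenvalues]

cumulative_pct = []
running_total = 0.0
for pct in pct_of_variance:
    running_total += pct
    cumulative_pct.append(running_total)

for factor, (pct, cum) in enumerate(zip(pct_of_variance, cumulative_pct), start=1):
    print(f"Factor {factor}: {pct:.1f}% of variance, {cum:.1f}% cumulative")
```

The last cumulative figure is the 80.3 percent quoted above; the remaining 19.7 percent is the information given up by keeping 3 factors instead of 10 variables.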
Statistics Associated with Factor Analysis
Steps in Factor Analysis: Factor Rotation
Output – Factor Matrix (Unrotated)
Fig. 2: Factor Matrix (Unrotated)

            Factor 1    Factor 2    Factor 3
VAR00001     .17581      .66967      .49301
VAR00002    -.13577     -.60774      .25369
VAR00003    -.10651      .81955      .21827
VAR00004     .96647     -.03627     -.09745
VAR00005     .95098      .16594     -.13593
VAR00006     .95184     -.08442     -.02522
VAR00007     .97128      .09591     -.04636
VAR00008    -.32171      .77498     -.03757
VAR00009    -.06890      .73502     -.48213
VAR00010     .16143      .31862     -.81356
Interpretation of the Output

1. Now we try to interpret what these 3 extracted factors represent. We can accomplish this by looking at figs 4 and 2, the rotated and unrotated factor matrices.

Fig. 4: Rotated Factor Matrix

            Factor 1    Factor 2    Factor 3
VAR00001     .13402      .34749      .76402
VAR00002    -.18143     -.64300     -.07596
VAR00003    -.10944      .62985      .56742
VAR00004     .96986     -.06383     -.01338
VAR00005     .96455      .13362      .04660
VAR00006     .94544     -.13868      .02600
VAR00007     .97214      .02862      .09411
VAR00008    -.26169      .85203      .06517
VAR00009     .00891      .87772     -.08347
VAR00010     .07209     -.10990      .87874
Steps in Factor Analysis: Making Final Decisions
Interpretation of the Output

1. Now we will attempt to interpret factor 2. We look in fig 4, down the column for Factor 2, and find that variables 8 and 9 have high loadings of 0.85203 and 0.87772, respectively. This indicates that factor 2 is a combination of these two variables.

2. But if we look at fig. 2, the unrotated factor matrix, a slightly different picture emerges. Here, variable 3 also has a high loading on factor 2, along with variables 8 and 9. It is left to the researcher which interpretation he wants to use, as there are no hard and fast rules. Assuming we decide to use all three variables, the related statements are “low maintenance”, “comfort” and “safety” (from statements 3, 8 and 9). We may combine these variables into a factor called “utility” or “functional features” or any other similar word or phrase which captures the essence of these three statements / variables.
Interpretation of the Output

3. For interpreting Factor 3, we look at the column labelled factor 3 in fig. 4 and find that variables 1 and 10 load high on factor 3. According to the unrotated factor matrix of fig. 2, only variable 10 loads high on factor 3. Supposing we stick to fig. 4, the combination of “affordability” and “cost saving by 3 people legally riding on a 2-wheeler” gives the impression that factor 3 could be “economy” or “low cost”.

4. We have now completed interpretation of the 3 factors with eigenvalues of 1 or more. We will now look at some additional issues which may be of importance in using factor analysis.
Additional Issues in Interpreting Solutions

1. We must guard against the possibility that a variable may load highly on more than one factor. Strictly speaking, a variable should load close to 1.00 on one and only one factor, and load close to 0 on the other factors. If this is not the case, it indicates either that the sample of respondents has more than one opinion about the variable, or that the question/variable may be unclear in its phrasing.

2. The other issue important in the practical use of factor analysis is the answer to the question “what should be considered a high loading and what is not a high loading?” Here, unfortunately, there is no clear-cut guideline, and many a time we must look at relative values in the factor matrix. Sometimes 0.7 may be treated as a high value, while sometimes 0.9 could be the cutoff for high values.
Additional Issues (Contd.)
1. The proportion of variance in any one of the original variables which is
captured by the extracted factors is known as Communality. For example, fig. 3
tells us that after 3 factors were extracted and retained, the communality is
0.72243 for variable 1, 0.45214 for variable 2 and so on (from the column
labelled communality in fig. 3). This means that 0.72243 or 72.24 percent of the
variance (information content) of variable 1 is being captured by our 3 extracted
factors together. Variable 2 exhibits a low communality value of 0.45214. This
implies that only 45.214 percent of the variance in variable 2 is captured by our
extracted factors. This may also partially explain why variable 2 is not appearing
in our final interpretation of the factors (in the earlier section). It is possible that
variable 2 is an independent variable which is not combining well with any other
variable, and therefore should be further investigated separately. “Freedom”
could be a different concept in the minds of our target audience.
2. As a final comment, it is again the author’s recommendation that we use the
rotated factor matrix (rather than unrotated factor matrix) for interpreting
factors, particularly when we use the principal components method for
extraction of factors in stage 1.
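
Both issues raised above, communality and the choice of a cutoff for a "high" loading, can be checked directly against the rotated factor matrix of fig. 4. The sketch below is illustrative only: the function names and the two cutoffs (0.7 and 0.5) are assumptions chosen for demonstration, not rules from the source. It relies on the standard result that a variable's communality equals the sum of its squared loadings on the retained factors.

```python
# Rotated loadings (fig. 4): variable -> (Factor 1, Factor 2, Factor 3).
ROTATED = {
    "VAR00001": (0.13402, 0.34749, 0.76402),
    "VAR00002": (-0.18143, -0.64300, -0.07596),
    "VAR00003": (-0.10944, 0.62985, 0.56742),
    "VAR00004": (0.96986, -0.06383, -0.01338),
    "VAR00005": (0.96455, 0.13362, 0.04660),
    "VAR00006": (0.94544, -0.13868, 0.02600),
    "VAR00007": (0.97214, 0.02862, 0.09411),
    "VAR00008": (-0.26169, 0.85203, 0.06517),
    "VAR00009": (0.00891, 0.87772, -0.08347),
    "VAR00010": (0.07209, -0.10990, 0.87874),
}

# Communalities as reported in fig. 3, in the same variable order.
REPORTED = dict(zip(ROTATED, [0.72243, 0.45214, 0.73056, 0.94488, 0.95038,
                              0.91376, 0.95474, 0.79869, 0.77745, 0.78946]))


def communality(loadings):
    """Share of a variable's variance captured by the retained factors."""
    return sum(value ** 2 for value in loadings)


def high_loading_factors(loadings, cutoff):
    """Factor numbers (1-based) whose absolute loading reaches the cutoff."""
    return [f for f, value in enumerate(loadings, start=1) if abs(value) >= cutoff]


for name, loadings in ROTATED.items():
    print(name,
          f"communality={communality(loadings):.5f}",
          f"reported={REPORTED[name]:.5f}",
          f"high(0.7)={high_loading_factors(loadings, 0.7)}",
          f"moderate(0.5)={high_loading_factors(loadings, 0.5)}")
```

With these illustrative cutoffs, variable 2 reaches 0.7 on no factor, which fits its low communality, and variable 3 reaches 0.5 on two factors, which is the kind of cross-loading item 1 of the previous slide warns about.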
Cluster Analysis for Market Segmentation

Cluster Analysis

1. A cluster, by definition, is a group of similar objects

2. There could be clusters of people, brands or other objects

3. If clusters are formed of customers similar to one another, then cluster analysis
can help marketers identify segments (clusters)

4. If clusters of brands are formed, this can be used to gain insights into brands
that are perceived as similar to each other on a set of attributes

5. This chapter explains the use of cluster analysis for customer segmentation

6. Cluster analysis is best performed when the variables are interval or ratio-
scaled

Ideal Clustering vs Practical Clustering

Cluster Analysis vs Factor Analysis

Cluster Analysis vs Discriminant Analysis

Application of Cluster Analysis

How Does Cluster Analysis Work?

Measuring Similarity

Similarity Measure

Distance Measures

How do we form clusters?

Dendrogram

Deriving Clusters

Hierarchical Cluster Analysis

Cluster Analysis

1. There are two major classes of cluster analysis techniques: hierarchical and non-hierarchical (K-Means).

2. In hierarchical clustering, some measure of distance is used to identify distances between all pairs of objects to be clustered. One of the popular distance measures used is Euclidean Distance. Another is the Squared Euclidean Distance.

3. We begin with all objects in separate clusters. Say we have ten objects in separate clusters. The two closest objects are joined to form a cluster. The remaining 8 objects remain separate. This is stage 1 of hierarchical clustering.

4. In stage 2, again the two closest objects form another cluster. Now we have two clusters and 6 unclustered objects. This means a total of eight clusters, two with two objects each, and six with one object each.

5. This process continues, with points joining existing clusters (because they are closest to an existing cluster) and clusters joining other clusters, based on the shortest distance criterion.

6. In this way, a range of possible solutions is formed, from a 10-cluster solution in the beginning to a single cluster solution at the end.

7. We have to decide how many clusters the data seems to have, using either the agglomeration schedule or the dendrogram to help make the decision. Both of these are computer outputs that describe, in numbers or visually, the sequence of cluster formation. This decision is somewhat subjective, but there are some guidelines one can follow, as illustrated in the worked example.
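
The merging process described in items 3 to 6 can be sketched in a few lines of plain Python. This is a toy illustration, not the procedure of any particular statistics package: it assumes squared Euclidean distance and average linkage (mean distance over all cross-cluster pairs), and the five two-dimensional points are hypothetical.

```python
from itertools import combinations


def sq_euclidean(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def average_linkage(c1, c2, points):
    """Mean squared distance over all pairs with one member in each cluster."""
    total = sum(sq_euclidean(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(points):
    """Merge the two closest clusters repeatedly; return the merge schedule."""
    clusters = [[i] for i in range(len(points))]  # every object starts alone
    schedule = []
    while len(clusters) > 1:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(clusters[pair[0]], clusters[pair[1]], points),
        )
        coefficient = average_linkage(clusters[a], clusters[b], points)
        schedule.append((clusters[a], clusters[b], coefficient))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return schedule


# Hypothetical objects measured on two variables.
toy_points = [(1, 1), (1, 2), (5, 5), (6, 5), (9, 1)]
schedule = agglomerate(toy_points)
for stage, (left, right, coefficient) in enumerate(schedule, start=1):
    print(f"stage {stage}: join {left} and {right} at coefficient {coefficient:.2f}")
```

The list returned by `agglomerate` plays the role of an agglomeration schedule: a sharp rise in the coefficient from one stage to the next suggests that two dissimilar clusters were forced together at that stage.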
Cluster Analysis

1. In non-hierarchical clustering methods (also known as k-means clustering methods), we need to specify the number of clusters we want the objects to be clustered into.

2. This can be done if we have a hypothesis that the objects will group into a certain number of clusters. Alternatively, we can first do a hierarchical clustering on the data, find the approximate number of clusters, and then perform a k-means clustering.

3. In short, hierarchical clustering helps us to identify the number of clusters, and the k-means clustering method characterizes the clusters.

4. In our illustration, we have used both hierarchical and non-hierarchical methods in combination with one another.

5. Let us move on to our worked example.
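
To make the k-means idea concrete, here is a minimal sketch in plain Python. It assumes the standard assign-then-update iteration with squared Euclidean distance; the points and the three starting centres are hypothetical, and real packages choose starting centres and stopping rules in their own ways.

```python
def sq_euclidean(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, centres, max_iter=100):
    """Assign each point to its nearest centre, recompute centres, repeat."""
    labels = []
    for _ in range(max_iter):
        labels = [
            min(range(len(centres)), key=lambda k: sq_euclidean(p, centres[k]))
            for p in points
        ]
        new_centres = []
        for k in range(len(centres)):
            members = [p for p, label in zip(points, labels) if label == k]
            # Keep the old centre if a cluster ends up empty.
            new_centres.append(mean_point(members) if members else centres[k])
        if new_centres == centres:  # no centre moved: converged
            break
        centres = new_centres
    return labels, centres


# Hypothetical objects and starting centres (k = 3 is specified in advance).
toy_points = [(1, 1), (1, 2), (5, 5), (6, 5), (9, 1)]
labels, final_centres = k_means(toy_points, [(1, 1), (5, 5), (9, 1)])
print("cluster membership:", labels)
print("final cluster centres:", final_centres)
```

The two outputs correspond to the case listing of cluster membership and the final cluster centres that a k-means run reports.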


Worked Out Example

Problem: A major Indian FMCG company wants to map the profile of its target market in terms of lifestyle, attitudes and perceptions. The company's managers prepare, with the help of their marketing research team, a set of 15 statements, which they feel measure many of the variables of interest. These 15 statements are given below. The respondent had to agree or disagree (1 = Strongly Agree, 2 = Agree, 3 = Neither Agree nor Disagree, 4 = Disagree, 5 = Strongly Disagree) with each statement.

1. I prefer to use e-mail rather than write a letter.
2. I feel that quality products are always priced high.
3. I think twice before I buy anything.
4. Television is a major source of entertainment.
5. A car is a necessity rather than a luxury.
6. I prefer fast food and ready to use products.
7. People are more health conscious today.
8. Entry of foreign companies has increased the efficiency of Indian companies.
9. Women are active participants in purchase decisions.
10. I believe politicians can play a positive role.
11. I enjoy watching movies.
12. If I get a chance, I would like to settle abroad.
13. I always buy branded products.
14. I frequently go out on weekends.
15. I prefer to pay by credit card rather than in cash.
Worked Example
For the purpose of this illustration, we will assume that 20 respondents answered the questionnaire above (in a real-life situation, the sample size would be higher). The input data matrix of 20 respondents x 15 variables is shown in fig 1.

Fig. 1: Input Data Matrix

Resp.  var1  var2  var3  var4  var5  var6  var7  var8  var9  var10  var11  var12  var13  var14  var15
  1.    1     3     5     4     3     5     3     2     3     2      4      1      1      1      5
  2.    2     3     2     3     4     4     3     2     2     2      4      2      2      2      4
  3.    3     2     3     4     3     5     3     3     4     2      4      3      4      4      3
  4.    3     2     4     2     2     4     3     4     5     4      5      4      5      5      5
  5.    2     2     4     2     2     5     3     3     4     4      5      5      3      3      4
  6.    2     4     3     3     5     4     4     2     3     4      5      4      3      3      3
  7.    1     1     2     4     4     1     2     4     2     5      4      3      3      3      1
  8.    4     5     1     4     5     4     5     1     1     5      3      3      5      5      2
  9.    2     1     5     3     4     4     2     1     2     1      2      2      4      4      3
 10.    5     2     4     3     2     5     1     5     3     2      5      1      2      2      1
 11.    4     3     3     2     1     2     1     5     2     2      4      5      1      1      2
 12.    3     4     4     4     3     2     5     1     5     3      2      4      4      4      3
 13.    4     3     2     2     3     3     4     2     2     3      4      3      5      5      4
 14.    1     2     2     4     2     5     1     3     5     4      3      2      2      2      5
 15.    2     3     4     1     5     4     2     4     4     5      2      1      1      1      4
 16.    3     2     1     3     4     3     2     3     2     5      1      2      5      5      3
 17.    5     1     1     5     1     2     4     2     2     4      4      3      3      3      2
 18.    3     5     5     3     5     5     5     1     2     3      4      4      2      2      1
 19.    3     2     4     2     4     4     1     4     1     3      4      5      3      3      2
 20.    1     3     3     2     2     5     2     5     1     3      4      4      3      3      3
Interpretation of the Output

The computer output is obtained by first doing a hierarchical cluster analysis to find the number of clusters that exist in the data. These outputs are in figs 2 to 4 (Agglomeration Schedule, vertical Icicle Plot and Dendrogram using Average Linkage, respectively).

The second stage is a K-means (quick cluster) output with a pre-determined number of clusters to be specified. In this case, the output is for 4 clusters. We will look at both stage 1 and stage 2 outputs to understand the interpretation of both stages.

Agglomeration Schedule

         Clusters Combined                       Stage Cluster 1st Appears
Stage    Cluster 1   Cluster 2    Coefficient    Cluster 1   Cluster 2      Next Stage
  1          4           5         14.000000         0           0               5
  2         19          20         16.000000         0           0               7
  3          6          18         17.000000         0           0               9
  4          1           2         17.000000         0           0               8
  5          3           4         20.000000         0           1              11
  6         13          16         25.000000         0           0              13
  7         11          19         28.000000         0           2              11
  8          1          14         28.500000         0           0              10
  9          6           8         32.500000         0           0              12
 10          1          15         34.666668         0           0              14
 11          3          11         36.444443         0           7              15
 12          6          12         36.666668         0           0              19
 13          7          13         39.500000         0           6              17
 14          1           9         41.000000        10           0              16
 15          3          10         41.666668        11           0              16
 16          1           3         46.342857        14          15              18
 17          7          17         47.000000        13           0              18
 18          1           7         51.791668        16          17              19
 19          1           6         58.156250        18          12               0
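
The first row of the schedule can be checked by hand. The sketch below assumes that the coefficient column reports squared Euclidean distance (the slides do not state which measure was used for this run) and uses the responses of respondents 4 and 5 from the input data matrix; the variable names are illustrative.

```python
# Responses of respondents 4 and 5 to the 15 statements (from the input data matrix).
case_4 = [3, 2, 4, 2, 2, 4, 3, 4, 5, 4, 5, 4, 5, 5, 5]
case_5 = [2, 2, 4, 2, 2, 5, 3, 3, 4, 4, 5, 5, 3, 3, 4]


def sq_euclidean(a, b):
    """Sum of squared differences across all variables."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


stage_1_distance = sq_euclidean(case_4, case_5)
print("squared Euclidean distance between cases 4 and 5:", stage_1_distance)
```

Under that assumption the result agrees with the stage 1 coefficient of 14.000000: cases 4 and 5 are joined first because no other pair of respondents is closer.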
Interpretation of the Output

1. The agglomeration schedule can help us to identify large differences in the coefficient (4th column). Read from top to bottom (stage 1 to 19), the schedule indicates the sequence in which cases get combined with others (or one cluster combines with another), until all 20 cases are combined together in one cluster at the last stage (stage 19).

2. Therefore, stage 19 represents a 1 cluster solution, stage 18 represents a 2 cluster solution, stage 17 represents a 3 cluster solution, and so on, going up from the last row to the first row. We have to identify how many clusters are in the data. We use the difference between rows in a measure called coefficient (also known as fusion coefficient) in column 4 to identify the number of clusters in the data.
Interpretation of the Output

3. We will look at this figure from the last row upwards, because we would like to have the lowest possible number of clusters, for reasons of economy and ease of interpretation. We see that there is a difference of (58.15 – 51.79) in the coefficients between the 1 cluster solution (stage 19) and the 2 cluster solution (stage 18). This is a difference of 6.36. The next difference is (51.79 – 47.00), which is equal to 4.79 (between stage 18, the 2 cluster solution, and stage 17, the 3 cluster solution). The next one after that is (47.00 – 46.34), only 0.66, between stage 17 and stage 16. After this, there is again a large difference between the 4 cluster and 5 cluster solutions, of (46.34 – 41.66) or 4.68. Thereafter, the differences between subsequent rows of coefficients are smaller.

4. A large difference in the coefficient values between any two rows indicates a solution pertaining to the number of clusters which the lower row represents. Ignoring the first difference of 6.36, which would indicate only 1 cluster in the data, we look at the next largest differences. 4.79 is the difference between row 2 from the bottom and row 3 from the bottom, indicating a 2 cluster solution. But the difference between stage 16 and stage 15 is almost the same, indicating a 4 cluster solution. At this point, it is the researcher's judgement that decides whether to go for a 2 cluster or a 4 cluster solution. Just for illustration, we will choose the 4 cluster solution.
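
The row-to-row differences discussed in items 3 and 4 can be computed directly from the coefficient column of the agglomeration schedule. This is a small arithmetic sketch; the names `coefficients`, `n_cases` and `jumps` are illustrative.

```python
# Coefficient column of the agglomeration schedule, stages 1..19.
coefficients = [
    14.0, 16.0, 17.0, 17.0, 20.0, 25.0, 28.0, 28.5, 32.5, 34.666668,
    36.444443, 36.666668, 39.5, 41.0, 41.666668, 46.342857, 47.0,
    51.791668, 58.156250,
]
n_cases = 20

# Walk from the last stage upwards. After stage s there are n_cases - s clusters,
# so the jump into stage s is attributed to the (n_cases - s)-cluster solution.
jumps = []
for stage in range(len(coefficients), 1, -1):
    jump = coefficients[stage - 1] - coefficients[stage - 2]
    jumps.append((n_cases - stage, jump))

for clusters, jump in jumps[:5]:
    print(f"{clusters}-cluster solution (stage {n_cases - clusters}): jump of {jump:.2f}")
```

Setting aside the trivial 1 cluster case, the two largest jumps belong to the 2 cluster and 4 cluster solutions, which is why the choice between them is left to judgement.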

Interpretation of the Output

Now, in stage 2, a k-means clustering is run with a 4 cluster solution requested (as identified from the hierarchical clustering done above). In the given problem, figs 5, 6, 7 and 8 show the outputs of K-means clustering for a 4 cluster solution. These outputs give us the initial cluster centres, the case listing of cluster membership (i.e., which case belongs to which of the clusters), the final cluster centres (the solution) and an ANOVA table.

Fig. 7: Final Cluster Centers

Cluster   VAR00001   VAR00002   VAR00003   VAR00004
1         2.2000     2.2000     3.8000     3.2000
2         3.5000     3.6667     2.6667     3.5000
3         1.7500     2.0000     2.2500     3.0000
4         3.0000     2.4000     3.6000     2.2000

Cluster   VAR00005   VAR00006   VAR00007   VAR00008
1         3.2000     4.4000     2.8000     2.4000
2         3.6667     3.3333     4.5000     1.5000
3         3.7500     3.2500     1.7500     3.5000
4         2.2000     4.2000     1.6000     4.4000

Cluster   VAR00009   VAR00010   VAR00011   VAR00012
1         3.2000     2.2000     3.8000     2.4000
2         2.5000     3.6667     3.6667     3.5000
3         3.2500     4.7500     2.5000     2.0000
4         2.2000     2.8000     4.4000     4.0000

Cluster   VAR00013   VAR00014   VAR00015
1         2.4000     3.2000     4.0000
2         4.1667     3.6667     2.5000
3         1.2500     2.7500     3.2500
4         3.0000     2.4000     2.4000

Interpretation of the Output

1. The final cluster centres (above) describe the mean value of each variable for each of the 4 clusters. For example, cluster 1 is described by the mean values of variable 1 = 2.2, variable 2 = 2.2, variable 3 = 3.8, variable 4 = 3.2 and so on. Similarly, cluster 3 is described by variable 1 = 1.75, variable 2 = 2.0, variable 3 = 2.25, variable 4 = 3.0, and so on.

2. We now go back to the original variables (in this case the 15 statements in our questionnaire), and interpret the clusters in terms of the 15 variables. For example, cluster 3 consists of people who prefer e-mail to writing conventional letters (variable 1 value = 1.75, which is equivalent to “agree” on the scale of 1 to 5). Similarly, they are also people who tend to think twice before buying anything (variable 3 value = 2.25); in other words, careful spenders. They also agree (variable 2 value = 2.00) that quality products are always priced high; that is, they have a positive correlation in their minds between a product’s quality and its price.

3. On these same variables, cluster 2 shows people who prefer conventional mail to e-mail (variable 1 value = 3.5, or close to “disagree”), people who do not necessarily associate high price with good quality (variable 2 value = 3.67), and who tend to be neutral about care in spending (variable 3 value = 2.67). In this way, when we compare final cluster centre values on each of the 15 variables, for 1 cluster at a time, a complete picture of the clusters emerges.
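
One simple way to read a cluster centre, sketched below, is to map each mean to the nearest point on the 5-point response scale. Rounding to the nearest scale point is an assumption made here for illustration; the source interprets the means by judgement, and the helper name `nearest_label` is hypothetical.

```python
# The 5-point agreement scale used in the questionnaire.
SCALE = {
    1: "Strongly Agree",
    2: "Agree",
    3: "Neither Agree nor Disagree",
    4: "Disagree",
    5: "Strongly Disagree",
}


def nearest_label(mean_value):
    """Label of the scale point closest to a cluster mean (halves round up)."""
    return SCALE[int(mean_value + 0.5)]


# Final cluster centres for variables 1-3 (from Fig. 7).
centres = {
    "cluster 2": [3.5000, 3.6667, 2.6667],
    "cluster 3": [1.7500, 2.0000, 2.2500],
}

for cluster, means in centres.items():
    for variable, mean_value in enumerate(means, start=1):
        print(f"{cluster}, variable {variable}: {mean_value:.2f} -> {nearest_label(mean_value)}")
```

Read this way, cluster 3 agrees with all three statements, while cluster 2 leans towards disagreement on the first two and is neutral on the third, in line with the interpretation above.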

Interpretation of the Output

In this case, we will briefly describe each of the 4 clusters as follows:


Cluster 1

E-mail users, feel quality comes at a price, not careful spenders, do not like television much, do not think a car is a necessity, do not like fast food and ready to use products, are not sure whether people are more health-conscious today, think foreign companies have somewhat increased the efficiency of Indian companies, disagree that women are active purchasing decision makers, feel that politicians can play an active role, do not enjoy watching movies, might consider settling abroad, tend to buy branded products, do not go out much on weekends and like to pay cash rather than charging to their credit cards (if they have one).

It is thus a cluster exhibiting many traditional values, except that they have adapted to email use. They are also beginning to loosen their purse strings, and are probably in transition on some other factors, like acceptance of women as decision makers and the advent of credit cards.
Interpretation of the Output

Cluster 2

Regular mail writers, bargain hunters or aggressive buyers, not too particular about thinking before spending, not great valuers of TV, believe the car is a luxury, not too fond of fast food and convenience products, do not think people are very health conscious, feel foreign companies have done us good, think women are active purchasing decision makers, do not believe in politicians, do not like movies, do not want to settle abroad, do not stress on branded products, do not go out on weekends, but do prefer credit cards for payments.

It is a group which likes to use credit, spends more freely, believes in woman power, believes in economics rather than politics, and feels quality products can be cheap. Also, they seem to have a patriotic streak, as they do not want to settle abroad.
Interpretation of the Output

Cluster 3

E-mail users, quality measured by price, think twice before buying, indifferent to TV, car is a luxury to them, not too fond of fast food, agree that people are health conscious, do not think foreign companies have made us efficient, do not believe in woman power, detest politicians, enjoy watching movies, willing to settle abroad, always buy branded products, go out on weekends, slightly prefer cash to credit cards.

This group is not a free spending one, but is health conscious, more patriarchal, more loyal to branded products, and outgoing compared to other groups, even willing to go abroad to settle.

Interpretation of the Output

Cluster 4

Not too particular about e-mail, measure quality by price, free spending, enjoy watching TV, think a car is necessary, not fond of fast food, think people are health conscious, do not think foreign companies have made us efficient, believe in woman power, somewhat positive about politicians, not movie watchers, do not want to settle abroad, indifferent to branding, moderately outgoing and moderately in favour of credit cards rather than cash.

This group is optimistic, free spending and a good target for TV advertising, particularly for consumer durables and entertainment. But they are not necessarily influenced by brands. They may want value for money, but if they see value, they may spend a lot.

In summary, the cluster analysis of this sample of respondents tells us a lot about the possible segments which exist in the target population.
Interpretation of the Output

ANOVA:
Fig. 8: Analysis of Variance

Variable    Cluster MS   DF   Error MS   DF      F          Prob
VAR00001    3.0500       3    1.315      16.0    2.3183     .114
VAR00002    3.0722       3    1.083      16.0    2.8359     .071
VAR00003    2.5722       3    1.630      16.0    1.5778     .234
VAR00004    1.6333       3    .943       16.0    1.7307     .201
VAR00005    2.5056       3    1.605      16.0    1.5609     .238
VAR00006    1.7056       3    1.505      16.0    1.1331     .365
VAR00007    9.6500       3    .390       16.0    24.7040    .000
VAR00008    8.5500       3    .681       16.0    12.5505    .000
VAR00009    1.3000       3    1.865      16.0    .6968      .567
VAR00010    5.5056       3    .730       16.0    7.5397     .002
VAR00011    2.7389       3    1.020      16.0    2.6830     .082
VAR00012    4.0833       3    1.293      16.0    3.1562     .054
VAR00013    7.2556       3    .799       16.0    9.0813     .001
VAR00014    1.6222       3    1.880      16.0    .8628      .480
VAR00015    2.8500       3    1.465      16.0    1.9446     .163
Interpretation of the Output

The ANOVA table (fig. 8) tells us which of the 15 variables are significantly different across the 4 clusters. The last column indicates that variables 2, 7, 8, 10, 11, 12 and 13 are significant at the 0.10 level (equivalent to a 90% confidence level), as they have prob. values less than 0.10. The other variables are not statistically significant, as they all have prob. values greater than 0.10. But there is divided opinion about the utility of statistical testing for cluster analysis. Most established writers seem to feel that these tests (ANOVA or other tests) are not valid. Therefore, it is left to the researcher’s judgement whether he would like to use these in determining which variables are significant. If the tests are used, then the interpretation of clusters and of differences across clusters should be based only on those variables which are (statistically) significantly different across clusters at 0.10 or 0.05 or some other level.
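
The screening step described above amounts to filtering the Prob column of fig. 8 at a chosen significance level. A minimal sketch follows, with the probabilities copied from the table and the function name `significant_variables` being illustrative.

```python
# Prob column of fig. 8, keyed by variable number.
prob = {
    1: 0.114, 2: 0.071, 3: 0.234, 4: 0.201, 5: 0.238,
    6: 0.365, 7: 0.000, 8: 0.000, 9: 0.567, 10: 0.002,
    11: 0.082, 12: 0.054, 13: 0.001, 14: 0.480, 15: 0.163,
}


def significant_variables(probabilities, alpha):
    """Variables whose probability value is below the chosen level."""
    return [var for var, p in sorted(probabilities.items()) if p < alpha]


at_10_percent = significant_variables(prob, 0.10)
at_5_percent = significant_variables(prob, 0.05)
print("significant at 0.10:", at_10_percent)
print("significant at 0.05:", at_5_percent)
```

Tightening the level from 0.10 to 0.05 drops variables 2, 11 and 12 from the list, which shows how much the set of "significant" variables depends on the level chosen.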
Additional Comments on Cluster Analysis

Objects
We have looked at an example of classifying people, with interval-scaled
data. It is possible to classify objects such as brands, products, cities, etc.
with cluster analysis. For example, which brands are clustered together in
terms of consumer perceptions for a positioning exercise, or which cities are
clustered together in terms of income, education and age profile of its
residents.

Number of Clusters
One of the main decisions for a researcher is how many clusters
are present in the data. In certain cases, for example when we have a prior
hypothesis about how many clusters ought to be present, this decision may
already be made. Otherwise, it tends to be a subjective decision. One
criterion that can be used in addition to the ones we have described in the
chapter is that every cluster must have a reasonable minimum number of
objects. This means that if a cluster comes out with only one or two objects in
it, we should look for another solution.
It may be useful to experiment with two or three possible solutions before
deciding on the number of clusters.
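The minimum-size criterion can be sketched as a simple check over candidate solutions. The cluster assignments below are hypothetical, and the threshold MIN_SIZE = 3 is an assumption chosen only for illustration; the text does not prescribe a specific number.

```python
from collections import Counter

def smallest_cluster(labels):
    """Return the size of the smallest cluster in a list of cluster labels."""
    return min(Counter(labels).values())

# Hypothetical cluster labels for the same 12 respondents under three solutions
candidate_solutions = {
    2: [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1],
    3: [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2],
    4: [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 3],  # cluster 3 has a single member
}

MIN_SIZE = 3  # assumed minimum number of objects per cluster

acceptable = [k for k, labels in candidate_solutions.items()
              if smallest_cluster(labels) >= MIN_SIZE]
print(acceptable)
```

Here the 4-cluster solution is dropped because one of its clusters contains a single respondent, leaving the 2- and 3-cluster solutions to be compared on other criteria.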

Variables
Once the basics of cluster analysis are understood, it can be used
creatively. For example, a cluster analysis can be done on some of the measured
variables, and then other variables can be checked to see if they also exhibit
differences across clusters. In the worked-out example discussed earlier, only
psychographic or behavioural variables were used to get the 4 clusters. We could
then see if the members of each cluster belonged to different places, had different education levels, or
whether one gender figured predominantly in any one of the clusters.

Scale
Cluster analysis is ideally suited to interval-scaled variables, because Euclidean
distance is a commonly used distance measure in the clustering process. But
nominal and ordinal level data can be used after standardisation if appropriate. This
may also necessitate the use of other measures of distance, more appropriate to
the scales of the variables being used. But this should be done with care. In general, it is
a good idea to standardise the variables before clustering if the units of
measurement are radically different.
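The effect of standardisation on Euclidean distance can be seen in a small sketch. The three respondents below are hypothetical, with a monthly income figure and a rating on a 7-point scale; z-scores (subtract the mean, divide by the standard deviation) are used here as one common way to standardise, which is an assumption since the text does not name a specific method.

```python
import math
from statistics import mean, pstdev

def standardise(rows):
    """Convert each column to z-scores: (value - column mean) / column std. dev."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]

# Hypothetical respondents: [monthly income, rating on a 7-point scale]
respondents = [[20000, 2], [21000, 7], [60000, 2]]
z = standardise(respondents)

# Raw distances are dominated by income because of its much larger units
raw_d01 = math.dist(respondents[0], respondents[1])
raw_d02 = math.dist(respondents[0], respondents[2])

# After standardisation both variables contribute on a comparable scale
z_d01 = math.dist(z[0], z[1])
z_d02 = math.dist(z[0], z[2])
print(raw_d01, raw_d02, z_d01, z_d02)
```

On the raw data, respondent 0 looks far closer to respondent 1 than to respondent 2, even though respondents 0 and 1 sit at opposite ends of the rating scale. After standardisation the two distances are of similar size.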

Statistical Tests

As mentioned briefly earlier, some statistical tests for cluster analysis are
available. But since their validity is questionable, caution is recommended
in using either ANOVA or any other tests.

A general caution about cluster analysis itself is that it tends to produce
different results with different methods, and some methods are quite
vulnerable to errors in data. So, the stability of the clusters can be
checked by splitting the sample and repeating the cluster analysis.

Multi Dimension Scaling for Brand Positioning

Brijesh Singh
Department of Management Studies
Multi Dimensional Scaling

1. The most common and useful marketing application
of multidimensional scaling is in brand positioning.

2. Positioning is essentially concerned with mapping a
consumer’s mind and placing all the competing brands
of a product category in appropriate slots or “positions”
on it.

3. For example, a product category of shampoos could
be identified as having 5 attributes important to the
consumer: price, lather, fragrance, consistency and
favorable effects on hair.

4. If these were to be rated on a 7-point scale for, say, six
leading brands of shampoo A, B, C, D, E and F, then we
could pick any two attributes and plot the six brands on
a map according to the consumer ratings.

5. This is called a perceptual map of consumer perception
about competing brands in a product category. This is the
type of map useful for deliberate positioning of a new
brand, based on "gaps" in the current map, or for finding
out the current position of an existing brand on the map.
If the desired position of an existing brand owned by our
company is different from the one perceived by
consumers, an option is to "reposition" the brand.
Multi Dimensional Scaling

1. The above method may not capture the consumer’s mind accurately.

2. If we assume that the consumer simultaneously thinks of several
product dimensions or attributes rather than one attribute at a time, the
above method is only an approximation of that process.

3. Multidimensional scaling, on the other hand, captures the complex
interactions between attributes and brands in a particular way, and then
“derives” attributes or dimensions which explain the “positions” given
by consumers to various brands.

Use of MDS is to identify..

Statistics and Terms associated with MDS

Process to conduct MDS

Formulate the Problem

Input Data for MDS

Obtain Input Data

Input Data

Input Data – Direct vs Derived Approach

Input Data – Preference Data

Select an MDS Procedure

Decide on the number of dimensions

Stress vs Dimensionality

Label the dimensions

Label the dimensions - Spatial Maps and Attributes

Assess Reliability and Validity

Multi Dimensional Scaling

1. One important approach in multidimensional scaling
is the similarity/dissimilarity-based approach.
2. It is very easy to understand intuitively, and quite useful in
gaining a good understanding of the consumer psyche.
3. In the similarity/dissimilarity-based approach, we need
some kind of distance measure between the brands being
rated. The distance measure used as input could be a simple
ranking of distances between a brand and all other brands
by a customer.
4. One way to do this is to provide a customer (respondent)
with cards, each with a pair of brands written on it,
and ask him to write down a number indicating the
difference between the two brands on any numerical scale
which can represent distance.

5. This is then repeated for all pairs of brands included in
the research. No attributes are specified by which the
customer is asked to decide on the difference.
6. This distance measure or dissimilarity measure can be
compiled into a matrix of the type shown in Fig. 1.
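The compilation step can be sketched as follows. The four brand names and the card ratings are hypothetical, and the function name build_dissimilarity_matrix is an assumption made for illustration. One card is needed per pair of brands, so n brands require n(n-1)/2 cards (28 cards for eight brands).

```python
from itertools import combinations

def build_dissimilarity_matrix(brands, pair_ratings):
    """Turn one respondent's card ratings into a symmetric matrix with a zero diagonal."""
    index = {brand: i for i, brand in enumerate(brands)}
    n = len(brands)
    matrix = [[0.0] * n for _ in range(n)]
    for a, b in combinations(brands, 2):
        rating = pair_ratings[(a, b)]  # raises KeyError if a card is missing
        i, j = index[a], index[b]
        matrix[i][j] = matrix[j][i] = float(rating)
    return matrix

# Hypothetical brands and the dissimilarity written on each card
brands = ["A", "B", "C", "D"]
pair_ratings = {
    ("A", "B"): 3, ("A", "C"): 6, ("A", "D"): 8,
    ("B", "C"): 4, ("B", "D"): 6, ("C", "D"): 3,
}

matrix = build_dissimilarity_matrix(brands, pair_ratings)
for row in matrix:
    print(row)
```

Each rating is written into both the upper and lower triangle, which is why the resulting matrix is symmetric.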

Interpretation of the Output

Fig. 1: Dissimilarity matrix for eight brands of TV

       Var1   Var2   Var3   Var4   Var5   Var6   Var7   Var8
Var1   0.00   3.00   6.00   8.00   1.00   2.00   7.00   8.00
Var2   3.00   0.00   4.00   6.00   4.00   5.00   2.00   5.00
Var3   6.00   4.00   0.00   3.00   2.00   4.00   6.00   1.00
Var4   8.00   6.00   3.00   0.00   3.00   5.00   4.00   7.00
Var5   1.00   4.00   2.00   3.00   0.00   2.00   8.00   5.00
Var6   2.00   5.00   4.00   5.00   2.00   0.00   3.00   6.00
Var7   7.00   2.00   6.00   4.00   8.00   3.00   0.00   5.00
Var8   8.00   5.00   1.00   7.00   5.00   6.00   5.00   0.00

1. Fig. 1 takes the example of eight brands of TV available in the Indian market. Both the rows and
columns represent brands of TV. For example, Var1 is brand 1, Var2 is brand 2, and so on.
2. Input data were collected from a sample of respondents, each of whom was asked to rate the
dissimilarity between all pairs of TV brands on a numerical scale.
3. We will use multidimensional scaling to determine how these 8 brands are perceived by Indian
consumers, and plot a positioning map of the eight brands.
4. We will also attempt to find out how many dimensions the consumers seem to be using when they
think of TV brands.
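For readers who want to try this outside SPSS, the sketch below applies classical (Torgerson) metric MDS to the Fig. 1 matrix using NumPy. This is an assumption about method: the slides do not say which SPSS procedure produced the outputs, so the coordinates from this sketch should not be expected to match the SPSS figures. The function name classical_mds is also an illustrative choice.

```python
import numpy as np

# Dissimilarity matrix from Fig. 1 (rows and columns are brands 1 to 8)
D = np.array([
    [0, 3, 6, 8, 1, 2, 7, 8],
    [3, 0, 4, 6, 4, 5, 2, 5],
    [6, 4, 0, 3, 2, 4, 6, 1],
    [8, 6, 3, 0, 3, 5, 4, 7],
    [1, 4, 2, 3, 0, 2, 8, 5],
    [2, 5, 4, 5, 2, 0, 3, 6],
    [7, 2, 6, 4, 8, 3, 0, 5],
    [8, 5, 1, 7, 5, 6, 5, 0],
], dtype=float)

def classical_mds(dissim, n_dims):
    """Classical metric MDS: double-centre the squared distances, then eigendecompose."""
    n = dissim.shape[0]
    centring = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centring @ (dissim ** 2) @ centring
    eigvals, eigvecs = np.linalg.eigh(b)           # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_dims]     # keep the largest n_dims
    top_vals = np.clip(eigvals[order], 0.0, None)  # treat negative eigenvalues as zero
    return eigvecs[:, order] * np.sqrt(top_vals)

coords = classical_mds(D, 2)  # one row of (dimension 1, dimension 2) per brand
print(np.round(coords, 3))
```

The rows of coords can be plotted directly as a positioning map. The orientation and sign of the axes are arbitrary, so naming the dimensions still depends on knowledge of the brands.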
Interpretation of the Output

1. In Figs. 2(a), 2(b), 3(a), 3(b), 4(a) and 4(b), we have the SPSS outputs of
multidimensional scaling on our data.

2. Figs. 2(a) and 2(b) contain the 3-dimensional solution. Figs. 3(a) and 3(b) contain
the 2-dimensional solution. Figs. 4(a) and 4(b) contain the 1-dimensional solution.

3. Our first task is to determine how many dimensions the data seem to indicate (in
which we feel the best solution exists). For this, we look at the stress value for
the solutions in different dimensions. From Figs. 2(a), 3(a) and 4(a), we see the
following values of stress.
• 3-dimensional solution : 0.05230
• 2-dimensional solution : 0.24015
• 1-dimensional solution : 0.43159

4. Clearly, the 1-dimensional solution is not a good one. Remember, the stress value
indicates lack of fit, so it should be as close to zero as possible. The 2-dimensional
solution is better, but the 3-dimensional solution looks the best, as the stress value is
a low 0.05.
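To make the idea of stress as "lack of fit" concrete, the sketch below computes Kruskal's stress formula 1 for a tiny hypothetical example of three brands. It is a simplification: the dissimilarities are compared directly with the distances in the map, whereas nonmetric procedures first replace the dissimilarities with monotonically transformed values. The data and the function name stress1 are assumptions for illustration, and the numbers will not reproduce the SPSS stress values quoted above.

```python
import math
from itertools import combinations

def stress1(coords, dissim):
    """Kruskal's stress-1 between map distances and the input dissimilarities."""
    num = den = 0.0
    for i, j in combinations(range(len(coords)), 2):
        d = math.dist(coords[i], coords[j])   # distance between brands i and j on the map
        num += (d - dissim[i][j]) ** 2        # squared misfit for this pair
        den += d ** 2
    return math.sqrt(num / den)

# Hypothetical dissimilarities among three brands
dissim = [[0, 1, 3],
          [1, 0, 2],
          [3, 2, 0]]

perfect_map = [[0.0], [1.0], [3.0]]   # reproduces every dissimilarity exactly
squashed_map = [[0.0], [1.0], [2.0]]  # brand 3 placed too close to the others

print(stress1(perfect_map, dissim))
print(stress1(squashed_map, dissim))
```

The first map has zero stress because every map distance equals the corresponding dissimilarity. The second map distorts two of the three pairs, so its stress is well above zero.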
Interpretation of the Output

1. Let us assume we have decided that the 3-dimensional solution is the best,
based on the low stress value.

2. Our next task would then be to name the dimensions. For doing so, our
previous knowledge of the brands may become important. For example, let us
assume that the eight brands of TV were as follows:

1. Aiwa
2. Videocon
3. LG
4. Samsung
5. Sony
6. Onida
7. Thomson
8. BPL

If these had been the eight brands, then we look at the qualities of various attributes
offered by these brands either through our judgment or knowledge of the market or
through a survey of consumers, or a combination of these methods.

Fig. 2(b)
Stimulus Coordinates for 3-dimensional solution

Stimulus   Name        Dimension 1   Dimension 2   Dimension 3
1          VAR00001       1.9512        0.2028        0.0664
2          VAR00002      -0.1995        1.3140        0.7743
3          VAR00003      -0.6043       -1.3429        0.4680
4          VAR00004      -0.9038       -0.2969       -1.8497
5          VAR00005       0.8931       -1.0092       -0.0350
6          VAR00006       1.1045        0.1529       -0.7070
7          VAR00007      -1.1031        1.6088       -0.1289
8          VAR00008      -1.1381       -0.6295        1.4121

For example, we could look at the above 3-dimensional solution of multidimensional scaling, and the scores
for the eight brands on the 3 dimensions, and decide on the following names for the 3 dimensions:

Dimension 1 : Value for Money
Dimension 2 : After Sales Service
Dimension 3 : Current Brand Image

We could then look at the brand scores (positions) on the three dimensions and
conclude that some brands like BPL and Videocon currently enjoy a good brand
image, but brands like Aiwa, Onida and Sony are leading in “Value for Money”
perceptions. Also, Videocon and Thomson may be perceived as having the best
after-sales service.
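Reading brand positions off the coordinate table can be sketched as a simple ranking. The coordinates are those of Fig. 2(b) and the brand names are the ones assumed in the slides. The sketch also assumes that a higher score means a more favourable position on a dimension; since the direction of an MDS axis is arbitrary, this has to be confirmed from knowledge of the brands. The function name top_brands is illustrative.

```python
brands = ["Aiwa", "Videocon", "LG", "Samsung", "Sony", "Onida", "Thomson", "BPL"]

# Stimulus coordinates from Fig. 2(b): (dimension 1, dimension 2, dimension 3)
coords = [
    (1.9512, 0.2028, 0.0664),
    (-0.1995, 1.3140, 0.7743),
    (-0.6043, -1.3429, 0.4680),
    (-0.9038, -0.2969, -1.8497),
    (0.8931, -1.0092, -0.0350),
    (1.1045, 0.1529, -0.7070),
    (-1.1031, 1.6088, -0.1289),
    (-1.1381, -0.6295, 1.4121),
]

def top_brands(dim, n=3):
    """Return the n brands with the highest score on dimension index dim (0-based)."""
    ranked = sorted(zip(brands, coords), key=lambda bc: bc[1][dim], reverse=True)
    return [brand for brand, _ in ranked[:n]]

for dim, name in enumerate(["Value for Money", "After Sales Service", "Current Brand Image"]):
    print(name, top_brands(dim))
```

Under these assumptions, the highest scores on dimension 1 belong to Aiwa, Onida and Sony, on dimension 2 to Thomson and Videocon, and on dimension 3 to BPL and Videocon.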


If we were to choose the 2-dimensional solution instead of the
3-dimensional one, it could be plotted on a graph and would be visually
easier to interpret. Just as an illustration, we will do it for this example.
The plot of the 2-dimensional solution is shown in Fig. 5, and the brands
can be seen to form distinct clusters based on their perceived similarity.

2 Dimensional Output

Fig. 5: Positioning map of the eight brands in the 2-dimensional solution

BRANDS : 1 = AIWA, 2 = VIDEOCON, 3 = LG, 4 = SAMSUNG, 5 = SONY, 6 = ONIDA, 7 = THOMSON, 8 = BPL
Interpretation of the Output

For example, brands 1 and 6 are perceived to be similar, whereas brand 5
is a standalone brand. So is brand 3, to some extent. Here again,
knowledge of the brand names and their attributes or qualities would be
used to name the two dimensions. Again, dimension 1 could be value for
money. Dimension 2 could be after-sales service. But notice that we are
losing some information on the third dimension, which we had called
brand image in the 3-dimensional solution. The loss of information may
turn out to be critical in some cases.

Additional Comments

1. MDS can be performed even with a sample size of 1.

2. It can be used to get a composite picture of a segment's
perception, by combining the responses of any one segment, and
repeating the MDS for each of the major segments.

3. It can also be done across all segments (a single MDS) by
aggregating responses for the entire sample.

4. It would be tempting to do one MDS for each respondent, but
the analysis would remain meaningless unless there are sufficient
numbers of each consumer type, which means determining the
segments after the MDS. This is a possibility, but would involve a lot
of work in the analysis stage.

5. It is best left to the judgment of the researcher which approach
to follow.
THANK YOU

Brijesh Singh
Department of Management Studies