
Chapter XX
Cluster Analysis

Chapter Outline
1) Overview
2) Basic Concept
3) Statistics Associated with Cluster Analysis
4) Conducting Cluster Analysis
i. Formulating the Problem
ii. Selecting a Distance or Similarity Measure
iii. Selecting a Clustering Procedure
iv. Deciding on the Number of Clusters
v. Interpreting and Profiling the Clusters
vi. Assessing Reliability and Validity
5) Applications of Nonhierarchical Clustering
6) Clustering Variables
7) Internet & Computer Applications
8) Focus on Burke
9) Summary
10) Key Terms and Concepts
11) Acronyms

Figure 20.1
An Ideal Clustering Situation
(Scatterplot; axes: Variable 1 and Variable 2.)

Figure 20.2
A Practical Clustering Situation
(Scatterplot; axes: Variable 1 and Variable 2.)


Fig. 20.3

Conducting Cluster Analysis
Formulate the Problem
Select a Distance Measure
Select a Clustering Procedure
Decide on the Number of Clusters
Interpret and Profile Clusters
Assess the Validity of Clustering

Table 20.1
Attitudinal Data For Clustering

Case No.   V1   V2   V3   V4   V5   V6
 1          6    4    7    3    2    3
 2          2    3    1    4    5    4
 3          7    2    6    4    1    3
 4          4    6    4    5    3    6
 5          1    3    2    2    6    4
 6          6    4    6    3    3    4
 7          5    3    6    3    3    4
 8          7    3    7    4    1    4
 9          2    4    3    3    6    3
10          3    5    3    6    4    6
11          1    3    2    3    5    3
12          5    4    5    4    2    4
13          2    2    1    5    4    4
14          4    6    4    6    4    7
15          6    5    4    2    1    4
16          3    5    4    6    4    7
17          4    4    7    2    2    5
18          3    7    2    6    4    3
19          4    6    3    7    2    7
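The ratings in Table 20.1 can be fed directly into a distance measure. The following is a minimal sketch, assuming squared Euclidean distance is the chosen measure (this slide does not name one); the function name `squared_euclidean` is illustrative only.

```python
# Ratings on V1..V6 for four cases, copied from Table 20.1.
cases = {
    1: (6, 4, 7, 3, 2, 3),
    2: (2, 3, 1, 4, 5, 4),
    14: (4, 6, 4, 6, 4, 7),
    16: (3, 5, 4, 6, 4, 7),
}


def squared_euclidean(a, b):
    """Sum of squared differences across all variables."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Cases 14 and 16 differ by one point on V1 and V2 only, so they are close.
d_similar = squared_euclidean(cases[14], cases[16])
# Cases 1 and 2 differ on every variable, so they are far apart.
d_dissimilar = squared_euclidean(cases[1], cases[2])

print("d2(14, 16) =", d_similar)
print("d2(1, 2)   =", d_dissimilar)
```

Small distances mark cases that are candidates for the same cluster; large distances mark cases that should end up in different clusters.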

Fig. 20.4
A Classification of Clustering Procedures

Clustering Procedures
    Hierarchical
        Agglomerative
            Linkage Methods
                Single
                Complete
                Average
            Variance Methods
                Ward's Method
            Centroid Methods
        Divisive
    Nonhierarchical
        Sequential Threshold
        Parallel Threshold
        Optimizing Partitioning

Figure 20.5
Linkage Methods of Clustering
Single Linkage: minimum distance between Cluster 1 and Cluster 2
Complete Linkage: maximum distance between Cluster 1 and Cluster 2
Average Linkage: average distance between Cluster 1 and Cluster 2
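The three linkage rules differ only in how they summarize the pairwise distances between the members of two clusters. The following is a minimal sketch, assuming Euclidean distance and using four cases from Table 20.1 as two small illustrative clusters; the variable names are hypothetical.

```python
from itertools import product
from math import dist  # Euclidean distance, available in Python 3.8+

# Two small clusters built from Table 20.1 (V1..V6).
cluster_1 = [(4, 6, 4, 6, 4, 7), (3, 5, 4, 6, 4, 7)]  # cases 14 and 16
cluster_2 = [(2, 3, 1, 4, 5, 4), (2, 2, 1, 5, 4, 4)]  # cases 2 and 13

# Every distance between a member of cluster 1 and a member of cluster 2.
pairwise = [dist(a, b) for a, b in product(cluster_1, cluster_2)]

single_linkage = min(pairwise)                    # closest pair
complete_linkage = max(pairwise)                  # farthest pair
average_linkage = sum(pairwise) / len(pairwise)   # mean over all pairs

print(f"single:   {single_linkage:.4f}")
print(f"complete: {complete_linkage:.4f}")
print(f"average:  {average_linkage:.4f}")
```

The average-linkage value always lies between the single-linkage and complete-linkage values, which is one reason it is less sensitive to a single unusually close or distant pair.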

Fig. 20.6
Other Agglomerative Clustering Methods
(Panels: Ward's Procedure; Centroid Method.)
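Ward's procedure and the centroid method can both be expressed in terms of cluster centroids. The following is a minimal sketch under that assumption: Ward's criterion is taken to be the increase in the within-cluster error sum of squares caused by a merge, and the centroid method is taken to be the Euclidean distance between cluster centroids. The helper names are hypothetical, and the data are four cases from Table 20.1.

```python
def centroid(cluster):
    """Mean of each variable over the cases in the cluster."""
    n = len(cluster)
    return tuple(sum(column) / n for column in zip(*cluster))


def error_sum_of_squares(cluster):
    """Squared deviations of every case from its cluster centroid, summed."""
    center = centroid(cluster)
    return sum((x - m) ** 2 for case in cluster for x, m in zip(case, center))


def ward_increase(a, b):
    """Growth in total within-cluster sum of squares if a and b are merged."""
    merged = error_sum_of_squares(a + b)
    return merged - error_sum_of_squares(a) - error_sum_of_squares(b)


def centroid_distance(a, b):
    """Euclidean distance between the two cluster centroids."""
    return sum((x - y) ** 2 for x, y in zip(centroid(a), centroid(b))) ** 0.5


case_14 = (4, 6, 4, 6, 4, 7)
case_16 = (3, 5, 4, 6, 4, 7)
case_2 = (2, 3, 1, 4, 5, 4)
case_13 = (2, 2, 1, 5, 4, 4)

first_merge = ward_increase([case_14], [case_16])
second_merge = ward_increase([case_2], [case_13])
between_pairs = centroid_distance([case_14, case_16], [case_2, case_13])

print("Ward increase, cases 14 + 16:", first_merge)
print("Ward increase, cases 2 + 13: ", second_merge)
print("Running total after two merges:", first_merge + second_merge)
print("Centroid distance between the two pairs:", round(between_pairs, 4))
```

For two single cases the Ward increase is half their squared Euclidean distance. The running totals from this sketch are consistent with the first two coefficients of the agglomeration schedule in Table 20.2, which suggests the coefficient there is a cumulative within-cluster sum of squares.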

Fig. 20.7
Vertical Icicle Plot Using Ward's Method
(Vertical axis: Number of Clusters, 1 to 19. Horizontal axis: Case Label and Number, in the order 18, 19, 16, 14, 10, 4, 20, 9, 11, 5, 13, 2, 8, 3, 15, 17, 12, 7, 6, 1.)

Table 20.2
Results of Hierarchical Clustering

Agglomeration Schedule Using Ward's Procedure

         Clusters combined                    Stage cluster first appears
Stage    Cluster 1   Cluster 2   Coefficient    Cluster 1   Cluster 2   Next stage
 1       14          16            1.000000      0           0            7
 2        2          13            2.500000      0           0           15
 3        7          12            4.000000      0           0           10
 4        5          11            5.500000      0           0           11
 5        3           8            7.000000      0           0           16
 6        1           6            8.500000      0           0           10
 7       10          14           10.166667      0           1            9
 8        9          20           12.666667      0           0           11
 9        4          10           15.250000      0           7           12
10        1           7           18.250000      6           3           13
11        5           9           22.750000      4           8           15
12        4          19           27.500000      9           0           17
13        1          17           32.700001     10           0           14
14        1          15           40.500000     13           0           16
15        2           5           51.000000      2          11           18
16        1           3           63.125000     14           5           19
17        4          18           78.291664     12           0           18
18        2           4          171.291656     15          17           19
19        1           2          330.450012     16          18            0
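One common way to read an agglomeration schedule when deciding on the number of clusters is to look for the stage at which the coefficient starts to jump sharply. The following is a minimal sketch of that heuristic using the coefficients from Table 20.2; the variable names are hypothetical, and the rule is an aid to judgment rather than a definitive test.

```python
# Coefficients for stages 1..19, copied from the agglomeration schedule.
coefficients = [
    1.000000, 2.500000, 4.000000, 5.500000, 7.000000, 8.500000,
    10.166667, 12.666667, 15.250000, 18.250000, 22.750000, 27.500000,
    32.700001, 40.500000, 51.000000, 63.125000, 78.291664,
    171.291656, 330.450012,
]
n_cases = len(coefficients) + 1  # 19 merges join 20 cases

# Key: number of clusters left after a stage.
# Value: how much the coefficient rose at that stage.
jump_when_reducing_to = {
    n_cases - stage: coefficients[stage - 1] - coefficients[stage - 2]
    for stage in range(2, len(coefficients) + 1)
}

# Solutions ranked by how costly the merge that produced them was.
costliest = sorted(
    jump_when_reducing_to, key=jump_when_reducing_to.get, reverse=True
)

for k in (4, 3, 2, 1):
    print(f"reducing to {k} cluster(s) raises the coefficient by "
          f"{jump_when_reducing_to[k]:.3f}")
```

The rise in the coefficient is modest down to three clusters and then becomes several times larger when three clusters are reduced to two, so under this heuristic a three-cluster solution looks reasonable.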

Table 20.2 Contd.

Cluster Membership of Cases Using Ward's Procedure

                      Number of Clusters
Label case        4        3        2
 1                1        1        1
 2                2        2        2
 3                1        1        1
 4                3        3        2
 5                2        2        2
 6                1        1        1
 7                1        1        1
 8                1        1        1
 9                2        2        2
10                3        3        2
11                2        2        2
12                1        1        1
13                2        2        2
14                3        3        2
15                1        1        1
16                3        3        2
17                1        1        1
18                4        3        2
19                3        3        2
20                2        2        2
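The cluster membership columns can be recovered by replaying the agglomeration schedule and stopping once the desired number of clusters remains. The following is a minimal sketch, assuming that each merged cluster keeps the label in the "Cluster 1" column and that clusters are numbered in order of their lowest case number; the function name `membership` is hypothetical.

```python
# (Cluster 1, Cluster 2) pairs for stages 1..19 of the agglomeration schedule.
merges = [
    (14, 16), (2, 13), (7, 12), (5, 11), (3, 8), (1, 6), (10, 14),
    (9, 20), (4, 10), (1, 7), (5, 9), (4, 19), (1, 17), (1, 15),
    (2, 5), (1, 3), (4, 18), (2, 4), (1, 2),
]


def membership(merges, n_cases, n_clusters):
    """Cluster number of cases 1..n_cases when n_clusters clusters remain."""
    # Every case starts as its own cluster, labelled by its case number.
    label = {case: case for case in range(1, n_cases + 1)}
    # Each merge reduces the cluster count by one.
    for keep, absorb in merges[: n_cases - n_clusters]:
        for case in label:
            if label[case] == absorb:
                label[case] = keep
    # Renumber clusters 1, 2, 3, ... in order of first appearance.
    renumber = {}
    result = []
    for case in range(1, n_cases + 1):
        renumber.setdefault(label[case], len(renumber) + 1)
        result.append(renumber[label[case]])
    return result


for k in (4, 3, 2):
    print(k, "clusters:", membership(merges, 20, k))
```

Replaying the schedule this way reproduces the four-, three-, and two-cluster membership columns of Table 20.2.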

Fig. 20.8
Dendrogram Using Ward's Method
(Horizontal axis: Rescaled Distance Cluster Combine, 0 to 25. Vertical axis: Case Label Seq, in the order 14, 16, 10, 4, 19, 18, 2, 13, 5, 11, 9, 20, 3, 8, 7, 12, 1, 6, 17, 15.)

Table 20.3
Cluster Centroids

               Means of Variables
Cluster No.    V1      V2      V3      V4      V5      V6
1              5.750   3.625   6.000   3.125   1.750   3.875
2              1.667   3.000   1.833   3.500   5.500   3.333
3              3.500   5.833   3.333   6.000   3.500   6.000
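A cluster centroid is the mean of each variable over the cases assigned to that cluster. The following is a minimal sketch for cluster 3, assuming its members are cases 4, 10, 14, 16, 18, and 19 (the three-cluster solution in Table 20.2) and taking their ratings from Table 20.1; the variable names are hypothetical.

```python
# Ratings on V1..V6 for the six cases assigned to cluster 3.
cluster_3_cases = {
    4: (4, 6, 4, 5, 3, 6),
    10: (3, 5, 3, 6, 4, 6),
    14: (4, 6, 4, 6, 4, 7),
    16: (3, 5, 4, 6, 4, 7),
    18: (3, 7, 2, 6, 4, 3),
    19: (4, 6, 3, 7, 2, 7),
}

# zip(*rows) regroups the data by variable, so each column is one of V1..V6.
centroid_3 = [
    round(sum(column) / len(column), 3)
    for column in zip(*cluster_3_cases.values())
]

for name, mean in zip(("V1", "V2", "V3", "V4", "V5", "V6"), centroid_3):
    print(f"{name}: {mean:.3f}")
```

The means agree with the cluster 3 row of Table 20.3. High means on V2, V4, and V6 and low means elsewhere are what one would use to interpret and profile this cluster.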

Table 20.4
Results of Nonhierarchical Clustering

Initial Cluster Centers
Cluster    V1       V2       V3       V4       V5       V6
1          4.0000   6.0000   3.0000   7.0000   2.0000   7.0000
2          2.0000   3.0000   2.0000   4.0000   7.0000   2.0000
3          7.0000   2.0000   6.0000   4.0000   1.0000   3.0000

Classification Cluster Centers
Cluster    V1       V2       V3       V4       V5       V6
1          3.8135   5.8992   3.2522   6.4891   2.5149   6.6957
2          1.8507   3.0234   1.8327   3.7864   6.4436   2.5056
3          6.3558   2.8356   6.1576   3.6736   1.3047   3.2010

Case Listing of Cluster Membership
Case ID   Cluster   Distance      Case ID   Cluster   Distance
 1        3         1.780          2        2         2.254
 3        3         1.174          4        1         1.882
 5        2         2.525          6        3         2.340
 7        3         1.862          8        3         1.410
 9        2         1.843         10        1         2.112
11        2         1.923         12        3         2.400
13        2         3.382         14        1         1.772
15        3         3.605         16        1         2.137
17        3         3.760         18        1         4.421
19        1         0.853         20        2         0.813
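The case listing pairs each case with its nearest cluster center and the distance to that center. The following is a minimal sketch, assuming Euclidean distance to the classification cluster centers of Table 20.4 and ratings taken from Table 20.1; the function name `nearest_center` is hypothetical.

```python
from math import dist  # Euclidean distance, available in Python 3.8+

# Classification cluster centers on V1..V6, copied from Table 20.4.
classification_centers = {
    1: (3.8135, 5.8992, 3.2522, 6.4891, 2.5149, 6.6957),
    2: (1.8507, 3.0234, 1.8327, 3.7864, 6.4436, 2.5056),
    3: (6.3558, 2.8356, 6.1576, 3.6736, 1.3047, 3.2010),
}


def nearest_center(case, centers):
    """Return (cluster number, distance) for the closest center."""
    cluster = min(centers, key=lambda k: dist(case, centers[k]))
    return cluster, dist(case, centers[cluster])


case_1 = (6, 4, 7, 3, 2, 3)
case_19 = (4, 6, 3, 7, 2, 7)

cluster_a, distance_a = nearest_center(case_1, classification_centers)
cluster_b, distance_b = nearest_center(case_19, classification_centers)

print(f"case 1  -> cluster {cluster_a}, distance {distance_a:.3f}")
print(f"case 19 -> cluster {cluster_b}, distance {distance_b:.3f}")
```

Both results agree with the corresponding rows of the case listing: case 1 falls in cluster 3 and case 19 in cluster 1, with distances that round to the listed values.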

Table 20.4 contd.

Final Cluster Centers
Cluster    V1       V2       V3       V4       V5       V6
1          3.5000   5.8333   3.3333   6.0000   3.5000   6.0000
2          1.6667   3.0000   1.8333   3.5000   5.5000   3.3333
3          5.7500   3.6250   6.0000   3.1250   1.7500   3.8750

Distances between Final Cluster Centers
Cluster    1        2        3
1          0.0000
2          5.5678   0.0000
3          5.7353   6.9944   0.0000

Analysis of Variance
Variable   Cluster MS   df   Error MS   df   F         p
V1         29.1083      2    0.6078     17   47.8879   .000
V2         13.5458      2    0.6299     17   21.5047   .000
V3         31.3917      2    0.8333     17   37.6700   .000
V4         15.7125      2    0.7279     17   21.5848   .000
V5         24.1500      2    0.7353     17   32.8440   .000
V6         12.1708      2    1.0711     17   11.3632   .001

Number of Cases in each Cluster
Cluster    Unweighted Cases   Weighted Cases
1           6                  6
2           6                  6
3           8                  8
Missing     0
Total      20                 20
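Two parts of Table 20.4 follow from simple arithmetic on the other parts: the distances between final cluster centers and the F ratios. The following is a minimal sketch, assuming Euclidean distance between centers and F computed as cluster mean square divided by error mean square; because the inputs are the rounded values shown in the table, the results match the reported figures only approximately.

```python
from itertools import combinations
from math import dist  # Euclidean distance, available in Python 3.8+

# Final cluster centers on V1..V6, copied from Table 20.4.
final_centers = {
    1: (3.5000, 5.8333, 3.3333, 6.0000, 3.5000, 6.0000),
    2: (1.6667, 3.0000, 1.8333, 3.5000, 5.5000, 3.3333),
    3: (5.7500, 3.6250, 6.0000, 3.1250, 1.7500, 3.8750),
}

# Distance between every pair of final centers.
center_distances = {
    (a, b): dist(final_centers[a], final_centers[b])
    for a, b in combinations(sorted(final_centers), 2)
}

# (Cluster MS, Error MS) per variable, from the analysis of variance.
mean_squares = {
    "V1": (29.1083, 0.6078),
    "V2": (13.5458, 0.6299),
    "V3": (31.3917, 0.8333),
    "V4": (15.7125, 0.7279),
    "V5": (24.1500, 0.7353),
    "V6": (12.1708, 1.0711),
}

# F ratio: between-cluster variation relative to within-cluster variation.
f_ratios = {
    name: cluster_ms / error_ms
    for name, (cluster_ms, error_ms) in mean_squares.items()
}

for pair, d in center_distances.items():
    print(f"centers {pair}: {d:.4f}")
for name, f in f_ratios.items():
    print(f"{name}: F = {f:.2f}")
```

Well-separated centers and large F ratios both indicate that the clusters differ clearly on the clustering variables. The F tests here are descriptive only, since the clusters were formed to maximize exactly these differences.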

RIP 20.1

Perceived Product Parity - Once
Rarity - Now Reality

How do consumers in different countries perceive brands in
different product categories? Surprisingly, the answer is that the
rate of perceived product parity is quite high. Perceived product
parity means that consumers perceive all or most of the brands in a
product category as similar to each other, or at par. A new study
by BBDO Worldwide shows that two-thirds of consumers
surveyed in 28 countries considered brands in 13 product
categories to be at parity. The product categories ranged from
airlines to credit cards to coffee.

Perceived parity averaged 63% for all
categories in all countries. The Japanese
have the highest perception of parity
across all product categories at 99% and
Colombians the lowest at 28%. Viewed by
product category, credit cards have the
highest parity perception at 76% and
cigarettes the lowest at 52%.
BBDO clustered the countries based on
product parity perceptions to arrive at
clusters that exhibited similar levels and
patterns of parity perceptions.

RIP 20.1 Contd.

The highest parity perception came from the Asia/Pacific cluster
(83%), which included Australia, Japan, Malaysia, and South Korea,
and also France. It is no surprise that France was in this cluster,
since advertising for most products there is highly emotional, visual,
and feelings oriented. The next cluster was U.S.-influenced markets
(65%), which included Argentina, Canada, Hong Kong, Kuwait, Mexico,
Singapore, and the U.S. The third cluster, primarily European
countries (60%), included Austria, Belgium, Denmark, Italy, the
Netherlands, South Africa, Spain, the U.K., and Germany.

What all this means is that, in order to
differentiate the product/brand,
advertising cannot focus on product
performance alone; it must also relate the
product to the person's life in an
important way. Also, much greater
marketing effort will be required in the
Asia/Pacific region and in France to
differentiate the brand from the
competition and establish a unique
image. A big factor in this growing parity
is, of course, the emergence of the global
market.

RIP 20.2

Clustering Marketing Professionals
Based on Ethical Evaluations

Cluster analysis can be used to explain differences in ethical
perceptions by using a large multi-item, multi-dimensional scale
developed to measure how ethical different situations are. One
such scale was developed by Reidenbach and Robin. This scale
has 29 items that compose five dimensions measuring how a
respondent judges a certain action. For example, a given
respondent will read about a marketing researcher who has
provided proprietary information of one of his clients to a second
client. The respondent is then asked to complete the 29-item ethics
scale, for example, by indicating whether this action is:
Just :___:___:___:___:___:___:___: Unjust
Traditionally acceptable :___:___:___:___:___:___:___: Unacceptable
Violates :___:___:___:___:___:___:___: Does not violate an unwritten contract

RIP 20.2 Contd.

This scale could be administered to a sample of marketing
professionals. By clustering respondents based on these 29 items,
two important questions can be investigated. First, how do the
clusters differ with respect to the five ethical dimensions, in this
case Justice, Relativist, Egoism, Utilitarianism, and Deontology (see
Chapter 24)? Second, what types of firms compose each cluster?
The clusters could be described in terms of industry classification
(SIC), firm size, and firm profitability. Answers to these two
questions should provide insight into what types of firms use what
dimensions to evaluate ethical situations. For instance, do large
firms fall into a different cluster than small firms? Do more
profitable firms perceive questionable situations as more acceptable
than less profitable firms do?