The Clustergram
A graph for visualizing hierarchical and non-hierarchical cluster analyses
In hierarchical cluster analysis, dendrograms are used to visualize how clusters are formed. I propose an alternative graph, the clustergram, to examine how cluster members are assigned to clusters as the number of clusters increases. This graph is useful in exploratory analysis for non-hierarchical clustering algorithms such as k-means, and for hierarchical clustering algorithms when the number of observations is large enough to make dendrograms impractical.
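To make the idea concrete, here is a minimal sketch in Python of the quantities a clustergram displays. It is not the Stata or R implementation. It assumes the cluster assignments for each number of clusters have already been computed by some algorithm, and it assumes each observation is summarized by a single value whose cluster mean is plotted on the y-axis. The function name `clustergram_data` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def clustergram_data(values, assignments):
    """Compute the ingredients of a clustergram (illustrative sketch).

    values      : one summary value per observation (assumed here to be
                  what the y-axis is based on)
    assignments : dict mapping number of clusters k -> list of cluster
                  labels, one label per observation
    Returns (means, flows):
      means[k][label]            cluster mean, the y position at x = k
      flows[(k, k_next)][(a, b)] number of observations moving from
                                 cluster a at k to cluster b at k_next,
                                 i.e. the width of the connecting line
    """
    means = {}
    for k, labels in assignments.items():
        sums = defaultdict(float)
        counts = Counter()
        for value, label in zip(values, labels):
            sums[label] += value
            counts[label] += 1
        means[k] = {label: sums[label] / counts[label] for label in counts}

    flows = {}
    ks = sorted(assignments)
    for k, k_next in zip(ks, ks[1:]):
        # Pair each observation's label at k with its label at k_next.
        flows[(k, k_next)] = Counter(zip(assignments[k], assignments[k_next]))
    return means, flows


# Toy data: six observations, clustered into 1, 2, and 3 groups.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
assignments = {
    1: [0, 0, 0, 0, 0, 0],
    2: [0, 0, 0, 1, 1, 1],
    3: [0, 0, 1, 2, 2, 2],
}
means, flows = clustergram_data(values, assignments)
```

In this toy example, the single cluster at k = 1 splits evenly into two clusters at k = 2, and one observation then leaves the lower cluster to form a third cluster at k = 3. Plotting `means` against k and drawing lines with widths proportional to `flows` would give a clustergram-style graph.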
The clustergram is currently implemented in Stata and R.
Downloads:
- Schonlau M. The clustergram: a graph for visualizing hierarchical and non-hierarchical cluster analyses. The Stata Journal, 2002; 2(4):391-402. (pdf)
  This paper introduces the clustergram and explains how to use the Stata ado files.
- Schonlau M. Visualizing Hierarchical and Non-Hierarchical Cluster Analyses with Clustergrams. Computational Statistics, 2004; 19(1):95-111. (pdf)
  This paper points out that the y-axis of the clustergram can be based on different functions and gives further examples.
- Stata implementation of the clustergram (ZIP file)
Stata implementation
The ZIP file with the Stata implementation contains the following files:
- clustergram ado file
- clustergram help file
- clustervar ado file
  This supplementary ado file makes it easier to run various cluster algorithms. The Stata Journal paper describes how to run cluster analysis without it.
- clustervar help file
  Syntax example:
  clustervar sepallen-petalwid, algorithm(singlelinkage) max(`max') distance(L2)
- Asbestos.dta (Stata data set)
  The asbestos data set used in the paper.
R implementation
Tal Galili has implemented the clustergram in R. He provides the code and discusses it on his blog:
Clustergram blog post on www.r-statistics.com