Re: [R] Cluster Analysis - Number of Clusters

From: Christian Hennig
Date: Tue 07 Feb 2006 - 00:38:36 EST


As said before, some statistics for estimating the number of clusters are available from the cluster.stats function in package fpc. These are distance-based, not "pseudo F" or "pseudo T^2" statistics. They are documented in Gordon (1999), Classification (see ?cluster.stats for more references). cluster.stats also reports the average silhouette width of Kaufman and Rousseeuw (1990) (exact reference in ?plot.agnes), which is also part of the output of some functions in package cluster (pam, agnes, ...?).
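
To illustrate the silhouette approach, here is a minimal sketch rather than a recommended recipe. The simulated data, the seed, and the candidate range 2:6 are assumptions made up for this example. It runs pam for each candidate k, keeps the k with the largest average silhouette width, and, if fpc happens to be installed, reads the same statistic from cluster.stats.

```r
# Sketch: choose k by average silhouette width.
# Data, seed and the candidate range are illustrative assumptions.
library(cluster)

set.seed(1)
# Three well-separated Gaussian groups in two dimensions (made-up data)
x <- rbind(
  matrix(rnorm(100, mean = 0,  sd = 0.5), ncol = 2),
  matrix(rnorm(100, mean = 5,  sd = 0.5), ncol = 2),
  matrix(rnorm(100, mean = 10, sd = 0.5), ncol = 2)
)
d <- dist(x)  # Euclidean distances between observations

ks <- 2:6
# Average silhouette width of the pam solution for each candidate k
asw <- sapply(ks, function(k) pam(d, k)$silinfo$avg.width)
names(asw) <- ks
best_k <- ks[which.max(asw)]
print(round(asw, 3))
print(best_k)

# Optional: the same statistic (among others) from fpc::cluster.stats
if (requireNamespace("fpc", quietly = TRUE)) {
  cs <- fpc::cluster.stats(d, pam(d, best_k)$clustering)
  print(cs$avg.silwidth)
}
```

The average silhouette width is only defined for k >= 2, so this criterion cannot tell you whether the data form a single cluster.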

An alternative way to estimate the number of clusters is to use the BIC together with a (normal) mixture model; see package mclust.
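
A rough sketch of the mixture-model route, again on made-up data with an assumed candidate range of 1 to 6 components. Mclust fits normal mixtures for each number of components and covariance structure and keeps the combination with the best BIC. Note that mclust defines the BIC so that larger values are better.

```r
# Sketch: number of mixture components chosen by BIC with mclust.
# Data, seed and the range G = 1:6 are illustrative assumptions.
library(mclust)

set.seed(1)
# Three well-separated Gaussian groups in two dimensions (made-up data)
x <- rbind(
  matrix(rnorm(100, mean = 0,  sd = 0.5), ncol = 2),
  matrix(rnorm(100, mean = 5,  sd = 0.5), ncol = 2),
  matrix(rnorm(100, mean = 10, sd = 0.5), ncol = 2)
)

fit <- Mclust(x, G = 1:6)  # fits all candidate models, selects by BIC
print(fit$G)               # selected number of components
print(fit$modelName)       # selected covariance structure
print(summary(fit$BIC))    # top models ranked by BIC
```

Unlike the silhouette criterion, this approach can also select a single component, since G = 1 is among the candidates.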


On Sun, 5 Feb 2006, John Janmaat wrote:

> Hello,
> I'm playing around with cluster analysis, and am looking for methods to
> select the number of clusters. I am aware of methods based on a 'pseudo
> F' or a 'pseudo T^2'. Are there packages in R that will generate these
> statistics, and/or other statistics to aid in cluster number selection?
> Thanks,
> John.
> --
> ===========================================================================
> Dr. John Janmaat Tel: 902-585-1461
> Department of Economics Fax: 902-585-1070
> Acadia University Email:
> Wolfville, Nova Scotia, Canada. Web:
> ______________________________________________
> mailing list
> PLEASE do read the posting guide!
Received on Tue Feb 07 00:46:38 2006

This archive was generated by hypermail 2.1.8 : Tue 07 Feb 2006 - 08:36:04 EST