[R] Optimum # of Clusters using Kmeans

From: RK <ravi.jain11_at_gmail.com>
Date: Sat 07 Apr 2007 - 17:44:06 GMT


Dear R Users,

I am doing clustering and have a few questions.
(1) Is it possible to find the optimum number of clusters using kmeans,
just as one can for PAM using the average silhouette width? With PAM I do the following:

library(cluster)   # provides pam()
asw <- numeric(20)
for (k in 2:20)
  asw[k] <- pam(A, k)$silinfo$avg.width
k.best <- which.max(asw)
cat("silhouette-optimal number of clusters:", k.best, "\n")

plot(1:20, asw, type = "h", main = "pam() clustering assessment",
     xlab = "k (# clusters)", ylab = "average silhouette width")
axis(1, k.best, paste("best", k.best, sep = "\n"), col = "red", col.axis = "red")
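A minimal sketch of how the same average-silhouette search might look with kmeans(), which does not return silhouette information itself. It assumes the silhouette() function from the cluster package applied to the kmeans cluster labels and a distance matrix. The data A below is a made-up stand-in, and k.max = 10 and nstart = 25 are arbitrary choices, not recommendations.

```r
library(cluster)  # provides silhouette()

set.seed(1)
# Hypothetical stand-in for the data A: three separated 2-D groups
A <- rbind(matrix(rnorm(60, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(60, mean = 3, sd = 0.3), ncol = 2),
           matrix(rnorm(60, mean = 6, sd = 0.3), ncol = 2))

d <- dist(A)            # pairwise Euclidean distances, computed once
k.max <- 10             # assumed upper bound on the number of clusters
asw <- numeric(k.max)   # asw[1] stays 0: no silhouette for a single cluster
for (k in 2:k.max) {
  km <- kmeans(A, centers = k, nstart = 25)   # several random starts
  # silhouette() accepts a vector of cluster labels plus a dist object
  asw[k] <- mean(silhouette(km$cluster, d)[, "sil_width"])
}
k.best <- which.max(asw)
cat("silhouette-optimal number of clusters:", k.best, "\n")
```

Because kmeans() starts from random centres, the result can vary between runs unless a seed is set and nstart is reasonably large.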

(2) Another question, about pre-processing data. I have mixed data (nominal,
numeric, categorical, etc.). Before clustering, I convert all the nominal variables to binary and normalise them.
Is there any elegant way of doing this?
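Two possible routes for mixed data, sketched under assumptions: model.matrix() can expand factors into 0/1 indicator columns, and daisy() from the cluster package can compute a Gower dissimilarity on the mixed data frame directly, which pam() then accepts without manual encoding. The data frame df and its column names are invented for illustration, and k = 2 is arbitrary.

```r
library(cluster)  # provides daisy() and pam()

# Hypothetical mixed data frame; the columns are made up for illustration
df <- data.frame(
  age    = c(23, 35, 47, 52, 31, 44),
  income = c(21, 48, 75, 62, 39, 58),
  colour = factor(c("red", "blue", "green", "blue", "red", "green")),
  smoker = factor(c("yes", "no", "no", "yes", "no", "yes"))
)

# Route 1: expand every factor into 0/1 indicator columns.
# contrasts = FALSE asks for one column per level instead of dropping
# a reference level; numeric columns pass through unchanged.
is.fac <- sapply(df, is.factor)
X <- model.matrix(~ . - 1, data = df,
                  contrasts.arg = lapply(df[is.fac], contrasts,
                                         contrasts = FALSE))

# Route 2: skip the manual encoding and use a Gower dissimilarity,
# which handles numeric and factor columns together.
d   <- daisy(df, metric = "gower")
fit <- pam(d, k = 2)
```

Route 1 gives a numeric matrix usable with kmeans(); Route 2 only fits methods that accept a dissimilarity, such as pam().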

(3) Is there any function to normalise data in R?
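As a small illustration, base R's scale() standardises each column to mean 0 and standard deviation 1. Min-max rescaling to [0, 1] has no dedicated base function that I am sure of, so the sketch below builds it from sweep(). The matrix x is made-up example data.

```r
# Hypothetical numeric data; the values are invented for illustration
x <- cbind(height = c(150, 160, 170, 180),
           weight = c(50, 60, 80, 90))

# z-scores: centre each column to mean 0 and scale to standard deviation 1
z <- scale(x)

# min-max rescaling of each column to the interval [0, 1]
rng <- apply(x, 2, range)                  # row 1 = minima, row 2 = maxima
mm  <- sweep(x, 2, rng[1, ])               # subtract the column minima
mm  <- sweep(mm, 2, rng[2, ] - rng[1, ], "/")  # divide by the column ranges
```

Which of the two is appropriate depends on the clustering method and the data; a constant column would give a zero range and break the min-max version.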

Thank you

R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Sun Apr 08 03:46:50 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sat 07 Apr 2007 - 18:31:19 GMT.
