[R] Silhouette width using K-means

From: RK <ravi.jain11_at_gmail.com>
Date: Wed 04 Apr 2007 - 09:04:19 GMT

I am doing clustering and have three questions:

(1) How can I find the silhouette width using K-means, just as with PAM (code below)?
(2) I have mixed data with all sorts of variables (numeric, categorical, nominal), so I first convert all nominal variables to binary and normalise the data before any clustering. Is there a more elegant way of doing this?
(3) How do I normalise and convert to binary (filters) in R?

Thank you in Advance.



library(cluster)  # provides pam()

## Use the silhouette widths for assessing the best number of clusters,
## following a one-dimensional example from Christian Hennig :
x <- c(rnorm(50), rnorm(50,mean=5), rnorm(30,mean=15))
asw <- numeric(20)
## Note that "k=1" won't work!
for (k in 2:20)
  asw[k] <- pam(x, k) $ silinfo $ avg.width
k.best <- which.max(asw)
cat("silhouette-optimal number of clusters:", k.best, "\n")

plot(1:20, asw, type= "h", main = "pam() clustering assessment",
     xlab= "k (# clusters)", ylab = "average silhouette width")
axis(1, k.best, paste("best",k.best,sep="\n"), col = "red", col.axis = "red")
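
For K-means, one possible approach (a sketch, not part of the code above) is to pass the cluster assignments returned by kmeans() together with a distance matrix to silhouette() from the cluster package, the same package that provides pam(). The fixed seed, nstart = 10, the Euclidean dist(), and the names asw.km and k.best.km are assumptions chosen for illustration.

```r
library(cluster)  # provides silhouette(); kmeans() is in the stats package

set.seed(1)       # assumption: fixed seed so the sketch is reproducible
x <- c(rnorm(50), rnorm(50, mean = 5), rnorm(30, mean = 15))
d <- dist(x)      # assumption: Euclidean distances between observations

asw.km <- numeric(20)
## As with pam(), k = 1 has no silhouette, so start at 2
for (k in 2:20) {
  km  <- kmeans(x, centers = k, nstart = 10)  # nstart = 10 is an arbitrary choice
  sil <- silhouette(km$cluster, d)            # one row per observation
  asw.km[k] <- mean(sil[, "sil_width"])       # average silhouette width for this k
}
k.best.km <- which.max(asw.km)
cat("silhouette-optimal number of clusters (kmeans):", k.best.km, "\n")
```

The vector asw.km can be plotted in the same way as asw in the pam() example.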

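
On questions (2) and (3), a minimal sketch under stated assumptions: scale() standardises numeric columns, model.matrix() expands a factor into 0/1 indicator columns, and daisy() from the cluster package can compute Gower dissimilarities on mixed data directly, which may avoid the manual recoding. The data frame df and its column names are hypothetical and exist only for illustration.

```r
library(cluster)  # daisy(), pam()

## Hypothetical mixed data frame, for illustration only
df <- data.frame(
  income = c(20, 35, 50, 65, 80, 95),
  age    = c(23, 31, 40, 48, 55, 62),
  region = factor(c("north", "south", "south", "east", "north", "east"))
)

## (3) Normalise the numeric columns to mean 0 and standard deviation 1
num <- scale(df[, c("income", "age")])

## (3) Expand the factor into 0/1 indicator columns; "- 1" drops the intercept
bin <- model.matrix(~ region - 1, data = df)

## Numeric matrix that could be passed to kmeans()
X <- cbind(num, bin)

## (2) Alternative: Gower dissimilarity treats numeric and factor columns together
d.gower <- daisy(df, metric = "gower")
fit <- pam(d.gower, k = 2)
fit$silinfo$avg.width  # average silhouette width of the 2-cluster solution
```

Whether Gower dissimilarity with pam() or a recoded matrix with kmeans() is preferable depends on the data; the sketch only shows that both routes are available.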

R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Wed Apr 04 19:07:36 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Wed 04 Apr 2007 - 09:31:03 GMT.
