Re: [R] distance in the function kmeans


From: Thomas Petzoldt (thpe@hhbio.wasser.tu-dresden.de)
Date: Fri 28 May 2004 - 22:47:10 EST


Message-id: <40B734CE.6060903@hhbio.wasser.tu-dresden.de>

n.bouget@laposte.net wrote:

> I don't exactly understand what you do, could you show me the
> program that you execute to do that?

I did something like this some time ago, so the following comes (as usual)
without warranty. There are several methods, e.g. using Cholesky
factorization, singular value decomposition or principal components. Given
"mdata" as the original data matrix, it works with hclust and should be
applicable to kmeans too:

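# with svd: centre the data, then take the left singular vectors
z <- svd(scale(mdata, scale=FALSE))$u
# note: newer R versions call this method "ward.D"
cl <- hclust(dist(z), method="ward")

# with princomp (scores rescaled to unit variance)
pc <- princomp(mdata, cor=FALSE)
pcdata <- as.data.frame(scale(pc$scores))
cl <- hclust(dist(pcdata), method="ward")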
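
The Cholesky route mentioned above is not shown in the snippets, so here is a
rough sketch of how it might look, applied to kmeans rather than hclust. The
iris measurements stand in for "mdata", and the names R, xc, z and km as well
as the choice of three centers are my own assumptions, not part of the
original post. The idea is the same as with svd: transform the data so that
plain Euclidean distance on the result accounts for the covariance structure.

```r
# Illustrative data only; iris stands in for the poster's "mdata".
set.seed(1)
mdata <- as.matrix(iris[, 1:4])

# Cholesky factor of the covariance matrix: cov(mdata) = t(R) %*% R,
# with R upper triangular.
R <- chol(cov(mdata))

# Centre the data and multiply by the inverse factor. The transformed
# columns are uncorrelated and have unit variance.
xc <- scale(mdata, scale = FALSE)
z <- xc %*% solve(R)

# kmeans works with Euclidean distance, here on the transformed data.
# centers = 3 is an arbitrary choice for the illustration.
km <- kmeans(z, centers = 3, nstart = 10)
table(km$cluster)
```

Checking that cov(z) is (numerically) the identity matrix is a quick way to
confirm the transformation did what was intended.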

... but as I mentioned, this is only an example showing that methods
working with the Euclidean distance can be applied to other distance
measures when an appropriate transformation of the data exists. According
to Gavin, there are indeed some other possibilities.
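
To make the point about transformations concrete, the following is a small
check, under my own assumptions, that Euclidean distances on the svd scores
match Mahalanobis distances on the raw data up to the constant factor
sqrt(n - 1). The post itself does not name the target distance; Mahalanobis
is simply the distance that this kind of decorrelating transformation
corresponds to. The iris data and all variable names are illustrative.

```r
# Illustrative data only; iris stands in for the poster's "mdata".
mdata <- as.matrix(iris[, 1:4])
n <- nrow(mdata)

# Left singular vectors of the centred data, as in the svd variant.
u <- svd(scale(mdata, scale = FALSE))$u

# Euclidean distances between the rows of u, rescaled by sqrt(n - 1).
d_euc <- as.matrix(dist(u)) * sqrt(n - 1)

# Mahalanobis distances from the first observation to every observation.
# mahalanobis() returns squared distances, hence the sqrt().
d_mah <- sqrt(mahalanobis(mdata, center = mdata[1, ], cov = cov(mdata)))

# Compare the first row of the Euclidean matrix with the Mahalanobis vector.
all.equal(unname(d_euc[1, ]), unname(d_mah))
```

Since the factor sqrt(n - 1) is the same for every pair, it does not change
the result of hclust or kmeans and can be left out in practice.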

Thomas P.

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

