Re: [R] kmeans and incomplete distance matrix concern

From: Ffenics <>
Date: Tue 08 Aug 2006 - 01:43:10 EST

I still don't quite understand. I thought the kmeans algorithm went something like this:

Iterate until stable:
    Determine the centroid coordinates
    Determine the distance of each object to the centroids
    Group the objects based on minimum distance

So, why do I not want a distance matrix?
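
The steps listed above can be sketched as follows. This is a minimal illustration in Python rather than R, and it is not the code behind R's kmeans(); the function name, the sample points and the choice of starting centroids are all made up for the example. The thing to notice is that the centroid step averages coordinates, so the input has to be a data matrix (one row per case), and the distances to the centroids are recomputed inside the loop rather than supplied up front.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd-style k-means on coordinate tuples (illustrative only)."""
    centroids = [tuple(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        # Distance of each object to each centroid; assign to the nearest one.
        new_labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # stable: assignments no longer change
            break
        labels = new_labels
        # Centroid step: average the *coordinates* of each group's members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a group is empty
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

# Hypothetical data: six cases, two variables.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

A precomputed matrix of distances between the objects themselves never enters this loop; only object-to-centroid distances do, and those change every time the centroids move.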

Christian Hennig <> wrote:

On Mon, 7 Aug 2006, Ffenics wrote:

> Well then, I don't understand, because everything I have read so far suggests that you use the dist() function to create a matrix based on the Euclidean distance and then the kmeans() function.

kmeans requires a data matrix where cases are rows and variables are columns. (If you understand what kmeans does, you should know why - means can't be computed from distances.)
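
A small sketch of why the centroid coordinates are not determined by a distance matrix, written in Python rather than R for illustration. The helper names and the two point sets are invented for the example. Shifting every point by the same amount leaves all pairwise distances unchanged but moves the mean, so two different data sets can share one distance matrix.

```python
import math

def distance_matrix(points):
    """Full matrix of Euclidean distances between all pairs of points."""
    return [[math.dist(p, q) for q in points] for p in points]

def mean(points):
    """Coordinate-wise mean (the centroid) of a list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

# Hypothetical data: b is a translated by a constant vector.
a = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
b = [(x + 10.0, y - 5.0) for x, y in a]

# Compare the two distance matrices entry by entry.
same_distances = all(
    math.isclose(da, db)
    for row_a, row_b in zip(distance_matrix(a), distance_matrix(b))
    for da, db in zip(row_a, row_b)
)

print(same_distances)      # the pairwise distances agree
print(mean(a), mean(b))    # the centroids do not
```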

I'm not sure about the NA behaviour. I guess NAs produce an error? (Try it out!)
Anyway, I'd think about casewise deletion or imputation if I had to run kmeans on data with missing values.
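
The two options mentioned above can be sketched like this, again in Python rather than R and purely as an illustration. NaN stands in for R's NA, the function names and the data are hypothetical, and column-mean imputation is only one simple choice of imputation, not a recommendation. (In R, na.omit() performs the casewise deletion.)

```python
import math

def casewise_delete(rows):
    """Keep only the cases (rows) that have no missing value."""
    return [r for r in rows if not any(math.isnan(v) for v in r)]

def impute_column_means(rows):
    """Replace each missing value by the mean of the observed values in its column.

    Assumes every column has at least one observed value.
    """
    means = []
    for col in zip(*rows):
        observed = [v for v in col if not math.isnan(v)]
        means.append(sum(observed) / len(observed))
    return [
        tuple(means[j] if math.isnan(v) else v for j, v in enumerate(r))
        for r in rows
    ]

# Hypothetical data matrix: four cases, two variables, two missing entries.
nan = float("nan")
data = [(1.0, 2.0), (nan, 4.0), (3.0, nan), (5.0, 6.0)]

complete = casewise_delete(data)      # drops the two incomplete cases
imputed = impute_column_means(data)   # keeps all four cases
print(complete)
print(imputed)
```

Either result is a complete data matrix that a k-means routine can work on; deletion loses cases, while imputation keeps them at the cost of inventing values.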

Received on Tue Aug 08 01:53:11 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Tue 08 Aug 2006 - 02:21:44 EST.
