[R] k-means: should columns in dataset be in same scale?

From: Johan Jackson <johan.h.jackson_at_gmail.com>
Date: Tue, 22 Apr 2008 18:26:50 -0600


Hi all,

Simple question re k-means. If I have a data set whose columns are on different scales (say col 1 has var=100 and col 2 has var=2), will this make a difference to the k-means algorithm? It seems as though it does, since the column with the larger variance dominates the distances. If so, should we first standardize the columns of the dataset so that each column is given equal weight?
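To make the question concrete, here is a minimal sketch in Python (standard library only) rather than a definitive answer. Everything in it is an assumption for illustration: the toy data (col 1 variance of roughly 90, col 2 variance of roughly 2.6, with the two real groups differing only in col 2), the helper names `sq_dist`, `lloyd`, `two_means` and `standardize`, and the choice to restart plain Lloyd iterations from every pair of points and keep the lowest within-cluster sum of squares. It is not how R's kmeans() works internally.

```python
from itertools import combinations
from statistics import mean, stdev


def sq_dist(p, q):
    """Squared Euclidean distance, the quantity k-means minimises."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lloyd(points, centers, max_iter=100):
    """Plain Lloyd iterations from the given starting centers."""
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centers)),
                          key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = tuple(mean(col) for col in zip(*members))
    wss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return wss, labels


def two_means(points):
    """Best 2-cluster fit over every pair of points as starting centers."""
    runs = (lloyd(points, list(pair)) for pair in combinations(points, 2))
    _, labels = min(runs, key=lambda run: run[0])
    # Relabel so that the first point is always in cluster 0.
    return [lab ^ labels[0] for lab in labels]


def standardize(points):
    """Center each column and divide by its sample standard deviation."""
    cols = list(zip(*points))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]
    return [tuple((x - m) / s for x, m, s in zip(p, mus, sds))
            for p in points]


# Hypothetical data: the first four rows and the last four rows form the
# two real groups, which differ only in col 2 (near 0 versus near 3).
data = [(-12.0, 0.0), (-4.0, 0.2), (4.0, -0.2), (12.0, 0.0),
        (-12.0, 3.0), (-4.0, 3.2), (4.0, 2.8), (12.0, 3.0)]

raw_labels = two_means(data)
scaled_labels = two_means(standardize(data))
print(raw_labels)     # [0, 0, 1, 1, 0, 0, 1, 1]  split on the sign of col 1
print(scaled_labels)  # [0, 0, 0, 0, 1, 1, 1, 1]  split on col 2
```

On the raw data the clusters follow col 1 simply because it has the larger variance; after standardizing, the col 2 grouping is recovered. In R the analogous step would presumably be kmeans(scale(x), centers = 2), where x is a hypothetical numeric matrix.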

JJ




R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Wed 23 Apr 2008 - 00:29:25 GMT


