Re: [R] Help in kmeans

From: raji sankaran <raji.sankaran_at_gmail.com>
Date: Thu, 07 Apr 2011 04:02:06 +0530

Hi,

  Thanks for the information. But I am already using set.seed(). My problem is that when I use column names instead of column indices, the results are consistently less accurate. Hence, we wanted to understand how kmeans differentiates between column names and column indices. Is there any way to bridge the gap so that we get the same result with column names as with column indices?

Regards,
Raji
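
On the question of names versus indices: as far as base R goes, kmeans only sees the data it is handed, so a data frame subset by name should be indistinguishable from the same subset by index. A minimal sketch, assuming the built-in iris data stands in for dataFrame (its columns are called Sepal.Length and so on, not SepalLength as in the original commands) and using an arbitrary seed of 42:

```r
# Assumption: the built-in iris data stands in for the poster's dataFrame;
# its columns are named Sepal.Length etc., not SepalLength.
dataFrame <- iris
cols <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# Selecting by index or by name gives the same four-column data frame.
by_index <- dataFrame[, c(1, 2, 3, 4)]
by_name  <- dataFrame[, cols]
same_input <- identical(by_index, by_name)

# Resetting the same seed before each call makes kmeans draw the same
# random starting centers, so the two fits can be compared directly.
set.seed(42)
fit_index <- kmeans(by_index, centers = 3)
set.seed(42)
fit_name <- kmeans(by_name, centers = 3)
same_clusters <- identical(fit_index$cluster, fit_name$cluster)
```

Because cols is an ordinary character vector, a dynamically generated command can build it from whatever names it receives and pass dataFrame[, cols] to kmeans.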

On Wed, Apr 6, 2011 at 5:30 PM, Christian Hennig <chrish_at_stats.ucl.ac.uk> wrote:

> I'm not going to comment on column names, but this is just to make you
> aware that the results of k-means depend on random initialisation.
>
> This means that it is possible that you get different results if you run it
> several times. It basically gives you a local optimum and there may be more
> than one of these.
> Use set.seed to see whether this explains your problem.
>
> Best regards,
> Christian
>
>
> On Wed, 6 Apr 2011, Raji wrote:
>
>> Hi All,
>>
>> I was using the following command to run kmeans on the Iris dataset.
>>
>> Kmeans_model<-kmeans(dataFrame[,c(1,2,3,4)],centers=3)
>>
>> This was giving proper results for me. But in my application we generate
>> the R commands dynamically, and there was a requirement that column names
>> be sent to the R commands instead of column indices. Hence, to
>> incorporate this, I tried using the R commands in the following way.
>>
>>
>> kmeans_model<-kmeans((SepalLength+SepalWidth+PetalLength+PetalWidth),centers=3)
>>
>> or
>>
>>
>> kmeans_model<-kmeans(as.matrix(SepalLength,SepalWidth,PetalLength,PetalWidth),centers=3)
>>
>> In both cases, we found that the results differ from what we saw
>> with the first command (with column indices).
>>
>> Can you please let us know what is going wrong here? Also, can you please
>> let us know how column names can be used in kmeans to obtain the correct
>> results?
>>
>> Many thanks,
>> Raji
>>
>> --
>> View this message in context:
>> http://r.789695.n4.nabble.com/Help-in-kmeans-tp3430433p3430433.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> ______________________________________________
>> R-help_at_r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
> *** --- ***
> Christian Hennig
> University College London, Department of Statistical Science
> Gower St., London WC1E 6BT, phone +44 207 679 1698
> chrish_at_stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche
>
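
A possible explanation for the difference reported in the quoted message, offered as a sketch rather than a confirmed diagnosis: neither of the two name-based commands hands four columns to kmeans. The + operator adds the columns element-wise into a single vector, and as.matrix() converts only its first argument. The snippet below assumes the built-in iris columns stand in for the attached SepalLength, SepalWidth, PetalLength and PetalWidth variables; cbind() is shown as one way to keep all four columns.

```r
# Assumption: built-in iris columns stand in for the poster's attached variables.
SepalLength <- iris$Sepal.Length
SepalWidth  <- iris$Sepal.Width
PetalLength <- iris$Petal.Length
PetalWidth  <- iris$Petal.Width

# "+" adds the columns element-wise, leaving one numeric vector of row sums.
summed <- SepalLength + SepalWidth + PetalLength + PetalWidth
dim_summed <- dim(summed)          # NULL: a plain vector, not a 4-column matrix

# as.matrix() converts its first argument only; the others fall into "...".
first_only <- as.matrix(SepalLength, SepalWidth, PetalLength, PetalWidth)
dim_first_only <- dim(first_only)  # a single column

# cbind() binds the vectors side by side and keeps all four columns.
all_four <- cbind(SepalLength, SepalWidth, PetalLength, PetalWidth)
dim_all_four <- dim(all_four)
```

If that is what happened, both name-based calls would have clustered one-dimensional data, which could account for results that differ from the four-column call regardless of the seed.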



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Wed 06 Apr 2011 - 22:34:34 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 06 Apr 2011 - 23:00:28 GMT.