Re: [R] Cluster on both categorical and numerical data

From: paulandpen <paulandpen_at_optusnet.com.au>
Date: Thu, 19 Jun 2008 05:58:09 +1000

okay,

when you cluster data, you can supply one of two kinds of input

raw data, which the algorithm converts into a matrix and then processes

a pre-processed matrix which you create yourself to input into a package

essentially, packages will have a default assumption about the data you are using or the type of matrix you are using

these matrices are often defined in simplistic terms as either a similarity or dissimilarity matrix

think of a correlation matrix as an example of a matrix which represents similarity

i think you will need to create a dissimilarity matrix (think of a correlation matrix, which measures similarity between pairs in its off-diagonal entries, and then take the opposite of that; technically not correct, but you get the idea I hope)
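to make the "opposite of a correlation matrix" idea concrete, here is a rough sketch in plain Python (not R, and not what any particular package does internally). It builds a small Pearson correlation matrix and turns it into a dissimilarity matrix with 1 - r, which is one common convention among several. The function names and the toy data are made up for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def similarity_matrix(columns):
    """Correlation (similarity) matrix for a list of variables."""
    return [[pearson(a, b) for b in columns] for a in columns]

def to_dissimilarity(sim):
    """Assumed convention: dissimilarity = 1 - similarity.
    Identical variables get 0, perfectly opposed ones get 2."""
    return [[1.0 - s for s in row] for row in sim]

# Hypothetical toy data: three variables measured on four cases.
columns = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with the first
    [4.0, 3.0, 2.0, 1.0],   # perfectly anti-correlated with the first
]

sim = similarity_matrix(columns)
dis = to_dissimilarity(sim)
```

the diagonal of the result is zero (everything is identical to itself), which is what most clustering routines expect from a dissimilarity matrix.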

i use Clustan Graphics for all my clustering needs, and Gower's coefficient is the input i use when i have mixed variables

if you pre-process (create a dissimilarity matrix) using Gower's coefficient and then specify this matrix as the input, everything should work fine
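for anyone who wants to see what that pre-processing step involves, below is a minimal sketch of a Gower-style dissimilarity in plain Python. It assumes no missing values and equal weights for all variables: numeric variables contribute |a - b| divided by the variable's range, categorical variables contribute 0 for a match and 1 for a mismatch, and the contributions are averaged. The function name `gower_matrix`, the `is_categorical` flags and the example rows are hypothetical.

```python
def gower_matrix(rows, is_categorical):
    """Pairwise Gower-style dissimilarities for mixed data.

    rows           -- list of records, each a sequence of values
    is_categorical -- one boolean per variable (True = categorical)
    Assumes no missing values and equal variable weights.
    """
    n = len(rows)
    p = len(is_categorical)

    # Range of each numeric variable, used to scale differences to [0, 1].
    ranges = []
    for k in range(p):
        if is_categorical[k]:
            ranges.append(None)
        else:
            values = [row[k] for row in rows]
            ranges.append(max(values) - min(values))

    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            total = 0.0
            for k in range(p):
                a, b = rows[i][k], rows[j][k]
                if is_categorical[k]:
                    total += 0.0 if a == b else 1.0
                elif ranges[k] > 0:
                    total += abs(a - b) / ranges[k]
                # a constant numeric variable contributes 0
            d[i][j] = d[j][i] = total / p
    return d

# Hypothetical example: age, income, favourite colour.
rows = [
    (25, 30000.0, "red"),
    (35, 50000.0, "red"),
    (45, 70000.0, "blue"),
]
is_categorical = [False, False, True]

d = gower_matrix(rows, is_categorical)
```

the resulting matrix is what you would hand to a clustering routine that accepts a dissimilarity matrix. In R itself, the `daisy` function in the `cluster` package has a `metric = "gower"` option that is worth looking at for the same purpose.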

once you get this sorted, it should all be straightforward

PD

>
> Hello there. Is there any function in R that can do cluster on a set of
> data that has both categorical and numerical variables? thanks.
> siangli
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Received on Wed 18 Jun 2008 - 20:14:39 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 18 Jun 2008 - 21:31:34 GMT.

