[R] clustering problem

From: Karin Lagesen <karin.lagesen_at_medisin.uio.no>
Date: Wed, 20 Feb 2008 11:10:30 +0100

First, I just want to say thanks for all the help I've had from the list so far.

I now have what I think is a clustering problem. I have many objects for which I have measured pairwise dissimilarities. The list has only one entry per pair, so it holds just one triangle of the distance matrix rather than the full symmetric matrix.

Example input:

NameA NameB Dist

189_1C2 189_1C1 0
189_1C3 189_1C1 0.017
189_1C3 189_1C2 0.017
189_1C4 189_1C1 0
189_1C4 189_1C2 0
189_1C4 189_1C3 0.017
189_1C5 189_1C1 0.05
189_1C5 189_1C2 0.05
189_1C5 189_1C3 0.067
189_1C5 189_1C4 0.05
189_1C6 189_1C1 0.05
189_1C6 189_1C2 0.05
189_1C6 189_1C3 0.067
189_1C6 189_1C4 0.05
189_1C6 189_1C5 0

The distance measure is 0 if two objects are identical, and it increases with increasing dissimilarity up to a maximum of 1.
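
A minimal sketch of how such a one-entry-per-pair list could be turned into a symmetric matrix and a `dist` object in R. The names `pairs`, `ids`, `m` and `d` are illustrative only, and the sketch assumes the list has been read into a data frame with the three columns shown above.

```r
# Pair list from the example above; `pairs` is an illustrative name.
pairs <- read.table(text = "
NameA NameB Dist
189_1C2 189_1C1 0
189_1C3 189_1C1 0.017
189_1C3 189_1C2 0.017
189_1C4 189_1C1 0
189_1C4 189_1C2 0
189_1C4 189_1C3 0.017
189_1C5 189_1C1 0.05
189_1C5 189_1C2 0.05
189_1C5 189_1C3 0.067
189_1C5 189_1C4 0.05
189_1C6 189_1C1 0.05
189_1C6 189_1C2 0.05
189_1C6 189_1C3 0.067
189_1C6 189_1C4 0.05
189_1C6 189_1C5 0
", header = TRUE, stringsAsFactors = FALSE)

# All object names, in a fixed order
ids <- sort(unique(c(pairs$NameA, pairs$NameB)))

# Square matrix, NA everywhere so that a missing pair stays visible,
# with zero distance from each object to itself
m <- matrix(NA_real_, nrow = length(ids), ncol = length(ids),
            dimnames = list(ids, ids))
diag(m) <- 0

# Fill both triangles from the single entry per pair
m[cbind(pairs$NameA, pairs$NameB)] <- pairs$Dist
m[cbind(pairs$NameB, pairs$NameA)] <- pairs$Dist

# Convert to the "dist" class that clustering functions such as hclust accept
d <- as.dist(m)
```

Initialising with `NA` rather than 0 is a deliberate choice here: any pair missing from the list then shows up as `NA` instead of silently being treated as identical.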

What I would like to get from these data is a hierarchical clustering graph. In this example I would then group

189_1C2 189_1C1 189_1C4,

189_1C6 189_1C5,

and 189_1C3 off with itself.

The distance between two groups should be the mean of the distances between the objects in one group and the objects in the other (I think).

I have looked at hclust and it seems like it should be able to do what I want. However, I am unsure of how to use it to get what I am looking for.
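
A hedged sketch of how `hclust` might be applied to the example data. It assumes that "mean distance between groups" corresponds to average linkage (`method = "average"`), which is one reading of the requirement rather than a confirmed one. The variable names and the choice of `k = 3` are illustrative.

```r
# Object names and the 15 pairwise distances from the example,
# listed in lower-triangle, column-by-column order:
# (2,1) (3,1) (4,1) (5,1) (6,1) (3,2) (4,2) (5,2) (6,2) (4,3) ...
ids  <- paste0("189_1C", 1:6)
vals <- c(0, 0.017, 0, 0.05, 0.05,
          0.017, 0, 0.05, 0.05,
          0.017, 0.067, 0.067,
          0.05, 0.05,
          0)

# as.dist() reads the lower triangle of a square matrix
m <- matrix(0, nrow = 6, ncol = 6, dimnames = list(ids, ids))
m[lower.tri(m)] <- vals
d <- as.dist(m)

# Average linkage: the distance between two clusters is the mean of all
# pairwise distances between their members (assumed to be what is wanted)
hc <- hclust(d, method = "average")

# Draw the dendrogram when run interactively
if (interactive()) plot(hc)

# Cut the tree into three groups; k = 3 is chosen to match the example
groups <- cutree(hc, k = 3)
print(split(names(groups), groups))
```

With these data, cutting at three groups should put 189_1C1, 189_1C2 and 189_1C4 together, 189_1C5 and 189_1C6 together, and leave 189_1C3 on its own. Other linkage methods (`"single"`, `"complete"`) are available through the same `method` argument if average linkage turns out not to be the right fit.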

Thank you in advance for your help!


Karin Lagesen, PhD student

R-help_at_r-project.org mailing list
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Wed 20 Feb 2008 - 10:16:47 GMT
