From: Karin Lagesen <karinlag_at_studmed.uio.no>

Date: Sat, 31 May 2008 20:03:53 +0200

I have two examples that I run hclust on:

a = c(0,1,1.5,1.5)
b = c(1,0,1.5,1.5)
c = c(1.5,1.5,0,0.5)
d = c(1.5,1.5,0.5,0)

ll = as.matrix(rbind(a,b,c,d))

test = as.dist(ll)

long = hclust(test)

a = c(0,0.3,1,1)
b = c(0.3,0,1,1)
c = c(1,1,0,0.5)
d = c(1,1,0.5,0)
ll = as.matrix(rbind(a,b,c,d))

test = as.dist(ll)

short = hclust(test)

The main difference between them is whether a and b get clustered higher up or lower down than the c,d cluster.

I am working on partitioning this kind of data into three clusters. I know I can do that with cutree. The result I get from that is the following:

> cutree(short, k=3)

a b c d

1 1 2 3

> cutree(long, k=3)

a b c d

1 2 3 3

And I can also access the height vector for both:

> short$height

[1] 0.3 0.5 1.0

> long$height

[1] 0.5 1.0 1.5

So I know at what heights they get merged.
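As a small sketch of how the heights line up with the merge component (using only the "short" example above; the variable name steps is made up for illustration), the two can be put side by side so that each row shows one merge and the height at which it happened:

```r
# Rebuild the "short" example from the post.
ll <- rbind(a = c(0, 0.3, 1, 1),
            b = c(0.3, 0, 1, 1),
            c = c(1, 1, 0, 0.5),
            d = c(1, 1, 0.5, 0))
short <- hclust(as.dist(ll))

# Row i of $merge is the merge made at $height[i].
# Negative entries are single observations (-1 is a, -2 is b, ...);
# positive entries point back to the cluster formed in that earlier row.
steps <- cbind(short$merge, height = short$height)
print(steps)
```

The last row is the highest merge, i.e. the first split when reading the tree from the top down.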

What I seem to be unable to get at is which of the clusters shown by cutree corresponds to which split. When I examine short in a plot I can easily see that the highest split (i.e. the one corresponding to the last height, 1, in the height vector) is between cutree cluster 1 on one side and clusters 2 and 3 on the other. In the long example this split is between clusters 1 and 2 on one side and cluster 3 on the other. I would however like to not examine all of the data I have by hand :)

Could any of you point me to what I need to do to get at this data? I have tried to examine the merge data in both cases, but I am coming up short.
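For reference, one possible way to read the merge data is sketched below. It is only a sketch under assumptions: the helpers make_tree, members and split_sides are hypothetical names and not part of R. The idea is to walk the merge matrix down from a given row, collect the observations on each side, and look up their cutree labels.

```r
# Rebuild both examples from the post.
make_tree <- function(m) {
  rownames(m) <- c("a", "b", "c", "d")
  hclust(as.dist(m))
}
short <- make_tree(rbind(c(0, 0.3, 1, 1), c(0.3, 0, 1, 1),
                         c(1, 1, 0, 0.5), c(1, 1, 0.5, 0)))
long  <- make_tree(rbind(c(0, 1, 1.5, 1.5), c(1, 0, 1.5, 1.5),
                         c(1.5, 1.5, 0, 0.5), c(1.5, 1.5, 0.5, 0)))

# Hypothetical helper: observation indices below one entry of $merge.
members <- function(hc, node) {
  if (node < 0) return(-node)            # negative entry = single observation
  c(members(hc, hc$merge[node, 1]),      # positive entry = earlier merge row
    members(hc, hc$merge[node, 2]))
}

# Hypothetical helper: cutree labels on each side of the merge in row `step`.
split_sides <- function(hc, step, k) {
  cl <- cutree(hc, k = k)
  lapply(1:2, function(side) {
    sort(unique(cl[members(hc, hc$merge[step, side])]))
  })
}

top <- nrow(short$merge)                 # last row = highest split
print(split_sides(short, top, k = 3))
print(split_sides(long,  top, k = 3))
```

For the two examples this should give the same picture as the plots: cluster 1 against clusters 2 and 3 for short, and clusters 1 and 2 against cluster 3 for long. Passing a smaller value of step looks at a lower split in the same way.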

Thanks!

Karin

--
Karin Lagesen, PhD student
karin.lagesen_at_medisin.uio.no
http://folk.uio.no/karinlag

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Mon 02 Jun 2008 - 02:16:14 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.2.0, at Mon 02 Jun 2008 - 02:30:36 GMT.
