[R] indexing into and modifying dendrograms

From: Jenny Bryan <jenny_at_stat.ubc.ca>
Date: Tue 12 Jul 2005 - 04:48:32 EST


I would like to be able to exert certain types of control over the plotting of dendrograms (representing hierarchical clusterings) that I think are best achieved by modifying the dendrogram object prior to plotting. I am using the "dendrogram" class and associated methods.

Define the cluster number of each cluster formed as the corresponding row of the merge object. So, if you are clustering m objects, the cluster numbers range from 1 to m-1, with cluster m-1 containing all m objects, by definition. I basically want a way to index into an object of class dendrogram using the above definition of cluster number and/or to act on a dendrogram, where I specify the target node using cluster number.
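To make the definition concrete, here is a minimal sketch using the same five points as the example further down. It only prints the merge matrix returned by hclust(); in that matrix, negative entries denote single objects and positive entries refer to earlier rows, i.e. to cluster numbers in the sense defined above. The variable name `hc` is just for illustration.

```r
## five objects with 2-dimensional features (same data as the example below)
pts <- rbind(c(2, 1.6), c(1.8, 2.4), c(2.1, 2.7), c(5, 2.6), c(4.7, 3.1))
hc <- hclust(dist(pts), method = "single")

hc$merge        # row k describes how cluster number k was formed
nrow(hc$merge)  # m - 1 rows when clustering m objects
```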

The first application would be to 'flip' the two elements in the target node of the dendrogram (made clear in the small example below). (The setting is genomics and I have applications where I want to man-handle my dendrograms to make certain features of the clustering more obvious to the naked eye.) I could imagine other, related actions that would be useful in decorating dendrograms.

I think I need a function that takes a dendrogram and cluster number(s) as input and returns the relevant part(s) of the dendrogram object -- but in a form that makes it easy to then, say, set certain attributes (perhaps recursively) for the target nodes (and perhaps those contained in it). I'm including a small example below that hopefully illustrates this (it looks long, but it's mostly comments!).
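One possible shape for such a function, as a rough sketch only: tag every internal node with its merge row by comparing leaf sets, then search the tree for a given cluster number. The helper names `cluster_members`, `tag_cnum` and `get_cluster` are hypothetical and not part of the stats package; the sketch relies only on hclust(), as.dendrogram(), is.leaf() and order.dendrogram(), and does not require a patched as.dendrogram.hclust.

```r
## Hypothetical helpers; none of these names exist in the stats package.

## Leaf indices contained in each merge row of an hclust object.
cluster_members <- function(hc) {
  sets <- vector("list", nrow(hc$merge))
  for (k in seq_along(sets)) {
    ## negative entry = single object, positive entry = earlier merge row
    parts <- lapply(hc$merge[k, ], function(j) if (j < 0) -j else sets[[j]])
    sets[[k]] <- sort(as.integer(unlist(parts)))
  }
  sets
}

## Attach a "cNum" attribute to every internal node by matching leaf sets.
tag_cnum <- function(node, sets) {
  if (is.leaf(node)) return(node)
  for (i in seq_along(node)) node[[i]] <- tag_cnum(node[[i]], sets)
  mine <- sort(as.integer(order.dendrogram(node)))
  attr(node, "cNum") <- which(vapply(sets, identical, logical(1), mine))
  node
}

## Return the sub-dendrogram whose cNum equals k, or NULL if there is none.
get_cluster <- function(node, k) {
  if (is.leaf(node)) return(NULL)
  if (isTRUE(attr(node, "cNum") == k)) return(node)
  for (i in seq_along(node)) {
    hit <- get_cluster(node[[i]], k)
    if (!is.null(hit)) return(hit)
  }
  NULL
}

pts <- rbind(c(2, 1.6), c(1.8, 2.4), c(2.1, 2.7), c(5, 2.6), c(4.7, 3.1))
hc <- hclust(dist(pts), method = "single")
dend <- tag_cnum(as.dendrogram(hc), cluster_members(hc))
sub3 <- get_cluster(dend, 3)   # the node formed at merge step 3
```

Note that get_cluster() returns a copy of the node, so attributes set on `sub3` do not propagate back into `dend`; modifying the tree in place needs a function that rebuilds it on the way back up.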

Any help would be appreciated.

Jenny Bryan

## get ready for step-by-step figures

par(mfrow = c(2,2))

## get 5 objects, with 2-dimensional features
pts <- rbind(c(2, 1.6),
             c(1.8, 2.4),
             c(2.1, 2.7),
             c(5, 2.6),
             c(4.7, 3.1))
plot(pts, xlim = c(0,6), ylim = c(0,4),type = "n",
      xlab = "Feature 1", ylab = "Feature 2")
points(pts,pch = as.character(1:5))

## build a hierarchical tree, store as a dendrogram
aggTree <- hclust(dist(pts), method = "single")
(dend1 <- JB.as.dendrogram.hclust(aggTree))
## NOTE: only thing I added to official version of
## as.dendrogram.hclust:
## each node has an attribute cNum, which gives
## the merge step at which it was formed,
## i.e. gives the row of the merge object which
## describes the formation of that node
## one new line near end of nMerge loop:
## ***************
## *** 51,56 ****
## --- 51,60 ----
## attr(z[[x[2]]], "midpoint"))/2
## }
## attr(zk, "height") <- oHgt[k]
## +
## + ## JB added July 6 2005
## + attr(zk, "cNum") <- k
## +
## z[[k <- as.character(k)]] <- zk
## }
## z <- z[[k]]

attributes(dend1)
attributes(dend1[[1]])
## here's a table relating dend1 and the cNum attribute
## node               cNum
## -----------------------------
## dend1              4
## dend1[[1]]         2
## dend1[[2]]         3
## dend1[[2]][[1]]    <not set>
## dend1[[2]][[2]]    1

## use cNum attribute in "edgetext"
## following example in dendrogram documentation
## would really rather associate with the node than the edge
## but current plotting function has no notion of nodetext
addE <- function(n) {
   if(!is.leaf(n)) {
     attr(n, "edgePar") <- list(p.col="plum")
     attr(n, "edgetext") <- attr(n,"cNum")
   }
   n
}
dend2 <- dendrapply(dend1, addE)
## overlays the cNum ("cluster number") attribute on dendrogram
plot(dend2, main = "dend2")
## why does no plum polygon appear around the '4' for the root
## edge?

## swap order of clusters 2 and 3,
## i.e. 'flip' cluster 4

dend3 <- dend2
dend3[[1]] <- dend2[[2]]
dend3[[2]] <- dend2[[1]]
plot(dend3, main = "dend3")
## wish I could achieve with 'dend3 <- flip(dend2, cNum = 4)'

## swap order of cluster 1 and object 1,
## i.e. 'flip' cluster 3

dend4 <- dend2
dend4[[2]][[1]] <- dend2[[2]][[2]]
dend4[[2]][[2]] <- dend2[[2]][[1]]
plot(dend4, main = "dend4")
## wish I could achieve with 'dend4 <- flip(dend2, cNum = 3)'

## finally, it's clear that the midpoint attribute would also
## need to be modified by 'flip'
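A rough sketch of what such a flip() could look like, under stated assumptions. The function name `flip` and its `cNum` argument follow the wish expressed above but are hypothetical; nothing like it is claimed to exist in stats. Since JB.as.dendrogram.hclust is not available here, the "cNum" attribute is attached with a stand-in that matches node heights against the hclust heights, which only works when all merge heights are distinct (true for this data). The midpoint is recomputed as the average of the positions of the first and last child, measured from the leftmost leaf of the node, which is assumed to agree with the convention used by as.dendrogram.hclust.

```r
## Hypothetical flip(): swap the branches of every node whose "cNum" is in
## cNum, then recompute "midpoint" on the way back up the tree.
flip <- function(node, cNum) {
  if (is.leaf(node)) return(node)
  kids <- lapply(seq_along(node), function(i) flip(node[[i]], cNum))
  if (isTRUE(attr(node, "cNum") %in% cNum)) kids <- rev(kids)  # plain list rev
  for (i in seq_along(kids)) node[[i]] <- kids[[i]]

  n_mem <- function(x) if (is.leaf(x)) 1 else attr(x, "members")
  mid   <- function(x) if (is.leaf(x)) 0 else attr(x, "midpoint")
  k <- length(kids)
  ## the last child starts after all leaves of the children to its left
  offset <- sum(vapply(kids[-k], n_mem, numeric(1)))
  attr(node, "midpoint") <- (mid(kids[[1]]) + offset + mid(kids[[k]])) / 2
  node
}

pts <- rbind(c(2, 1.6), c(1.8, 2.4), c(2.1, 2.7), c(5, 2.6), c(4.7, 3.1))
hc <- hclust(dist(pts), method = "single")

## Stand-in for the patched as.dendrogram.hclust: tag nodes by merge height
## (assumes distinct merge heights).
dend <- dendrapply(as.dendrogram(hc), function(n) {
  if (!is.leaf(n)) attr(n, "cNum") <- match(attr(n, "height"), hc$height)
  n
})

dend3 <- flip(dend, cNum = 4)   # swap clusters 2 and 3
dend4 <- flip(dend, cNum = 3)   # swap cluster 1 and object 1
```

The results can be passed to plot() like any other dendrogram. Calling rev() on a node is deliberately avoided, because rev() on a dendrogram reverses the whole subtree rather than just the two branches of one node.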



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Tue Jul 12 04:54:02 2005