[R] Colouring hclust() trees

From: Richard A. O'Keefe (ok@cs.otago.ac.nz)
Date: Mon 10 May 2004 - 13:29:26 EST


Message-id: <200405100329.i4A3TQhx204497@atlas.otago.ac.nz>

I have a data set with 6 variables and 251 cases.
The people who supplied me with this data set believe that it falls
naturally into three groups, and have given me a rule for determining
group number from these 6 variables.

If I do
    scaled.stuff <- scale(stuff, TRUE, c(...the design ranges...))
    stuff.dist <- dist(scaled.stuff)
    stuff.hc <- hclust(stuff.dist)
    plot(stuff.hc)
I get a dendrogram which looks sort of plausible, but

(a) with this many leaves, the leaf labels really aren't legible at any
    plausible scaling, and would be best omitted. I could figure out
    which point was which if there were some way to use identify(), but
    I'm just not seeing it.

(b) what I'd really like to do is to colour the leaves according to the
    predicted group, or some other variable. The obvious thing to try is
    plot(stuff.hc, col=c("red","green","blue")[stuff.predicted.group])
    but that doesn't work. I read everything that seemed plausible, and
    came across nodePar, but

    col <- c("red","green","blue")[stuff.predicted.group]
    plot(stuff.hc, nodePar=list(col=list("black",col)))

    tells me repeatedly that

    parameter "nodePar" couldn't be set in high-level plot() function

    while

    plot(as.dendrogram(stuff.hc), nodePar=list(col=list("black",col)))

    draws the dendrogram (_much_ slower than plot() does) and still gives
    me no colouring at all. Clearly I have misunderstood how to use
    nodePar.

(c) The obvious fall-back is to use points() to draw the nodes again in
    the colours I want, but if I could do that, I could use identify().
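A minimal sketch of one way the colouring in (b) and the fall-back in (c) might be done, under stated assumptions: the built-in iris data stands in for `stuff`, the species code stands in for `stuff.predicted.group`, and `leaf.colour` and `colour.leaf` are names invented for this illustration. It relies on nodePar being read as a per-node attribute of a dendrogram, set here with dendrapply(), rather than as an argument indexed by case.

```r
# Stand-in data (assumption): iris replaces `stuff`, and the species code
# replaces `stuff.predicted.group`.
stuff <- iris[, 1:4]
stuff.predicted.group <- as.integer(iris$Species)

scaled.stuff <- scale(stuff)
# Explicit row names so that every leaf carries a known label "1", "2", ...
rownames(scaled.stuff) <- seq_len(nrow(scaled.stuff))
stuff.hc <- hclust(dist(scaled.stuff))

# One colour per case, in the original row order.
leaf.colour <- c("red", "green", "blue")[stuff.predicted.group]

# Attach a nodePar attribute to each leaf; inner nodes are returned unchanged.
colour.leaf <- function(node) {
    if (is.leaf(node)) {
        case <- match(attr(node, "label"), rownames(scaled.stuff))
        attr(node, "nodePar") <- list(pch = 19, cex = 0.6,
                                      col = leaf.colour[case])
    }
    node
}
stuff.dend <- dendrapply(as.dendrogram(stuff.hc), colour.leaf)
plot(stuff.dend, leaflab = "none")   # coloured leaf points, no leaf labels

# Fall-back along the lines of (c): with hang = -1 the leaves are drawn down
# to height 0, at x = 1..n in the order given by stuff.hc$order.
plot(stuff.hc, labels = FALSE, hang = -1)
n <- length(stuff.hc$order)
points(seq_len(n), rep(0, n), pch = 19, cex = 0.6,
       col = leaf.colour[stuff.hc$order])
# Interactively, identify(seq_len(n), rep(0, n), labels = stuff.hc$order)
# could then be used to pick out individual cases.
```

The dendrapply() route keeps the colours attached to the tree itself, so they survive re-plotting; the points() route is quicker to draw but depends on the leaves sitting at height 0.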

The frustrating thing is that

    d <- dim(stuff)[1]
    plot(1:d, 1:d, col=col[stuff.hc$order])

shows me that there _is_ a strong connection between the groups found by
hclust() and the predicted groups, albeit not a simple one.
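The connection between the two groupings could also be checked numerically. The following is only a sketch under assumptions: iris again stands in for `stuff` and `stuff.predicted.group`, the tree is cut into three groups to match the three predicted ones, and `hc.group` and `agreement` are names made up for the example.

```r
# Stand-in data (assumption): iris replaces the real data set.
stuff <- iris[, 1:4]
stuff.predicted.group <- as.integer(iris$Species)

stuff.hc <- hclust(dist(scale(stuff)))

# Cut the tree into three groups and cross-tabulate against the prediction.
hc.group <- cutree(stuff.hc, k = 3)
agreement <- table(hclust = hc.group, predicted = stuff.predicted.group)
print(agreement)
```

A table with most of each column's count concentrated in a single row would indicate close agreement; the row numbering from cutree() is arbitrary, so the large counts need not lie on the diagonal.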

I have looked at plot.dendrogram() and plotNode() -- using getAnywhere() --
and it looks to me as though what I want *should* be doable, but I've
clearly misunderstood the details of how to do it.

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


