From: Rolf Turner <r.turner_at_auckland.ac.nz>

Date: Fri, 30 May 2008 12:33:13 +1200

R-help_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Fri 30 May 2008 - 00:36:18 GMT

I have been attempting to do some work using hclust, and have run into a (possibly subtle) problem.

The background is that I constructed a dissimilarity matrix ``d1'' (it involved something called the ``Jaccard similarity coefficient''; I won't go into the details unless requested). I then did

    d2 <- as.dist(d1)
    try <- hclust(d2, method="ward")
    plot(try, labels=FALSE)

After looking at the plot, I tried

mmm <- cutree(try,h=7)

and got the error message

    Error in cutree(try, h = 7) :
      the 'height' component of 'tree' is not sorted
      (increasingly); consider applying as.hclust() first

I was much puzzled by this initially, since try is already an ``hclust'' object (I checked class(try)), but after a substantial amount of hair-tearing I discovered that the entries of the height component of try are constant over long stretches. E.g. the first 54 entries are 0 (to the 7 printed decimal places). This doesn't *seem* to be cause for alarm: the help says explicitly that height is a *non-decreasing* sequence (but not necessarily a strictly increasing one).

I checked

    with(try, all.equal(height, sort(height)))

and got

    [1] TRUE
but order(try$height) is NOT equal to 1:745 (note that 746 is the number of subjects in the data set).
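
A minimal sketch, on made-up heights rather than the real data, of how those two checks can disagree. The vector h is hypothetical. The sketch relies on all.equal() comparing with a numerical tolerance (about 1.5e-8 by default), while order() and is.unsorted() compare exactly, so a vector can be "sorted" by the first test and not by the other two. It seems likely, though it is an assumption here, that cutree() applies an exact test of the is.unsorted() kind.

```r
# Hypothetical heights: sorted up to floating-point noise, but not exactly.
h <- c(0, 1e-12, 0, 1, 2)

# Tolerance-based comparison: the 1e-12 wobble is far below the tolerance.
approx_sorted <- isTRUE(all.equal(h, sort(h)))    # TRUE

# Exact comparisons: the third entry is smaller than the second.
exact_order <- identical(order(h), seq_along(h))  # FALSE
unsorted <- is.unsorted(h)                        # TRUE

print(c(approx_sorted = approx_sorted,
        exact_order = exact_order,
        unsorted = unsorted))
```

On the real tree, which(diff(try$height) < 0) would point at the entries that step downwards, and the size of those steps would show whether they are at the level of rounding noise.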

I have done an RSiteSearch() on "cutree" and turned up nothing that seemed relevant.

Finally, I found that if I do

try$height <- round(try$height,6)

then

mmm <- cutree(try,h=7)

``works'' (without error).
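
A rough sketch of the same workaround on a stand-in tree, under stated assumptions: the built-in USArrests data and average linkage replace the real distance matrix and Ward's method, and one height is nudged down by 1e-12 by hand to mimic floating-point noise. The names hc, res, max_shift and groups are illustrative only. It shows one way to check how far rounding moves the heights before trusting the result.

```r
# Stand-in for the real tree (assumption: built-in data, average linkage).
hc <- hclust(dist(USArrests), method = "average")

# Mimic floating-point noise: make the second height a hair below the first.
hc$height[2] <- hc$height[1] - 1e-12

# With h= supplied, cutree() refuses heights that are not exactly sorted.
res <- try(cutree(hc, h = 50), silent = TRUE)
print(inherits(res, "try-error"))

# Round as in the post, and record how far any height moved.
h_old <- hc$height
hc$height <- round(hc$height, 6)
max_shift <- max(abs(hc$height - h_old))   # no more than about 5e-7
print(max_shift)

# Rounding is not guaranteed to restore the order, so check before cutting.
print(is.unsorted(hc$height))
groups <- cutree(hc, h = 50)
print(table(groups))
```

One caveat that follows from how rounding works: two heights that differ only by noise but straddle a rounding boundary would still round to different values, so is.unsorted() is worth checking after the rounding step rather than assuming it succeeded.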

Are there traps for young players in employing such a strategy? What should I really worry about?

If anyone wants to try it for themselves with the real distance matrix, I can bundle it up and email it to them privately.

Thanks for any insights.

cheers,

Rolf Turner


Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.2.0, at Fri 30 May 2008 - 02:30:45 GMT.
