From: Gabor Grothendieck <ggrothendieck_at_gmail.com>

Date: Sat, 22 Dec 2007 11:33:02 -0500

Received on Sat 22 Dec 2007 - 16:40:17 GMT

And if you only want to retain groups with 2+ elements, you can just use Filter to drop the rest:

twoplus <- function(x) length(x) >= 2

Filter(twoplus, split(seq_along(v), ct))
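
For reference, the whole approach from this thread can be run end to end without an active graphics device. The following is a minimal sketch, assuming `v` is the example vector from the original question (its definition is not part of the quoted code below). It reuses the names `hc`, `ct` and `twoplus` from the thread; `groups` is a name introduced here purely for illustration.

```r
# Assumption: v is the example vector from the original question.
v <- c(0, 0.45, 1, 2, 3, 3.25, 3.33, 3.75, 4.1, 5, 6, 6.45, 7, 7.1, 8)

# Single-linkage clustering, cut at height 0.5; nothing is plotted.
hc <- hclust(dist(v), method = "single")
ct <- cutree(hc, h = 0.5)

# Split the positions (not the values) by cluster label, so the
# indexes into the original input are retained.
twoplus <- function(x) length(x) >= 2
groups <- Filter(twoplus, split(seq_along(v), ct))

# Look up the values that belong to each retained cluster.
lapply(groups, function(idx) v[idx])
```

Each element of `groups` holds the positions in `v` of one cluster with at least two members, so `v[groups[[1]]]` recovers the values of the first such cluster.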

On Dec 22, 2007 5:12 AM, Johannes Graumann <johannes_graumann_at_web.de> wrote:

> But cutree does away with the indexes from the original input, which
> rect.hclust retains.
> I will have no other choice but to match that input with the 'values'
> contained in the clusters ...
>
> Joh
>
> Gabor Grothendieck wrote:
>
> > If we don't need any plotting we don't really need rect.hclust at
> > all. Split the output of cutree, instead. Continuing from the
> > prior code:
> >
> >> for(el in split(unname(vv), names(vv))) print(el)
> > [1] 0.00 0.45
> > [1] 1
> > [1] 2
> > [1] 3.00 3.25 3.33 3.75 4.10
> > [1] 5
> > [1] 6.00 6.45
> > [1] 7.0 7.1
> > [1] 8
> >
> > On Dec 21, 2007 3:24 PM, Johannes Graumann <johannes_graumann_at_web.de>
> > wrote:
> >> Hm, hm, rect.hclust doesn't accept "plot=FALSE" and cutree doesn't retain
> >> the indexes of membership ... any way short of ripping out the guts of
> >> rect.hclust to achieve the same result without an active graphics device?
> >>
> >> Joh
> >>
> >> >> # cluster and plot
> >> >> hc <- hclust(dist(v), method = "single")
> >> >> plot(hc, lab = v)
> >> >> cl <- rect.hclust(hc, h = .5, border = "red")
> >> >>
> >> >> # each component of list cl is one cluster. Print them out.
> >> >> for(idx in cl) print(unname(v[idx]))
> >> > [1] 8
> >> > [1] 7.0 7.1
> >> > [1] 6.00 6.45
> >> > [1] 5
> >> > [1] 3.00 3.25 3.33 3.75 4.10
> >> > [1] 2
> >> > [1] 1
> >> > [1] 0.00 0.45
> >> >
> >> >> # a different representation of the clusters
> >> >> vv <- v
> >> >> names(vv) <- ct <- cutree(hc, h = .5)
> >> >> vv
> >> >    1    1    2    3    4    4    4    4    4    5    6    6    7    7    8
> >> > 0.00 0.45 1.00 2.00 3.00 3.25 3.33 3.75 4.10 5.00 6.00 6.45 7.00 7.10 8.00
> >> >
> >> > On Dec 21, 2007 4:56 AM, Johannes Graumann <johannes_graumann_at_web.de>
> >> > wrote:
> >> >> <posted & mailed>
> >> >>
> >> >> Dear all,
> >> >>
> >> >> I'm trying to solve the problem of how to find clusters of values in
> >> >> a vector that are closer than a given value. Illustrated, this might
> >> >> look as follows:
> >> >>
> >> >> vector <- c(0,0.45,1,2,3,3.25,3.33,3.75,4.1,5,6,6.45,7,7.1,8)
> >> >>
> >> >> When using '0.5' as the proximity requirement, the following groups
> >> >> would result:
> >> >> 0,0.45
> >> >> 3,3.25,3.33,3.75,4.1
> >> >> 6,6.45
> >> >> 7,7.1
> >> >>
> >> >> Jim Holtman proposed a very elegant solution in
> >> >> http://tolstoy.newcastle.edu.au/R/e2/help/07/07/21286.html, which I
> >> >> have modified and used since he wrote it to me. The beauty of this
> >> >> approach is that it will not only work for constant proximity
> >> >> requirements as above, but also for overlap-windows defined in terms
> >> >> of ppm around each value. Now I have an additional need and have found
> >> >> no way (short of iteratively stepping through all the groups returned)
> >> >> to do it with Jim's approach: how to figure out that
> >> >> 6,6.45 and 7,7.1 are separate clusters?
> >> >>
> >> >> Thanks for any hints, Joh
> >> >>

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.2.0, at Sat 22 Dec 2007 - 21:30:21 GMT.
