Re: [R] Efficient way to find consecutive integers in vector?

From: Henrik Bengtsson <hb_at_stat.berkeley.edu>
Date: Fri, 21 Dec 2007 17:39:41 -0800

In the R.utils package there is seqToIntervals(), e.g.

print(seqToIntervals(1:10))
##      from to
## [1,]    1 10
print(seqToIntervals(c(1:10, 15:18, 20)))
##      from to
## [1,]    1 10
## [2,]   15 18
## [3,]   20 20

There is also seqToHumanReadable(), which uses the above, e.g.

print(seqToHumanReadable(1:10))
## [1] "1-10"
print(seqToHumanReadable(c(1:10, 15:18, 20)))
## [1] "1-10, 15-18, 20"
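
For readers without R.utils installed, the same from/to table can be sketched in base R. This is only an illustration of the idea, not the R.utils implementation: runsToIntervals() is a hypothetical name, and the sketch assumes integer-valued input that may be sorted and de-duplicated first.

```r
## Hypothetical helper (not part of R.utils): collapse an integer vector
## into a two-column matrix holding the first and last value of each run.
runsToIntervals <- function(x) {
  x <- sort(unique(as.integer(x)))
  if (length(x) == 0L)
    return(matrix(integer(0), ncol = 2,
                  dimnames = list(NULL, c("from", "to"))))
  ## A new run starts wherever the gap to the previous value is not 1.
  runId <- cumsum(c(TRUE, diff(x) != 1L))
  from <- x[!duplicated(runId)]                   # first value of each run
  to   <- x[!duplicated(runId, fromLast = TRUE)]  # last value of each run
  cbind(from = from, to = to)
}

iv <- runsToIntervals(c(1, 2, 3, 4, 7, 8, 9, 10, 12, 13))
print(iv)
##      from to
## [1,]    1  4
## [2,]    7 10
## [3,]   12 13
```

If the groups themselves are wanted rather than their end points, split(x, cumsum(c(TRUE, diff(x) != 1))) returns them as a list for a sorted x.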

/Henrik

On 21/12/2007, Tony Plate <tplate_at_acm.org> wrote:
> Martin Maechler wrote:
> >>>>>> "MS" == Marc Schwartz <marc_schwartz_at_comcast.net>
> >>>>>> on Thu, 20 Dec 2007 16:33:54 -0600 writes:
> >
> > MS> On Thu, 2007-12-20 at 22:43 +0100, Johannes Graumann wrote:
> > >> Hi all,
> > >>
> > >> Does anybody have a magic trick handy to isolate directly consecutive
> > >> integers from something like this:
> > >> c(1,2,3,4,7,8,9,10,12,13)
> > >>
> > >> The result should be, that groups 1-4, 7-10 and 12-13 are consecutive
> > >> integers ...
> > >>
> > >> Thanks for any hints, Joh
> >
> > MS> Not fully tested, but here is one possible approach:
> >
> > >> Vec
> > MS> [1] 1 2 3 4 7 8 9 10 12 13
> >
> > MS> Breaks <- c(0, which(diff(Vec) != 1), length(Vec))
> >
> > >> Breaks
> > MS> [1] 0 4 8 10
> >
> > >> sapply(seq(length(Breaks) - 1),
> > MS> function(i) Vec[(Breaks[i] + 1):Breaks[i+1]])
> > MS> [[1]]
> > MS> [1] 1 2 3 4
> >
> > MS> [[2]]
> > MS> [1] 7 8 9 10
> >
> > MS> [[3]]
> > MS> [1] 12 13
> >
> >
> >
> > MS> For a quick test, I tried it on another vector:
> >
> >
> > MS> set.seed(1)
> > MS> Vec <- sort(sample(20, 15))
> >
> > >> Vec
> > MS> [1] 1 2 3 4 5 6 8 9 10 11 14 15 16 19 20
> >
> > MS> Breaks <- c(0, which(diff(Vec) != 1), length(Vec))
> >
> > >> Breaks
> > MS> [1] 0 6 10 13 15
> >
> > >> sapply(seq(length(Breaks) - 1),
> > MS> function(i) Vec[(Breaks[i] + 1):Breaks[i+1]])
> > MS> [[1]]
> > MS> [1] 1 2 3 4 5 6
> >
> > MS> [[2]]
> > MS> [1] 8 9 10 11
> >
> > MS> [[3]]
> > MS> [1] 14 15 16
> >
> > MS> [[4]]
> > MS> [1] 19 20
> >
> > Seems ok, but ``only works for increasing sequences''.
> > More than 12 years ago, I had encountered the same problem and
> > solved it like this:
> >
> > In package 'sfsmisc', there has been the function inv.seq(),
> > named for "inversion of seq()",
> > which does this too, currently returning an expression,
> > but returning a call in the development version of sfsmisc:
> >
> > Its definition is currently
> >
> > inv.seq <- function(i) {
> > ## Purpose: 'Inverse seq': Return a short expression for the 'index' `i'
> > ## --------------------------------------------------------------------
> > ## Arguments: i: vector of (usually increasing) integers.
> > ## --------------------------------------------------------------------
> > ## Author: Martin Maechler, Date: 3 Oct 95, 18:08
> > ## --------------------------------------------------------------------
> > ## EXAMPLES: cat(rr <- inv.seq(c(3:12, 20:24, 27, 30:33)),"\n"); eval(rr)
> > ## r2 <- inv.seq(c(20:13, 3:12, -1:-4, 27, 30:31)); eval(r2); r2
> > li <- length(i <- as.integer(i))
> > if(li == 0) return(expression(NULL))
> > else if(li == 1) return(as.expression(i))
> > ##-- now have: length(i) >= 2
> > di1 <- abs(diff(i)) == 1 #-- those are just simple sequences n1:n2 !
> > s1 <- i[!c(FALSE,di1)] # beginnings
> > s2 <- i[!c(di1,FALSE)] # endings
> >
> > ## using text & parse {cheap and dirty} :
> > mkseq <- function(i,j) if(i == j) i else paste(i,":",j, sep="")
> > parse(text =
> > paste("c(", paste(mapply(mkseq, s1,s2), collapse = ","), ")", sep = ""),
> > srcfile = NULL)[[1]]
> > }
> >
> > with example code
> >
> > > v <- c(1:10,11,6,5,4,0,1)
> > > (iv <- inv.seq(v))
> > c(1:11, 6:4, 0:1)
> > > stopifnot(identical(eval(iv), as.integer(v)))
> > > iv[[2]]
> > 1:11
> > > str(iv)
> > language c(1:11, 6:4, 0:1)
> > > str(iv[[2]])
> > language 1:11
> > >
> >
> >
> > Now, given that this stems from 1995, I should be excused for
> > using parse(text = *) [see fortune(106) if you don't understand].
> >
> > However, doing this differently by constructing the resulting
> > language object directly {using substitute(), as.symbol(),
> > as.expression() ... etc}
> > seems not quite trivial.
> >
> > So here's the Friday afternoon / Christmas break quiz:
> >
> > What's the most elegant way
> > to replace the last statements in inv.seq()
> > ------------------------------------------------------------------------
> > ## using text & parse {cheap and dirty} :
> > mkseq <- function(i,j) if(i == j) i else paste(i,":",j, sep="")
> > parse(text =
> > paste("c(", paste(mapply(mkseq, s1,s2), collapse = ","), ")", sep = ""),
> > srcfile = NULL)[[1]]
> > ------------------------------------------------------------------------
> >
> > by code that does not use parse (or source() or similar) ???
> >
> > I don't have an answer yet, at least not at all an elegant one.
> > And maybe, the solution to the quiz is that there is no elegant
> > solution.
>
> How about this ? :
>
> > i <- c(1, 10, 12)
> > j <- c(5, 10, 14)
> > mkseq <- function(i, j) if (i==j) i else call(':', i, j)
> > as.call(c(list(as.name('c')), mapply(i, j, FUN=mkseq)))
> c(1:5, 10, 12:14)
> > eval(.Last.value)
> [1] 1 2 3 4 5 10 12 13 14
> >
>
> -- Tony Plate
>
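
Plugged back into inv.seq(), the suggestion above might look like the sketch below. This is an assumption about how the pieces fit together, not the sfsmisc source: invSeqCall is a hypothetical name, the guards for length 0 and 1 are copied from the quoted definition, and SIMPLIFY = FALSE is added so that mapply() always returns a list.

```r
## Sketch only: the quoted inv.seq() with its parse(text = ...) tail
## replaced by direct call construction. invSeqCall is a hypothetical name.
invSeqCall <- function(i) {
  li <- length(i <- as.integer(i))
  if (li == 0) return(expression(NULL))
  else if (li == 1) return(as.expression(i))
  ## now have: length(i) >= 2
  di1 <- abs(diff(i)) == 1   # TRUE where neighbours differ by exactly 1
  s1 <- i[!c(FALSE, di1)]    # beginnings of runs
  s2 <- i[!c(di1, FALSE)]    # endings of runs
  ## A single value stays a constant; a run becomes the call `a:b`.
  mkseq <- function(a, b) if (a == b) a else call(":", a, b)
  as.call(c(list(as.name("c")),
            mapply(mkseq, s1, s2, SIMPLIFY = FALSE)))
}

v <- c(1:10, 11, 6, 5, 4, 0, 1)
cl <- invSeqCall(v)
## Evaluating the constructed call gives back the original integers.
stopifnot(is.call(cl), identical(eval(cl), as.integer(v)))
```

Because the end points are stored as integers, the printed call may show an L suffix on each constant (1L:11L rather than 1:11); the evaluated result is the same.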

> >
> > Martin
> >
> >
> > MS> HTH,
> >
> > MS> Marc Schwartz
> >
> > ______________________________________________
> > R-help_at_r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>


Received on Sat 22 Dec 2007 - 01:43:19 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sat 22 Dec 2007 - 02:30:20 GMT.
