# Re: [R] grouping by consecutive integers

From: Berton Gunter <gunter.berton_at_gene.com>
Date: Tue 25 Jul 2006 - 03:27:45 EST

As you do not seem to have received what you consider to be a satisfactory reply, here is a function that I **think** does what you want:

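```
sequences <- function(x, incr = 1)
{
	# Mark each step that equals incr, then find where such runs start or stop
	ix <- which(abs(diff(c(FALSE, diff(x) == incr))) == 1)
	# An odd count means the last run extends to the end of x
	if (length(ix) %% 2) c(ix, length(x))
	else ix
}
```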

This function gives successive pairs of the first and last indices of sequences of increasing values within x that differ by incr. You can then process these pairs however you like, to summarize the indices, the values of the sequences, or both.
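As a rough sketch of one way to process the pairs, the code below arranges the returned indices as one row per run and computes the min, median and max of the values in each run, using the `tmp` vector from the original question. `summarize_runs`, `pairs` and `run_summary` are hypothetical names introduced only for this illustration, and `sequences()` is repeated so the block runs on its own, with `incr` taken to be the step size (an assumption about the intent).

```r
# sequences() repeated here so the block is self-contained;
# incr is assumed to be the step between consecutive values
sequences <- function(x, incr = 1) {
  ix <- which(abs(diff(c(FALSE, diff(x) == incr))) == 1)
  if (length(ix) %% 2) c(ix, length(x)) else ix
}

tmp <- c(24,25,29,35,36,37,38,39,40,41,42,43,44,45,46,47,68,69,70,71)

# One row per run: column 1 = first index, column 2 = last index
pairs <- matrix(sequences(tmp), ncol = 2, byrow = TRUE)

# Hypothetical helper: min, median and max of the values in each run
summarize_runs <- function(x, pairs) {
  t(apply(pairs, 1, function(p) {
    v <- x[p[1]:p[2]]
    c(min = min(v), median = median(v), max = max(v))
  }))
}

run_summary <- summarize_runs(tmp, pairs)
print(run_summary)
```

On this input the summary has three rows, for 24:25, 35:47 and 68:71. The isolated value 29 does not appear, because it is not part of any run of two or more values.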

Examples:
```
> sequences(c(1:5,50,3:7))
[1] 1 5 7 11
> sequences(c(10,1:5,50,3:7))
[1] 2 6 8 12
> sequences(c(1:5,50,3:7,10))
[1] 1 5 7 11
> sequences(c(10,1:5,50,3:7,10))
[1] 2 6 8 12
```
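Note that in the examples above an isolated value (such as the 50) is not reported, since it does not belong to a run. If single values should count as groups of their own, as the 29 does in the original question, a different technique may be more convenient: label each element with a group number using `cumsum()` and then `split()` on that label. This is only a sketch and not part of `sequences()`; `grp` and `by_group` are illustrative names.

```r
tmp <- c(24,25,29,35,36,37,38,39,40,41,42,43,44,45,46,47,68,69,70,71)

# Start a new group wherever the step from the previous value is not 1
grp <- cumsum(c(TRUE, diff(tmp) != 1))

# One row per group (including single values): min, median, max
by_group <- t(sapply(split(tmp, grp), function(v) {
  c(min = min(v), median = median(v), max = max(v))
}))

print(by_group)
```

Here the result has four rows, for 24:25, 29, 35:47 and 68:71.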

Cheers,

-- Bert Gunter, Genentech Non-Clinical Statistics, South San Francisco, CA

"The business of the statistician is to catalyze the scientific learning process." - George E. P. Box

> -----Original Message-----
> From: r-help-bounces@stat.math.ethz.ch
> [mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of Kevin J Emerson
> Sent: Monday, July 24, 2006 9:20 AM
> To: Niels Vestergaard Jensen
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] grouping by consecutive integers
>
> Let me clarify one thing that I don't think I made clear in my posting.
> I am looking for the max, min, and median of the indices, not of the
> time series frequency counts. I am looking to find the max, min, and
> median time of peaks in a time series, so I am looking for the
> information concerning that.
>
> So mostly my question is how to extract the max, min, and median of
> sequential numbers in a vector. I will reword my original posting below.
>
> > > Hello R-helpers!
> > >
> > > I have a question concerning extracting sequence information from a
> > > vector. I have a vector (representing the bins of a time series where
> > > the frequency of occurrences is greater than some threshold) where I
> > > would like to extract the min, median, and max of each group of
> > > consecutive numbers in the index vector.
> > >
> > > For Example:
> > >
> > > tmp <- c(24,25,29,35,36,37,38,39,40,41,42,43,44,45,46,47,68,69,70,71)
> > >
> > > I would like to have the max,min,median of the following groups:
> > >
> > > 24,25 - max = 25, min = 24 median = 24.5
> > > 29 max=min=median = 29
> > > 35,36,37,38,39,40,41,42,43,44,45,46,47, max = 47 min = 35 etc...
> > > 68,69,70,71
> > >
> > > I would like to be able to perform this for many time series so an
> > > automated process would be nice. I am hoping to use this as a peak
> > > detection protocol.
> > >
> > > Any advice would be greatly appreciated,
> > > Kevin
> > >
> > > -----
> > > -----
> > > Kevin J Emerson
> > > Center for Ecology and Evolutionary Biology
> > > 1210 University of Oregon
> > > Eugene, OR 97403
> > > USA
> > > kemerson@uoregon.edu
> > >
> > > ______________________________________________
> > > R-help@stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >