From: Gabor Grothendieck <ggrothendieck_at_gmail.com>

Date: Wed 23 Aug 2006 - 09:22:48 EST


Try this:

# data (dput output of the example data frame; date is stored as a factor)

DF <- structure(list(id = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4,
4, 4, 4, 4, 4, 4), date = structure(c(8, 9, 10, 11, 3, 7, 8,
10, 4, 1, 2, 3, 8, 9, 10, 11, 3, 5, 6), .Label = c("01/07/2006",
"01/08/2006", "01/09/2006", "02/09/2006", "03/09/2006", "06/09/2006",
"11/08/2006", "22/08/2006", "24/08/2006", "28/08/2006", "30/08/2006"
), class = "factor"), value = c(48, 50, 150, 100, 30, 30, 100, 11,
5, 3, 100, 100, 48, 50, 150, 100, 30, 100, 100)), .Names = c("id",
"date", "value"), class = "data.frame", row.names = c("1", "2",
"3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14",
"15", "16", "17", "18", "19"))

# for one id: return the first row whose value and whose next row's
# value both exceed 50 (NULL if there is no such row)

f <- function(x) {
  idx <- which(x$value > 50 & c(x$value[-1], 0) > 50)
  if (length(idx) > 0) x[idx[1],]
}

# apply f to each id and stack the rows that come back

do.call(rbind, by(DF, DF$id, f))
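
To see why the test inside f works, here is a small sketch that runs the same comparison by hand on the values of id 4 from the example data. The names v, nxt and both are only for illustration and do not appear in the code above.

```r
# Values for id 4, copied from the example data
v <- c(48, 50, 150, 100, 30, 100, 100)

# The same values shifted up by one position; the trailing 0 stands in
# for the "next row" of the last row, so the last row can never qualify
nxt <- c(v[-1], 0)

# TRUE where a row and the row right after it are both above 50
both <- v > 50 & nxt > 50

idx <- which(both)
idx       # positions 3 and 6 qualify
idx[1]    # position 3 is the row that f keeps for this id
```

Padding with 0 is safe here only because 0 is below the threshold of 50; with a different threshold the padding value would need to be chosen accordingly.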

On 8/22/06, Bonfigli Sandro <bonfigli@inmi.it> wrote:

> I have a dataframe with the following structure

>
> id  date        value
> -------------------------
> 1   22/08/2006  48
> 1   24/08/2006  50
> 1   28/08/2006  150
> 1   30/08/2006  100
> 1   01/09/2006  30
> 2   11/08/2006  30
> 2   22/08/2006  100
> 2   28/08/2006  11
> 2   02/09/2006  5
> 3   01/07/2006  3
> 3   01/08/2006  100
> 3   01/09/2006  100
> 4   22/08/2006  48
> 4   24/08/2006  50
> 4   28/08/2006  150
> 4   30/08/2006  100
> 4   01/09/2006  30
> 4   03/09/2006  100
> 4   06/09/2006  100
>
> N.B.: dates in European format; ordered dataframe
>
> For each id I need to select the first row that starts a run of
> at least two consecutive rows with "value" > 50.
>
> Rather convoluted explanation. I mean that for each id I have to select
> the first row in which "value" is > 50, but only if the following row
> has "value" > 50 too. If this is not true, I repeat the test for each of
> the following rows in which "value" > 50 until I find a record that meets
> the condition.
>
> This means that with my example dataframe the result is:
>
> id  date        value
> -------------------------
> 1   28/08/2006  150
> 3   01/08/2006  100
> 4   28/08/2006  150
>
> It's clear that a for loop would work but I think that there is a better
> way.
>
> I tried "by" and could obtain the first row for which "value" is > 50.
>
> I thought of an iterative process (delete the first row > 50, find the
> second row > 50, examine if there are rows in the middle) but it
> is quite inelegant: if the first value is not the "good" one I have to
> repeat the process an a priori unknown number of times.
>
> Thanks in advance for your help
>
> Sandro Bonfigli
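
For comparison, the for loop mentioned in the question could look roughly like the sketch below. It is not part of the original reply; the helper name first_run_rows and its limit argument are made up for illustration, and the data frame is retyped from the table in the question with the dates kept as plain character strings.

```r
# Example data retyped from the question
DF <- data.frame(
  id = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4),
  date = c("22/08/2006", "24/08/2006", "28/08/2006", "30/08/2006",
           "01/09/2006", "11/08/2006", "22/08/2006", "28/08/2006",
           "02/09/2006", "01/07/2006", "01/08/2006", "01/09/2006",
           "22/08/2006", "24/08/2006", "28/08/2006", "30/08/2006",
           "01/09/2006", "03/09/2006", "06/09/2006"),
  value = c(48, 50, 150, 100, 30, 30, 100, 11, 5, 3, 100, 100,
            48, 50, 150, 100, 30, 100, 100),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each id, the row number of the first row whose
# value and whose successor's value (within the same id) both exceed limit
first_run_rows <- function(d, limit = 50) {
  keep <- integer(0)
  for (g in unique(d$id)) {
    rows <- which(d$id == g)
    # seq_len() of 0 is empty, so a one-row group skips the inner loop
    for (k in seq_len(length(rows) - 1)) {
      if (d$value[rows[k]] > limit && d$value[rows[k + 1]] > limit) {
        keep <- c(keep, rows[k])
        break
      }
    }
  }
  keep
}

res <- DF[first_run_rows(DF), ]
res
```

Like the by() version, this assumes the rows are already in date order within each id, as stated in the question. It can serve as a cross-check of the vectorized answer on other data.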

R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Wed Aug 23 09:26:17 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.1.8, at Wed 23 Aug 2006 - 12:22:19 EST.
