[R] Selection on dataframe based on order of rows

From: Bonfigli Sandro <bonfigli_at_inmi.it>
Date: Wed 23 Aug 2006 - 04:15:42 EST


I have a dataframe with the following structure

id   date           value
1    22/08/2006     48
1    24/08/2006     50
1    28/08/2006     150
1    30/08/2006     100
1    01/09/2006     30
2    11/08/2006     30
2    22/08/2006     100
2    28/08/2006     11
2    02/09/2006     5
3    01/07/2006     3
3    01/08/2006     100
3    01/09/2006     100
4    22/08/2006     48
4    24/08/2006     50
4    28/08/2006     150
4    30/08/2006     100
4    01/09/2006     30
4    03/09/2006     100
4    06/09/2006     100


N.B.: dates are in European format (dd/mm/yyyy); the data frame is ordered by id and, within each id, by date.
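As a minimal sketch, the example data could be loaded and the dates parsed as follows. The data frame name `df` and the helper variable `ordered` are assumptions for illustration; they do not appear in the original post.

```r
# Example data copied from the post; `df` is a hypothetical name.
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
id date value
1 22/08/2006 48
1 24/08/2006 50
1 28/08/2006 150
1 30/08/2006 100
1 01/09/2006 30
2 11/08/2006 30
2 22/08/2006 100
2 28/08/2006 11
2 02/09/2006 5
3 01/07/2006 3
3 01/08/2006 100
3 01/09/2006 100
4 22/08/2006 48
4 24/08/2006 50
4 28/08/2006 150
4 30/08/2006 100
4 01/09/2006 30
4 03/09/2006 100
4 06/09/2006 100
")

# Parse the European dd/mm/yyyy strings into Date objects.
df$date <- as.Date(df$date, format = "%d/%m/%Y")

# Check that the rows are already sorted by id, then by date within id.
ordered <- identical(order(df$id, df$date), seq_len(nrow(df)))
```

Parsing with an explicit `format` avoids the dates being misread as month/day/year or compared as plain strings.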

For each id I need to select the first row that starts a run of at least two consecutive rows with "value" > 50.

That is a rather convoluted explanation. I mean that for each id I have to select the first row in which value is > 50, but only if at least the following row has "value" > 50 too. If that is not the case, I repeat the test on each of the following rows in which "value" > 50 until I find a record that satisfies the condition.

This means that with my example dataframe the result is:

id   date           value
1    28/08/2006     150
3    01/08/2006     100
4    28/08/2006     150
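The selection rule described above can be sketched without an explicit loop by comparing each row with the one that follows it. This is only one possible approach, assuming the rows are already ordered by id and date and that "following row" means the next row of the same id; the names `df`, `above`, `candidate` and `result` are hypothetical.

```r
# Example data copied from the post; `df` is a hypothetical name.
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
id date value
1 22/08/2006 48
1 24/08/2006 50
1 28/08/2006 150
1 30/08/2006 100
1 01/09/2006 30
2 11/08/2006 30
2 22/08/2006 100
2 28/08/2006 11
2 02/09/2006 5
3 01/07/2006 3
3 01/08/2006 100
3 01/09/2006 100
4 22/08/2006 48
4 24/08/2006 50
4 28/08/2006 150
4 30/08/2006 100
4 01/09/2006 30
4 03/09/2006 100
4 06/09/2006 100
")

n <- nrow(df)

# TRUE where the row's own value exceeds the threshold.
above <- df$value > 50

# TRUE where the next row exists and belongs to the same id.
next_same_id <- c(df$id[-1] == df$id[-n], FALSE)

# TRUE where the next row's value also exceeds the threshold.
next_above <- c(above[-1], FALSE)

# A candidate is a row above 50 whose successor (same id) is above 50 too.
candidate <- above & next_same_id & next_above

# Keep only the first candidate row of each id.
cand <- df[candidate, ]
result <- cand[!duplicated(cand$id), ]
```

On the example data, `result` holds the three rows listed above. An id with no qualifying pair (id 2 here) simply does not appear in `result`.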

It's clear that a for loop would work, but I think there is a better way.

I tried "by" and could obtain the first row for which "value" is > 50.
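The exact call is not shown in the post; one possible form of such a "by" attempt, given here only as an illustrative sketch with the hypothetical names `df` and `first_above`, is:

```r
# Example data copied from the post; `df` is a hypothetical name.
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
id date value
1 22/08/2006 48
1 24/08/2006 50
1 28/08/2006 150
1 30/08/2006 100
1 01/09/2006 30
2 11/08/2006 30
2 22/08/2006 100
2 28/08/2006 11
2 02/09/2006 5
3 01/07/2006 3
3 01/08/2006 100
3 01/09/2006 100
4 22/08/2006 48
4 24/08/2006 50
4 28/08/2006 150
4 30/08/2006 100
4 01/09/2006 30
4 03/09/2006 100
4 06/09/2006 100
")

# For each id, keep the first row whose value exceeds 50
# (head() returns zero rows when an id has no such row).
pieces <- by(df, df$id, function(d) head(d[d$value > 50, ], 1))

# Stack the per-id pieces back into a single data frame.
first_above <- do.call(rbind, pieces)
```

With the example data this returns one row per id, including id 2 (22/08/2006, value 100), whose following row is not > 50. That is why the first row > 50 alone does not answer the question.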

I thought of an iterative process (delete the first row > 50, find the second row > 50, examine whether there are rows in between), but it is quite inelegant: if the first value is not the "good" one, I have to repeat the process an a priori unknown number of times.

Thanks in advance for your help

  Sandro Bonfigli



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Wed Aug 23 04:03:44 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Wed 23 Aug 2006 - 10:21:40 EST.
