Re: [R] obtaining first and last record for rows with same identifier

From: Sean Davis <sdavis2_at_mail.nih.gov>
Date: Wed 25 May 2005 - 04:37:31 EST

If you have your data.frame ordered by the patid, you can use the function rle in combination with cumsum. As a vector example:

 > a <- rep(c('a','b','c'),10)
 > a
  [1] "a" "b" "c" "a" "b" "c" "a" "b" "c" "a" "b" "c" "a" "b" "c" "a" "b" "c" "a"
 [20] "b" "c" "a" "b" "c" "a" "b" "c" "a" "b" "c"
 > b <- a[order(a)]
 > b
  [1] "a" "a" "a" "a" "a" "a" "a" "a" "a" "a" "b" "b" "b" "b" "b" "b" "b" "b" "b"
 [20] "b" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c"
 > l <- rle(b)$lengths
 > cbind(l,cumsum(l),cumsum(l)-l+1)
       l
[1,] 10 10  1
[2,] 10 20 11
[3,] 10 30 21

# Use the line below to get the length of each block of the data frame, followed by its start and end indices:
 > cbind(l,cumsum(l)-l+1,cumsum(l))
       l
[1,] 10  1 10
[2,] 10 11 20
[3,] 10 21 30
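
To apply the same idea to a data frame like the one in the question, something along these lines should work. This is only a sketch: the data below are made up for illustration, the column names (patid, labdate, labvalue) are taken from the question, and the object names labs, len, first, last, first_rec and last_rec are my own. It assumes "first" and "last" mean earliest and latest labdate within each patid.

```r
# Made-up example data; column names follow the original question.
labs <- data.frame(
  patid    = c(3, 1, 2, 1, 3, 1),
  labdate  = as.Date(c("2005-01-10", "2005-01-05", "2005-02-01",
                       "2005-03-12", "2005-01-02", "2005-02-20")),
  labvalue = c(5.1, 4.2, 6.3, 4.8, 5.0, 4.5)
)

# rle() only finds consecutive runs, so sort by patient first
# (and by date within patient, so first/last are earliest/latest).
labs <- labs[order(labs$patid, labs$labdate), ]

# Run lengths per patient; as.character() guards against patid being a factor.
len   <- rle(as.character(labs$patid))$lengths
last  <- cumsum(len)       # row index of each patient's last record
first <- last - len + 1    # row index of each patient's first record

first_rec <- labs[first, ]
last_rec  <- labs[last, ]
```

A patient with only one lab record ends up with the same row in both first_rec and last_rec.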


Sean

On May 24, 2005, at 2:27 PM, sms13+@pitt.edu wrote:

> I have a dataframe that contains fields such as patid, labdate,
> labvalue.
> The same patid may show up in multiple rows because of lab
> measurements on multiple days. Is there a simple way to obtain just
> the first and last record for each patient, or do I need to write some
> code that performs that.
>
> Thanks,
> Steven
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html



Received on Wed May 25 04:40:47 2005

This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:32:01 EST