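I don't think you need a retain statement. With the data sorted and a
BY patientID statement in the data step,

if first.patientID ;

for the first record, or

if last.patientID ;

for the last one, ought to do it.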

I must admit that's actually a bit more concise than the Vilno version:

if ( not firstrow(patientID) ) deleterow ;

On 6/10/07, Peter Dalgaard wrote:
> Douglas Bates wrote:
> > Frank Harrell indicated that it is possible to do a lot of difficult
> > data transformation within R itself if you try hard enough but that
> > sometimes means working against the S language and its "whole object"
> > view to accomplish what you want and it can require knowledge of
> > subtle aspects of the S language.
> >
> Actually, I think Frank's point was subtly different: It is *because* of
> the differences in view that it sometimes seems difficult to find the
> way to do something in R that is apparently straightforward in SAS.
> I.e. the solutions exist and are often elegant, but may require some
> lateral thinking.


> Case in point: Finding the first or the last observation for each
> subject when there are multiple records for each subject. The SAS way
> would be a datastep with IF-THEN-DELETE, and a RETAIN statement so that
> you can compare the subject ID with the one from the previous record,
> working with data that are sorted appropriately.

> You can do the same thing in R with a for loop, but there are better
> ways e.g.
> subset(df, !duplicated(ID)) and subset(df, rev(!duplicated(rev(ID)))), or
> maybe
> do.call("rbind", lapply(split(df, df$ID), head, 1)), resp. tail. Or
> something involving aggregate(). (The latter approaches generalize
> better to other within-subject functionals like cumulative doses, etc.).
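
To make the quoted one-liners concrete, here is a minimal sketch on a
made-up data frame. The column names (ID, visit, dose) and the values are
my own assumptions for illustration, and I use the fromLast argument of
duplicated() in place of the double rev(), which should give the same rows.
The last line uses ave() as one possible way to get a within-subject
functional such as a cumulative dose.

```r
# Hypothetical toy data: several visits per subject, sorted by ID and visit
df <- data.frame(
  ID    = c(1, 1, 1, 2, 2, 3),
  visit = c(1, 2, 3, 1, 2, 1),
  dose  = c(10, 20, 30, 5, 5, 40)
)

# First record per subject: keep rows whose ID has not been seen before
first_rows <- df[!duplicated(df$ID), ]

# Last record per subject: look for duplicates scanning from the end
last_rows <- df[!duplicated(df$ID, fromLast = TRUE), ]

# split/lapply variant; tail() in place of head() would give the last rows
first_split <- do.call("rbind", lapply(split(df, df$ID), head, 1))

# A within-subject functional: running total of dose within each subject
df$cumdose <- ave(df$dose, df$ID, FUN = cumsum)
```

All of these assume the rows are already sorted within subject, just as
the first./last. approach does.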

> The hardest cases that I know of are the ones where you need to turn one
> record into many, such as occurs in survival analysis with
> time-dependent, piecewise constant covariates. This may require
> "transposing the problem", i.e. for each interval you find out which
> subjects contribute and with what, whereas the SAS way would be a
> within-subject loop over intervals containing an OUTPUT statement.
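
A rough sketch of what "transposing the problem" might look like in base
R, looping over intervals rather than over subjects. The data, the cut
points and all the names (subj, lower, upper, long) are assumptions of
mine, not anything from Peter's post, and a real analysis would also need
the covariate values attached to each piece.

```r
# Hypothetical data: one record per subject, follow-up time and event flag
subj <- data.frame(id = 1:3, time = c(2.5, 7, 4), event = c(1, 0, 1))

# Hypothetical interval boundaries: (0,3], (3,6], (6,Inf)
lower <- c(0, 3, 6)
upper <- c(3, 6, Inf)

# For each interval, find the subjects still under observation at its
# start and record what they contribute to it
pieces <- lapply(seq_along(lower), function(k) {
  at_risk <- subj[subj$time > lower[k], ]
  n <- nrow(at_risk)
  data.frame(
    id       = at_risk$id,
    interval = rep(k, n),
    tstart   = rep(lower[k], n),
    tstop    = pmin(at_risk$time, upper[k]),
    # the event is counted only in the interval where follow-up ends
    event    = at_risk$event * (at_risk$time <= upper[k])
  )
})

# Stack the per-interval pieces and put them back in subject order
long <- do.call("rbind", pieces)
long <- long[order(long$id, long$tstart), ]
```

Each subject ends up with one row per interval in which they were
observed, which is the "one record into many" result without any
within-subject loop.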

> Also, there are some really weird data formats, where e.g. the input
> format is different in different records. Back in the '80s, when
> punched-card input was still common, it was quite popular to have one
> card with background information on a patient plus several cards
> detailing visits, and you'd get a stack of cards containing both kinds.
> In R you would most likely split on the card type using grep() and then
> read the two kinds separately and merge() them later.
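
For what it's worth, the grep()/merge() idea could be sketched roughly as
follows. The card layout here is invented for illustration (a leading "P"
or "V" marks the card type, and the field names are my guesses at what
such cards might hold); with a real file the cards vector would come from
readLines().

```r
# Hypothetical stack of cards: "P" = patient background (id, sex, birth
# year), "V" = visit (id, day, weight), mixed together in one input
cards <- c(
  "P 101 F 1950",
  "V 101 1 62.5",
  "V 101 30 61.0",
  "P 102 M 1944",
  "V 102 1 80.2"
)

# Split on card type, then read each kind with its own format.
# colClasses is given explicitly so that "F" is not read as logical FALSE.
patients <- read.table(
  text       = grep("^P", cards, value = TRUE),
  col.names  = c("type", "id", "sex", "birthyear"),
  colClasses = c("character", "integer", "character", "integer")
)
visits <- read.table(
  text       = grep("^V", cards, value = TRUE),
  col.names  = c("type", "id", "day", "weight"),
  colClasses = c("character", "integer", "integer", "numeric")
)

# Drop the card-type column and attach the background data to every visit
merged <- merge(patients[, -1], visits[, -1], by = "id")
```

The result has one row per visit, with the patient's background fields
repeated on each.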
