Marc Schwartz wrote:
> on 06/13/2008 11:10 PM T.D.Rudolph wrote:

>> I have a dataframe, x, with over 60,000 rows that contains one Factor, "id",
>> with 27 levels. The dataframe contains numerous continuous values (along
>> column "diff") per day (column "date") for every level of id. I would like
>> to select only one row per animal per day, i.e. that containing the minimum
>> value of "diff", along the full length of 1:nrow(x). I am not yet able to
>> conduct anything beyond the simplest of functions and I was hoping someone
>> could suggest an effective way of producing this output.
>>
>> e.g. given this input:
>>
>> id  day         diff
>> 1  01-01-09  0.5
>> 1  01-01-09  0.7
>> 2  01-01-09  0.2
>> 2  01-01-09  0.4
>> 1  01-02-09  0.1
>> 1  01-02-09  0.3
>> 2  01-02-09  0.3
>> 2  01-02-09  0.4
>>
>> I would like to produce this output:
>> id day          diff
>> 1  01-01-09  0.5
>> 2  01-01-09  0.2
>> 1  01-02-09  0.1
>> 2  01-02-09  0.3
>>
>> It doesn't seem extremely difficult but I'm sure there are easier ways than
>> how I am currently approaching it!
>
> > DF
>   id      day diff
> 1  1 01-01-09  0.5
> 2  1 01-01-09  0.7
> 3  2 01-01-09  0.2
> 4  2 01-01-09  0.4
> 5  1 01-02-09  0.1
> 6  1 01-02-09  0.3
> 7  2 01-02-09  0.3
> 8  2 01-02-09  0.4
>
> > aggregate(DF$diff, list(id = DF$id, day = DF$day), min, na.rm = TRUE)
>   id      day   x
> 1  1 01-01-09 0.5
> 2  2 01-01-09 0.2
> 3  1 01-02-09 0.1
> 4  2 01-02-09 0.3
>
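
The aggregate() call above returns only the grouping columns and the minimum. If the whole row is wanted (for instance when the real data frame carries further columns), one possible alternative is to flag the rows that hold their group's minimum. This is only a sketch and not part of the quoted reply: it rebuilds the example data, assumes "diff" contains no NA values, and the names is_min and min_rows are made up for illustration.

```r
# Rebuild the example data from the original question
DF <- data.frame(
  id   = c(1, 1, 2, 2, 1, 1, 2, 2),
  day  = c("01-01-09", "01-01-09", "01-01-09", "01-01-09",
           "01-02-09", "01-02-09", "01-02-09", "01-02-09"),
  diff = c(0.5, 0.7, 0.2, 0.4, 0.1, 0.3, 0.3, 0.4)
)

# ave() returns, for every row, the minimum of diff within that row's
# id/day group, so the comparison is TRUE only for the minimal rows.
# Assumption: diff has no NAs (a group containing NA would yield NA here).
is_min <- DF$diff == ave(DF$diff, DF$id, DF$day, FUN = min)

# Keep the complete rows, with all their columns
min_rows <- DF[is_min, ]
print(min_rows)
```

If two rows in the same group tie for the minimum, both are kept, so a further step would be needed to guarantee exactly one row per animal per day.
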
> Note that I have not converted the 'day' column to a 'date' class. You
> would need to do that to perform any other date related operations
> (including chronological sorting) on that column. See ?as.Date for more
> information. For example:
>
> DF$day <- as.Date(DF$day, format = "%m-%d-%y")
>
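
To illustrate why the conversion matters for chronological sorting, here is a small sketch. The three day strings are hypothetical (chosen to span a year boundary, and assumed to be month-day-year as in the format string above), and the variable names are made up for the example.

```r
# Hypothetical day strings in month-day-year form, spanning a year boundary
day_chr <- c("01-01-09", "12-31-08", "01-02-09")

# Sorting the raw strings compares them character by character, so
# "12-31-08" lands last even though it is the earliest date
sorted_chr <- sort(day_chr)

# After conversion to Date class the ordering is chronological
day_date <- as.Date(day_chr, format = "%m-%d-%y")
sorted_dates <- sort(day_date)

print(sorted_chr)
print(sorted_dates)
```

The same idea applies to a data frame: once the column is of class Date, something like DF[order(DF$day, DF$id), ] should give the rows in date order.
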
