Re: [R] Subset by Factor by date

From: T.D.Rudolph <prairie.picker_at_gmail.com>
Date: Fri, 13 Jun 2008 22:25:54 -0700 (PDT)

aggregate() is indeed a useful function in this case, but it returns only the grouping columns and the aggregated value. Is there a way to use it while retaining all the other columns of the data frame?

For example, if the data frame has an additional column (superfluous here, yet pertinent later) containing arbitrary information, I would like that column retained in the final output.
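
One possible way to keep every column is to select whole rows rather than aggregate. The sketch below rebuilds the example data from the quoted message and adds a hypothetical extra column, "note", which is not in the original thread. It uses ave() to compute the minimum "diff" per id and day, aligned with the original rows, and then keeps the rows that equal that minimum. This is only one approach under those assumptions; merging the aggregate() result back onto the data with merge() is another common route.

```r
# Example data from the thread; 'note' is a hypothetical extra column
# added here only to show that it survives the selection.
DF <- data.frame(
  id   = factor(c(1, 1, 2, 2, 1, 1, 2, 2)),
  day  = rep(c("01-01-09", "01-02-09"), each = 4),
  diff = c(0.5, 0.7, 0.2, 0.4, 0.1, 0.3, 0.3, 0.4),
  note = letters[1:8],
  stringsAsFactors = FALSE
)

# ave() returns the group minimum repeated for every row of its group,
# so the result has the same length and order as DF$diff.
grp_min <- ave(DF$diff, DF$id, DF$day, FUN = min)

# Keep the rows whose 'diff' equals the minimum of their id/day group.
# All columns, including 'note', are retained.
res <- DF[DF$diff == grp_min, ]
res
```

If two rows in the same id/day group tie for the minimum, both are kept, and missing values in "diff" would need separate handling.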

Marc Schwartz wrote:
>
> on 06/13/2008 11:10 PM T.D.Rudolph wrote:

>> I have a dataframe, x, with over 60,000 rows that contains one Factor, "id",
>> with 27 levels. The dataframe contains numerous continuous values (along
>> column "diff") per day (column "date") for every level of id. I would like
>> to select only one row per animal per day, i.e. that containing the minimum
>> value of "diff", along the full length of 1:nrow(x). I am not yet able to
>> conduct anything beyond the simplest of functions and I was hoping someone
>> could suggest an effective way of producing this output.
>> 
>> e.g. given this input:
>> 
>> id  day         diff
>> 1  01-01-09  0.5
>> 1  01-01-09  0.7
>> 2  01-01-09  0.2
>> 2  01-01-09  0.4
>> 1  01-02-09  0.1
>> 1  01-02-09  0.3
>> 2  01-02-09  0.3
>> 2  01-02-09  0.4
>> 
>> I would like to produce this output:
>> id day          diff
>> 1  01-01-09  0.5
>> 2  01-01-09  0.2
>> 1  01-02-09  0.1
>> 2  01-02-09  0.3
>> 
>> It doesn't seem extremely difficult but I'm sure there are easier ways than
>> how I am currently approaching it!

>
> See ?aggregate
>
> > DF
>   id      day diff
> 1  1 01-01-09  0.5
> 2  1 01-01-09  0.7
> 3  2 01-01-09  0.2
> 4  2 01-01-09  0.4
> 5  1 01-02-09  0.1
> 6  1 01-02-09  0.3
> 7  2 01-02-09  0.3
> 8  2 01-02-09  0.4
>
>
> > aggregate(DF$diff, list(id = DF$id, day = DF$day), min, na.rm = TRUE)
>   id      day   x
> 1  1 01-01-09 0.5
> 2  2 01-01-09 0.2
> 3  1 01-02-09 0.1
> 4  2 01-02-09 0.3
>
>
> Note that I have not converted the 'day' column to the 'Date' class. You
> would need to do that to perform any other date related operations
> (including chronological sorting) on that column. See ?as.Date for more
> information. For example:
>
> DF$day <- as.Date(DF$day, format = "%m-%d-%y")
>
>
> HTH,
>
> Marc Schwartz
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
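
To illustrate the point in the quoted reply about converting "day" before sorting, here is a small sketch. The three date strings are made up for illustration, in the same month-day-year layout as the thread's data. Sorted as character strings they come out in the wrong chronological order across a year boundary; converted with as.Date() they sort correctly.

```r
# Hypothetical date strings in the thread's "%m-%d-%y" layout
day_chr <- c("01-02-09", "12-31-08", "01-01-09")

# Character sorting compares text, so December 2008 ends up last
chr_sorted <- sort(day_chr)

# Convert to the Date class, as suggested in the reply, then sort
day_date    <- as.Date(day_chr, format = "%m-%d-%y")
date_sorted <- sort(day_date)

chr_sorted
date_sorted
```

The same conversion applied to DF$day would let order(DF$day) arrange the rows chronologically.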
-- 
View this message in context: http://www.nabble.com/Subset-by-Factor-by-date-tp17835631p17836046.html
Sent from the R help mailing list archive at Nabble.com.

Received on Sat 14 Jun 2008 - 05:31:02 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sat 14 Jun 2008 - 06:30:41 GMT.