From: Jeff Newmiller <jdnewmil_at_dcn.davis.ca.us>

Date: Thu, 24 Mar 2011 07:52:47 -0700

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Thu 24 Mar 2011 - 14:55:46 GMT

On 03/24/2011 06:29 AM, Michael Bach wrote:

> Dear R users,
>
> Given this data:
>
> x<- seq(1,100,1)
> dx<- as.POSIXct(x*900, origin="2007-06-01 00:00:00")
> dfx<- data.frame(dx)
>
> Now to play around for example:
>
> subset(dfx, dx> as.POSIXct("2007-06-01 16:00:00"))
>
> Ok. Now for some reason I want to extract the datapoints between hours
> 10:00:00 and 14:00:00, so I thought well:
>
> subset(dfx, dx> as.POSIXct("2007-06-01 16:00:00"), 14> as.POSIXlt(dx)$hour
> & as.POSIXlt(dx)$hour< 10)
> Error in as.POSIXlt.numeric(dx) : 'origin' must be supplied
>
> Well that did not work. But why does the following work?
>
> 14> as.POSIXlt(dx)$hour& as.POSIXlt(dx)$hour< 10

It does work. Try it.

> Is there something I miss about subset()?

You have given three arguments to subset(). The third argument (select) chooses columns, so your hour test is a poor choice there. Both conditions belong in the second argument, joined with &. Try:

subset(dfx, dx > as.POSIXct("2007-06-01 16:00:00") & 14 > as.POSIXlt(dx)$hour & as.POSIXlt(dx)$hour < 10)

or better yet,

tmp <- as.POSIXlt(dfx$dx)

subset(dfx, dx > as.POSIXct("2007-06-01 16:00:00") & 14 > tmp$hour & tmp$hour < 10)

since as.POSIXlt is a rather heavyweight operation and this way it runs only once.
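
For readers following along, here is a minimal self-contained sketch of the selection discussed above. Two hedged observations motivate it. First, the help page for subset() says that names used in select are replaced by column numbers before evaluation, so dx in the third argument would stand for the number 1 rather than the date-time column, which would explain the 'origin' message. Second, the hour test as written in the thread (14 > hour & hour < 10) is true only when hour < 10; a window from 10:00 up to (not including) 14:00 would need hour >= 10 & hour < 14 instead. The tz = "UTC" argument and the names hr, as_written and sel are assumptions added for illustration and are not in the original post.

```r
# Same data as in the original post, with an explicit time zone (an
# assumption added here so the hours do not depend on the local setting).
x   <- seq(1, 100, 1)
dx  <- as.POSIXct(x * 900, origin = "2007-06-01 00:00:00", tz = "UTC")
dfx <- data.frame(dx)

# Convert to POSIXlt once and keep only the hour of day (0 to 23).
hr <- as.POSIXlt(dfx$dx)$hour

# The test as written in the thread: both clauses hold exactly when hour < 10.
as_written <- 14 > hr & hr < 10
identical(as_written, hr < 10)

# Intended window: 10:00 <= time of day < 14:00, as a single row condition
# in the second argument of subset().
sel <- subset(dfx, hr >= 10 & hr < 14)
range(sel$dx)
```

With this data, sel holds the 16 quarter-hour stamps from 10:00 through 13:45 on 2007-06-01.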

> Or is there even another way of
> aggregating over an hourly time interval in a nicer way?

This is not aggregation. This is selection. It is only when you summarize the selected data that you are aggregating.

Normally, the term aggregating is applied when you use a grouping column and collapse many values with the same characteristics into one value per set of characteristics. For example using base functions,

dfx$interval <- cut(tmp$hour, c(-1, 10, 14, 24))
aggregate(dfx$dx, list(Interval = dfx$interval), length)

or

aggregate(dfx$dx, list(Hour = tmp$hour), length)

but I find that the plyr package is much more user-friendly than aggregate.
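
To make the distinction between selection and aggregation concrete, here is a self-contained base R sketch of the grouping step. It is only one possible variant under stated assumptions: the tz = "UTC" argument, the left-closed breaks (right = FALSE, so that 10:00 falls inside the 10 to 14 group), the labels, and the names hr, counts and by_hour are all choices made for illustration and are not in the original post.

```r
# Same data as in the original post, with an explicit time zone (assumed).
x   <- seq(1, 100, 1)
dx  <- as.POSIXct(x * 900, origin = "2007-06-01 00:00:00", tz = "UTC")
dfx <- data.frame(dx)
hr  <- as.POSIXlt(dfx$dx)$hour

# Grouping column: three left-closed hour bands [0,10), [10,14), [14,24).
dfx$interval <- cut(hr, breaks = c(0, 10, 14, 24), right = FALSE,
                    labels = c("before 10", "10 to 14", "14 and later"))

# Aggregation: collapse the rows of each band into one value (a row count).
counts <- aggregate(dfx$dx, list(Interval = dfx$interval), length)
counts

# The same idea with the hour of day itself as the grouping variable.
by_hour <- aggregate(dfx$dx, list(Hour = hr), length)
head(by_hour)
```

Each row of counts corresponds to one band and each row of by_hour to one hour of day; the x column holds the number of time stamps that fell into it.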

> Best Regards,
> Michael Bach

