From: David Winsemius <dwinsemius_at_comcast.net>

Date: Thu, 24 Mar 2011 11:28:45 -0400

R-help_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Thu 24 Mar 2011 - 15:36:19 GMT

On Mar 24, 2011, at 10:02 AM, Michael Bach wrote:

> Hi David,

>
> your approach selects the datapoints between "2007-06-01 10:00:00"
> and "2007-06-01 14:00:00" true enough. However, my real dataset is
> for several months and years. So I need data points between 10:00:00
> and 14:00:00 only - independent of the date. I thought the name for
> that is "aggregating". Hope my aim is clearer now.
>
> In fact, on top of that I would like to easily specify ranges of
> days, weeks and months as subintervals to aggregate on. Say e.g.
> every march to may or every first ten days of every month. Are there
> already helper functions for this?
>
> Thanks for your help David.
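For the calendar-based ranges asked about above (every March to May, the first ten days of every month), the components of a POSIXlt object may already be enough. The following is only a sketch: the daily series, the explicit tz = "UTC", and the names days, lt, march_to_may and first_ten are assumptions made for illustration and do not come from the thread.

```r
# Assumed example data: one time stamp per day across 2007 and 2008,
# pinned to UTC so the result does not depend on the local time zone.
days <- seq(as.POSIXct("2007-01-01", tz = "UTC"),
            as.POSIXct("2008-12-31", tz = "UTC"), by = "day")
lt <- as.POSIXlt(days, tz = "UTC")

# $mon counts months from 0 (January), so March to May is 2:4.
march_to_may <- lt$mon %in% 2:4

# $mday is the day of the month, 1 to 31.
first_ten <- lt$mday <= 10

spring <- days[march_to_may]  # every March to May, in any year
early  <- days[first_ten]     # first ten days of every month
```

The same logical vectors can be used as the condition in subset() or combined with & to intersect several criteria.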

I think you were on the path to success but stumbled on the direction of your comparisons (as well as the construction of the arguments, as pointed out by Konstabel). Note that the first clause excludes no values once the second is taken into account, so the condition reduces to hour < 10:

14 > as.POSIXlt(dx)$hour & as.POSIXlt(dx)$hour < 10

Try:

> subset(dfx, as.POSIXlt(dx)$hour < 14 & as.POSIXlt(dx)$hour >= 10)

                    dx
40 2007-06-01 10:00:00
41 2007-06-01 10:15:00
42 2007-06-01 10:30:00
43 2007-06-01 10:45:00
44 2007-06-01 11:00:00
45 2007-06-01 11:15:00
46 2007-06-01 11:30:00
47 2007-06-01 11:45:00
48 2007-06-01 12:00:00
49 2007-06-01 12:15:00
50 2007-06-01 12:30:00
51 2007-06-01 12:45:00
52 2007-06-01 13:00:00
53 2007-06-01 13:15:00
54 2007-06-01 13:30:00
55 2007-06-01 13:45:00
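To see the difference between the two conditions on data spanning several days, here is a small sketch. The three-day series, the explicit tz = "UTC", and the names dfx2, hr, wrong, right and midday are assumptions for illustration only, not the data from the thread.

```r
# Assumed example data: 15-minute steps over three days, pinned to UTC
# so the extracted hours do not depend on the local time zone.
dx <- seq(as.POSIXct("2007-06-01 00:00:00", tz = "UTC"),
          by = "15 min", length.out = 3 * 96)
dfx2 <- data.frame(dx)

# Hour of day (0 to 23) for every time stamp, whatever the date.
hr <- as.POSIXlt(dfx2$dx, tz = "UTC")$hour

# Original condition: "hour < 14 AND hour < 10" reduces to "hour < 10".
wrong <- 14 > hr & hr < 10

# Corrected condition: from 10:00:00 up to, but excluding, 14:00:00.
right <- hr >= 10 & hr < 14

# Rows between 10:00 and 14:00 on each of the three days.
midday <- dfx2[right, , drop = FALSE]
```

Because the test uses only the hour component, the same rows are picked on every date in the series.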

> On Thu, Mar 24, 2011 at 3:44 PM, David Winsemius <dwinsemius@comcast.net> wrote:
>
> On Mar 24, 2011, at 9:29 AM, Michael Bach wrote:
>
> Dear R users,
>
> Given this data:
>
> x <- seq(1,100,1)
> dx <- as.POSIXct(x*900, origin="2007-06-01 00:00:00")
> dfx <- data.frame(dx)
>
> Now to play around for example:
>
> subset(dfx, dx > as.POSIXct("2007-06-01 16:00:00"))
>
> Ok. Now for some reason I want to extract the datapoints between hours
> 10:00:00 and 14:00:00, so I thought well:
>
> subset(dfx, dx > as.POSIXct("2007-06-01 16:00:00"), 14 >
> as.POSIXlt(dx)$hour
> & as.POSIXlt(dx)$hour < 10)
> Error in as.POSIXlt.numeric(dx) : 'origin' must be supplied
>
> Well that did not work. But why does the following work?
>
> 14 > as.POSIXlt(dx)$hour & as.POSIXlt(dx)$hour < 10
>
> Is there something I miss about subset()? Or is there even another
> way of aggregating over an hourly time interval in a nicer way?
>
> I'm not sure what problem is occurring with your method. The way I
> would have done it worked. The findInterval function also seemed to
> allow classification by intervals of 3600 seconds:
>
> > subset(dfx, dx > as.POSIXct("2007-06-01 10:00:00") & dx <
> as.POSIXct("2007-06-01 14:00:00"))
> dx
> 41 2007-06-01 10:15:00
> 42 2007-06-01 10:30:00
> 43 2007-06-01 10:45:00
> 44 2007-06-01 11:00:00
> 45 2007-06-01 11:15:00
> 46 2007-06-01 11:30:00
> 47 2007-06-01 11:45:00
> 48 2007-06-01 12:00:00
> 49 2007-06-01 12:15:00
> 50 2007-06-01 12:30:00
> 51 2007-06-01 12:45:00
> 52 2007-06-01 13:00:00
> 53 2007-06-01 13:15:00
> 54 2007-06-01 13:30:00
> 55 2007-06-01 13:45:00
>
> > findInterval(dfx$dx, c( as.numeric(range(dfx$dx)[1]
> +(1:24)*3600) ) )
> [1] 0 0 0 0 1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4 5 5 5 5 6 6 6 6 7
> [30] 7 7 7 8 8 8 8 9 9 9 9 10 10 10 10 11 11 11 11 12 12 12 12 13 13 13 13 14 14
> [59] 14 14 15 15 15 15 16 16 16 16 17 17 17 17 18 18 18 18 19 19 19 19 20 20 20 20 21 21 21
> [88] 21 22 22 22 22 23 23 23 23 24 24 24 24
>
> Best Regards,
> Michael Bach
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> West Hartford, CT
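The hourly classification via findInterval in the quoted reply counts hours from the start of the series. If the classes should instead be hours of the day, pooled over all dates, one possible sketch follows. The two-day series, the made-up column value, the explicit tz = "UTC", and the names dfx3, secs, hour_fi and hourly_mean are all assumptions for illustration.

```r
# Assumed example data: 15-minute steps over two days (UTC), with a
# made-up numeric column 'value' to aggregate.
dx <- seq(as.POSIXct("2007-06-01 00:00:00", tz = "UTC"),
          by = "15 min", length.out = 2 * 96)
dfx3 <- data.frame(dx, value = seq_along(dx))

# Hour of day, 0 to 23, regardless of the date.
dfx3$hour <- as.POSIXlt(dfx3$dx, tz = "UTC")$hour

# findInterval yields the same classes from seconds since midnight (UTC).
secs <- as.numeric(dfx3$dx) %% 86400
hour_fi <- findInterval(secs, (0:23) * 3600) - 1

# Mean of 'value' per hour of day, pooled over both dates.
hourly_mean <- aggregate(value ~ hour, data = dfx3, FUN = mean)
```

The modulo step assumes the time stamps are meant to be read in UTC; for a local time zone the POSIXlt $hour route is the safer of the two.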

David Winsemius, MD

West Hartford, CT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.2.0, at Thu 24 Mar 2011 - 15:40:25 GMT.
