Re: [R] subset and as.POSIXct / as.POSIXlt oddness

From: Michael Bach <phaebz_at_gmail.com>
Date: Thu, 24 Mar 2011 16:02:18 +0200

Hi David,

Your approach does select the data points between "2007-06-01 10:00:00" and "2007-06-01 14:00:00". However, my real dataset spans several months and years, so I need the data points between 10:00:00 and 14:00:00 on every day, independent of the date. I thought the name for that was "aggregating". I hope my aim is clearer now.
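
To make the aim concrete, here is a rough sketch of the kind of selection I mean, using the example data from my first post. It is only a sketch under assumptions: `tz = "UTC"` is added so the hours do not depend on the local time zone, the interval is taken as half-open (10:00:00 up to but not including 14:00:00), and the names `hr`, `between_10_14` and `same_via_subset` are made up for illustration.

```r
# Example data from the original post: 100 points at 15-minute steps.
# tz = "UTC" is an assumption, added to make the result reproducible.
x   <- seq(1, 100, 1)
dx  <- as.POSIXct(x * 900, origin = "2007-06-01 00:00:00", tz = "UTC")
dfx <- data.frame(dx)

# Hour of day (0-23) of each time stamp, independent of the date.
hr <- as.POSIXlt(dfx$dx)$hour

# Keep rows whose time of day is >= 10:00:00 and < 14:00:00, on any date.
between_10_14 <- dfx[hr >= 10 & hr < 14, , drop = FALSE]

# The same condition written as the single `subset` argument of subset().
# Both comparisons are joined with `&`; a third positional argument to
# subset() would be taken as `select` (column selection), not as a filter.
same_via_subset <- subset(dfx, as.POSIXlt(dx)$hour >= 10 &
                               as.POSIXlt(dx)$hour < 14)
```

On a dataset covering several days this should return the 10:00 to 13:45 rows of every day, since only the hour field is compared.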

On top of that, I would like an easy way to specify ranges of days, weeks and months as subintervals to aggregate on, e.g. every March to May, or the first ten days of every month. Are there already helper functions for this?
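
Something along these lines is what I have in mind for the month and day-of-month ranges. It is a sketch only: the daily series `ts` is hypothetical stand-in data (not my real dataset), `tz = "UTC"` is assumed, and the names `lt`, `march_to_may` and `first_ten_days` are invented for the example. It relies on the `mon` and `mday` fields of `POSIXlt`.

```r
# Hypothetical multi-year series: one point per day at noon, 2007-2008.
ts <- seq(as.POSIXct("2007-01-01 12:00:00", tz = "UTC"),
          as.POSIXct("2008-12-31 12:00:00", tz = "UTC"),
          by = "day")
lt <- as.POSIXlt(ts)

# POSIXlt fields: $mon runs 0-11 (0 = January), $mday runs 1-31.
march_to_may   <- ts[lt$mon >= 2 & lt$mon <= 4]  # March, April, May of every year
first_ten_days <- ts[lt$mday <= 10]              # days 1-10 of every month
```

The same logical vectors could be combined with an hour-of-day condition using `&`, or used as grouping factors for aggregation.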

Thanks for your help David.

On Thu, Mar 24, 2011 at 3:44 PM, David Winsemius <dwinsemius_at_comcast.net> wrote:

>
> On Mar 24, 2011, at 9:29 AM, Michael Bach wrote:
>
>> Dear R users,
>>
>> Given this data:
>>
>> x <- seq(1,100,1)
>> dx <- as.POSIXct(x*900, origin="2007-06-01 00:00:00")
>> dfx <- data.frame(dx)
>>
>> Now to play around for example:
>>
>> subset(dfx, dx > as.POSIXct("2007-06-01 16:00:00"))
>>
>> Ok. Now for some reason I want to extract the datapoints between hours
>> 10:00:00 and 14:00:00, so I thought well:
>>
>> subset(dfx, dx > as.POSIXct("2007-06-01 16:00:00"),
>>        14 > as.POSIXlt(dx)$hour & as.POSIXlt(dx)$hour < 10)
>> Error in as.POSIXlt.numeric(dx) : 'origin' must be supplied
>>
>> Well that did not work. But why does the following work?
>>
>> 14 > as.POSIXlt(dx)$hour & as.POSIXlt(dx)$hour < 10
>>
>> Is there something I miss about subset()? Or is there even another way of
>> aggregating over an hourly time interval in a nicer way?
>>
>
> I'm not sure what problem is occurring with your method. The way I would
> have done it worked. The findInterval function also seemed to allow
> classification by intervals of 3600 seconds:
>
> > subset(dfx, dx > as.POSIXct("2007-06-01 10:00:00") & dx <
> as.POSIXct("2007-06-01 14:00:00"))
> dx
> 41 2007-06-01 10:15:00
> 42 2007-06-01 10:30:00
> 43 2007-06-01 10:45:00
> 44 2007-06-01 11:00:00
> 45 2007-06-01 11:15:00
> 46 2007-06-01 11:30:00
> 47 2007-06-01 11:45:00
> 48 2007-06-01 12:00:00
> 49 2007-06-01 12:15:00
> 50 2007-06-01 12:30:00
> 51 2007-06-01 12:45:00
> 52 2007-06-01 13:00:00
> 53 2007-06-01 13:15:00
> 54 2007-06-01 13:30:00
> 55 2007-06-01 13:45:00
>
> > findInterval(dfx$dx, c( as.numeric(range(dfx$dx)[1] +(1:24)*3600) ) )
>  [1]  0  0  0  0  1  1  1  1  2  2  2  2  3  3  3  3  4  4  4  4  5  5  5  5  6  6  6  6  7
> [30]  7  7  7  8  8  8  8  9  9  9  9 10 10 10 10 11 11 11 11 12 12 12 12 13 13 13 13 14 14
> [59] 14 14 15 15 15 15 16 16 16 16 17 17 17 17 18 18 18 18 19 19 19 19 20 20 20 20 21 21 21
> [88] 21 22 22 22 22 23 23 23 23 24 24 24 24
>
>
>> Best Regards,
>> Michael Bach
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help_at_r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> David Winsemius, MD
> West Hartford, CT
>
>


Received on Thu 24 Mar 2011 - 14:05:47 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 24 Mar 2011 - 16:00:23 GMT.
