Re: [R] Looking for the first observation within the month

From: jim holtman <jholtman_at_gmail.com>
Date: Sun, 27 May 2007 07:28:30 -0400

Here is one way of doing it:

> x <- "Date Observation
+ 2007-05-23              20
+ 2007-05-22              30
+ 2007-05-21              10
+ 2007-04-10              50
+ 2007-04-09              40
+ 2007-04-07              30
+ 2007-03-05              10"

> x <- read.table(textConnection(x), header=TRUE,
+ colClasses=c("POSIXct", "integer"))

> # split the data by year-month and find the minimum day
> minDay <- lapply(split(x, cut(x$Date, breaks='month')), function(month){
+     month[which.min(month$Date),]  # minimum date in the month
+ })
> do.call('rbind', minDay) # put it back together in a dataframe
                 Date Observation
2007-03-01 2007-03-05          10
2007-04-01 2007-04-07          30
2007-05-01 2007-05-21          10

>
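
An alternative sketch, not part of the session above: sort the rows by date and keep the first row of each year-month with duplicated(). It assumes Date is stored as class "Date" rather than POSIXct, and the names obs, ym and first are made up for illustration.

```r
# Rebuild the example data. Date is stored as class "Date" here
# (an assumption; the session above reads it as POSIXct).
obs <- data.frame(
  Date = as.Date(c("2007-05-23", "2007-05-22", "2007-05-21",
                   "2007-04-10", "2007-04-09", "2007-04-07",
                   "2007-03-05")),
  Observation = c(20L, 30L, 10L, 50L, 40L, 30L, 10L)
)

# Sort chronologically so the earliest date of each month comes first.
obs <- obs[order(obs$Date), ]

# Label each row with its year-month, then keep the first row per label.
ym <- format(obs$Date, "%Y-%m")
first <- obs[!duplicated(ym), ]

first               # one row per month: the earliest date in that month
first$Observation   # just the observations, as a plain vector
```

This avoids the split/rbind round trip, at the cost of reordering the rows chronologically.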

On 5/27/07, Albert Pang <albert.pang_at_mac.com> wrote:
>
> Hi all, I have a simple data frame: the first column is a list of dates
> (in "%Y-%m-%d" format) and the second is an observation on that
> particular date. There might not be observations every day. Let's just
> say there are no observations on Saturdays and Sundays. Now I want to
> select the first observation of every month into a list. Is there an
> easy way to do that?
>
> Date Observation
> ---- -----------
> 2007-05-23 20
> 2007-05-22 30
> 2007-05-21 10
>
> 2007-04-10 50
> 2007-04-09 40
> 2007-04-07 30
>
> 2007-03-05 10
>
> The result I need is the data frame
>
> 2007-05-21 10
> 2007-04-07 30
> 2007-03-05 10
>
> or I am equally happy with just the vector c(10, 30, 10)
>
> I am new to R and after going through the manuals and the
> documentation I can gather, I have come up with a convoluted way of
> doing it
>
> 1) I first get the Date into a vector. (I am artificially
> reproducing this vector below and calling it A)
>
> > A<-c( as.Date("2007-05-23"), as.Date("2007-05-22"),
> as.Date("2007-05-21"), as.Date("2007-04-10"), as.Date("2007-04-09"),
> as.Date("2007-04-07"), as.Date("2007-03-05"))
> > A
> [1] "2007-05-23" "2007-05-22" "2007-05-21" "2007-04-10" "2007-04-09"
> [6] "2007-04-07" "2007-03-05"
>
>
> 2) use cut with breaks falling on the months
>
> > B<-cut(A, breaks="month")
> > B
> [1] 2007-05-01 2007-05-01 2007-05-01 2007-04-01 2007-04-01 2007-04-01
> [7] 2007-03-01
> Levels: 2007-03-01 2007-04-01 2007-05-01
>
>
> 3) then split to get a list of vectors grouped by the month
> boundaries
>
> > C<-split(A, B)
> > C
> $`2007-03-01`
> [1] "2007-03-05"
>
> $`2007-04-01`
> [1] "2007-04-10" "2007-04-09" "2007-04-07"
>
> $`2007-05-01`
> [1] "2007-05-23" "2007-05-22" "2007-05-21"
>
>
> 4) in a for loop I loop through the elements of the list (the
> elements are vectors of dates); for each vector I find the minimum
> and concatenate it to a final vector D
>
> > D<-numeric()
> > for ( i in 1:length(C)){ D <- c( D, min(C[[i]]))}
> > class(D)<-"Date"
> > D
> [1] "2007-03-05" "2007-04-07" "2007-05-21"
>
> Next, with D, I go back and find the positions of the elements of D
> within A, and then use the result as an index vector into the vector
> of observations (which is not shown here). I feel sure I am doing it
> the stupid way (or the procedural way).
>
> Is there a more declarative way of doing it? Any pointers will be
> greatly appreciated!
>
> Thanks a lot in advance,
>
> Albert Pang
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help_at_stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
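
For the quoted step-by-step approach, the remaining lookup (positions of D within A, then indexing the observations) could be sketched with match(). This is only an illustration: obsValues and idx are hypothetical names, since the observation vector is not shown in the message, and it assumes each date occurs only once in A.

```r
# Dates from the quoted message, with the matching observations.
A <- as.Date(c("2007-05-23", "2007-05-22", "2007-05-21",
               "2007-04-10", "2007-04-09", "2007-04-07",
               "2007-03-05"))
obsValues <- c(20, 30, 10, 50, 40, 30, 10)  # hypothetical name for the observations

# Steps 2 to 4 without the explicit for loop: group by month, take each minimum.
B <- cut(A, breaks = "month")
D <- do.call(c, lapply(split(A, B), min))   # c() on Dates keeps the Date class

# Final step: positions of the first dates within A, used as an index vector.
idx <- match(D, A)
obsValues[idx]
```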

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

Received on Sun 27 May 2007 - 11:35:31 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sun 27 May 2007 - 15:30:59 GMT.
