[R] Looking for the first observation within the month

From: Albert Pang <albert.pang_at_mac.com>
Date: Sun, 27 May 2007 18:40:07 +0800


Hi all, I have a simple data frame: the first column is a list of dates (in "%Y-%m-%d" format) and the second column is an observation on that particular date. There might not be observations every day. Let's just say there are no observations on Saturdays and Sundays. Now I want to select the first observation of every month (that is, the one with the earliest date in that month) into a list. Is there an easy way to do that?

Date			Observation
----			-----------
2007-05-23		20
2007-05-22		30
2007-05-21		10

2007-04-10		50
2007-04-09		40
2007-04-07		30

2007-03-05		10

The result I need is the data frame

2007-05-21		10
2007-04-07		30
2007-03-05		10

or I am equally happy with just the vector c(10, 30, 10)
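For anyone who wants to reproduce the example, here is a minimal sketch that builds the data frame shown above. The name `df` is hypothetical (the post does not name the data frame); the column names are taken from the table header.

```r
# Hypothetical name `df`; values copied from the table in the post.
df <- data.frame(
  Date = as.Date(c("2007-05-23", "2007-05-22", "2007-05-21",
                   "2007-04-10", "2007-04-09", "2007-04-07",
                   "2007-03-05")),
  Observation = c(20, 30, 10, 50, 40, 30, 10)
)

# Inspect the structure: Date should be of class "Date".
str(df)
```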

I am new to R and, after going through the manuals and the documentation I could gather, I have come up with a convoluted way of doing it:

1) I first get the dates into a vector. (I am artificially reproducing this vector below and calling it A.)

> A <- as.Date(c("2007-05-23", "2007-05-22", "2007-05-21", "2007-04-10",
+                "2007-04-09", "2007-04-07", "2007-03-05"))
> A

[1] "2007-05-23" "2007-05-22" "2007-05-21" "2007-04-10" "2007-04-09"
[6] "2007-04-07" "2007-03-05"

2) use cut with breaks falling on the months

> B<-cut(A, breaks="month")
> B

[1] 2007-05-01 2007-05-01 2007-05-01 2007-04-01 2007-04-01 2007-04-01
[7] 2007-03-01
Levels: 2007-03-01 2007-04-01 2007-05-01

3) then split to get a list of vectors grouped by month

> C<-split(A, B)
> C

$`2007-03-01`
[1] "2007-03-05"

$`2007-04-01`
[1] "2007-04-10" "2007-04-09" "2007-04-07"

$`2007-05-01`
[1] "2007-05-23" "2007-05-22" "2007-05-21"

4) in a for loop I loop through the elements of the list (the elements are vectors of dates); for each vector I find the minimum and concatenate it to a final vector D

> D<-numeric()
> for ( i in 1:length(C)){ D <- c( D, min(C[[i]]))}
> class(D)<-"Date"
> D

[1] "2007-03-05" "2007-04-07" "2007-05-21"

Next, with D, I go back and find the positions of the elements of D within A, and then use the result as an index vector into the vector of observations (which is not shown here). I feel sure I am doing it the stupid way (or the procedural way).
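The last step described above (looking up the positions of D within A and indexing the observations) could be sketched with `match()`. The vector name `obs` is an assumption, since the post does not show the observation vector; its values are taken from the table at the top, in the same order as A.

```r
# Dates as in the post, and the first-of-month dates found in step 4.
A <- as.Date(c("2007-05-23", "2007-05-22", "2007-05-21", "2007-04-10",
               "2007-04-09", "2007-04-07", "2007-03-05"))
D <- as.Date(c("2007-03-05", "2007-04-07", "2007-05-21"))

# Assumed name: observations from the table, in the same order as A.
obs <- c(20, 30, 10, 50, 40, 30, 10)

# match() returns, for each element of D, its position in A.
idx <- match(D, A)

# Use those positions as an index into the observations.
obs[idx]
```

Note that the result follows the order of D (March, April, May), which is the reverse of the row order in the original table.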

Is there a more declarative way of doing it? Any pointers will be greatly appreciated!
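One possible more declarative route, offered only as a sketch and not as the list's answer: sort the rows by date, label each row with its year and month, and keep the first row seen for each label. The names `df`, `sorted`, and `first_obs` are hypothetical; the data are the example rows from the post.

```r
# Hypothetical data frame built from the example in the post.
df <- data.frame(
  Date = as.Date(c("2007-05-23", "2007-05-22", "2007-05-21",
                   "2007-04-10", "2007-04-09", "2007-04-07",
                   "2007-03-05")),
  Observation = c(20, 30, 10, 50, 40, 30, 10)
)

# Sort ascending so the earliest date of each month comes first.
sorted <- df[order(df$Date), ]

# Label each row by year-month and drop every row whose label
# has already been seen, leaving the first row per month.
first_obs <- sorted[!duplicated(format(sorted$Date, "%Y-%m")), ]

first_obs
first_obs$Observation
```

This avoids the explicit loop and the lookup back into A, at the cost of sorting the whole data frame first.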

Thanks a lot in advance,

Albert Pang




R-help_at_stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Sun 27 May 2007 - 10:49:32 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sun 27 May 2007 - 15:30:59 GMT.