# Re: [R] Looking for the first observation within the month

From: Albert Pang <albert.pang_at_mac.com>
Date: Sun, 27 May 2007 22:58:52 +0800

I have only just been able to dissect Jim's solution, and I realize I was actually not very far from the answer. The one step I was missing was to use "lapply". Jim, thanks again for the help.
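
Jim's solution is not quoted in this message, so the following is only a minimal sketch of how the lapply step might look, using the sample data from the original question below. The names df, by_month and first_rows are illustrative assumptions, not taken from the thread.

```r
# Sample data copied from the original question quoted below.
dates <- as.Date(c("2007-05-23", "2007-05-22", "2007-05-21", "2007-04-10",
                   "2007-04-09", "2007-04-07", "2007-03-05"))
df <- data.frame(Date = dates, Observation = c(20, 30, 10, 50, 40, 30, 10))

# Split the rows by calendar month, then keep the earliest row of each group.
by_month <- split(df, cut(df$Date, breaks = "month"))
first_rows <- lapply(by_month, function(g) g[which.min(as.numeric(g$Date)), ])

# Stack the one-row data frames back together, newest month first.
result <- do.call(rbind, first_rows)
result <- result[order(result$Date, decreasing = TRUE), ]
print(result)
```

Here lapply replaces the explicit for loop: it applies the same "earliest row" function to every monthly group and returns the pieces as a list.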

Gabor, thanks for the suggestion. Let me read up on what the zoo package is about. Thanks a lot for the pointer!

Albert

On May 27, 2007, at 10:48 PM, Gabor Grothendieck wrote:

> Use the zoo package to represent data like this.
> Here time(z) is a vector of the dates and as.yearmon(time(z))
> is the year/month of each date. With FUN=head1, ave picks out the
> first date in any month and aggregate then aggregates over all
> values in the same year/month choosing the first one.
> library(zoo)
> For more on zoo try:
> library(zoo)
> vignette("zoo")
> and also read the Help Desk article in R News 4/1 about dates.
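
The code from the original reply did not survive in this archive copy. As a rough sketch of the aggregate approach described above, assuming the zoo package is installed and using the sample data from the question below, something like the following would pick the first observation in each year/month. The object names z, first_obs and first_dates are illustrative, and head with n = 1 is an assumed stand-in for the head1 function mentioned above.

```r
library(zoo)  # assumes the zoo package is installed

# Sample data copied from the question quoted below.
dates <- as.Date(c("2007-05-23", "2007-05-22", "2007-05-21", "2007-04-10",
                   "2007-04-09", "2007-04-07", "2007-03-05"))
obs <- c(20, 30, 10, 50, 40, 30, 10)

# A zoo series keeps the observations ordered by date.
z <- zoo(obs, order.by = dates)

# Group by year/month and keep the first (earliest) value in each group.
# head(x, 1) is an assumed stand-in for "head1".
first_obs <- aggregate(z, as.yearmon, head, 1)
print(first_obs)

# The matching dates: the first date seen in each year/month.
ym <- format(as.yearmon(time(z)))
first_dates <- time(z)[!duplicated(ym)]
print(first_dates)
```

Because a zoo series is always sorted by its index, "first in the group" and "earliest date in the month" coincide, which is why no explicit min is needed.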
> On 5/27/07, Albert Pang <albert.pang_at_mac.com> wrote:
>> Hi all, I have a simple data frame: the first column is a list of dates (in
>> "%Y-%m-%d" format) and the second column is an observation on that particular
>> date. There might not be observations every day. Let's just say
>> there are no observations on Saturdays and Sundays. Now I want to
>> select the first observation of every month into a list. Is there an
>> easy way to do that?
>>
>> Date        Observation
>> ----------  -----------
>> 2007-05-23  20
>> 2007-05-22  30
>> 2007-05-21  10
>> 2007-04-10  50
>> 2007-04-09  40
>> 2007-04-07  30
>> 2007-03-05  10
>>
>> The result I need is the data frame
>>
>> 2007-05-21  10
>> 2007-04-07  30
>> 2007-03-05  10
>>
>> or I am equally happy with just the vector c(10, 30, 10)
>>
>> I am new to R and after going through the manuals and the
>> documentation I can gather, I have come up with a convoluted way of
>> doing it
>>
>> 1) I first get the Date into a vector. (I am artificially
>> reproducing this vector below and calling it A)
>>
>> > A <- c(as.Date("2007-05-23"), as.Date("2007-05-22"), as.Date("2007-05-21"),
>> +        as.Date("2007-04-10"), as.Date("2007-04-09"), as.Date("2007-04-07"),
>> +        as.Date("2007-03-05"))
>> > A
>> [1] "2007-05-23" "2007-05-22" "2007-05-21" "2007-04-10" "2007-04-09"
>> [6] "2007-04-07" "2007-03-05"
>>
>> 2) use cut with breaks falling on the months
>>
>> > B<-cut(A, breaks="month")
>> > B
>> [1] 2007-05-01 2007-05-01 2007-05-01 2007-04-01 2007-04-01 2007-04-01
>> [7] 2007-03-01
>> Levels: 2007-03-01 2007-04-01 2007-05-01
>>
>> 3) then split to get a list of vectors grouped by the month boundary of the
>> date
>>
>> > C<-split(A, B)
>> > C
>> $`2007-03-01`
>> [1] "2007-03-05"
>>
>> $`2007-04-01`
>> [1] "2007-04-10" "2007-04-09" "2007-04-07"
>>
>> $`2007-05-01`
>> [1] "2007-05-23" "2007-05-22" "2007-05-21"
>>
>> 4) in a for loop I loop through the elements within the list (the
>> elements are vectors of dates); with each vector I find the minimum
>> and concatenate it to a final vector D
>>
>> > D<-numeric()
>> > for ( i in 1:length(C)){ D <- c( D, min(C[[i]]))}
>> > class(D)<-"Date"
>> > D
>> [1] "2007-03-05" "2007-04-07" "2007-05-21"
>>
>> Next, with D, I go back and find the positions of the
>> elements of D within A, and then use the result as an index vector
>> into the vector of observations (which is not shown here). I feel
>> sure I am doing it the stupid way (or the procedural way).
>>
>> Is there a more declarative way of doing it? Any pointers will be
>> greatly appreciated!
>>
>> Thanks a lot in advance,
>>
>> Albert Pang
