Re: [R] Irregular Time Series: zoo & its: Pros & Cons

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Fri 26 Aug 2005 - 09:01:59 EST

On 8/25/05, David James <djames@frontierassoc.com> wrote:
> Hello,
>
> I'm working with irregular time series data. What do you all think
> about the strengths and weaknesses of the "zoo" and "its" packages?

I have worked on the development of zoo with Achim Zeileis so I will just speak to that one.

The key thing to notice about zoo is that it is independent of the index class (i.e. the date, time or date/time class), which makes it general: you can use whichever class you like. It supports all the standard date and time classes in R, and you can add your own too. In your case you probably want to use chron (or POSIXct if you need time zones), or you could create your own special hourly class. See the Help Desk article in R News 4/1 for a discussion of the main classes, and see the table at the end of that article for various idioms that you may need.
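To illustrate the index-class independence, here is a minimal sketch that builds the same values on three different index classes. The values and timestamps are made up for illustration, and only base R classes are used so that the block needs nothing beyond zoo; a chron index would be passed to zoo() in the same way.

```r
library(zoo)

# Hypothetical values, reused for each series
x <- c(10.5, 11.2, 9.8)

# Index by calendar date
z.date <- zoo(x, as.Date(c("2005-08-01", "2005-08-03", "2005-08-07")))

# Index by date/time with an explicit time zone
z.ct <- zoo(x, as.POSIXct(c("2005-08-01 03:45", "2005-08-01 03:56",
                            "2005-08-01 05:10"), tz = "UTC"))

# Index by plain numbers
z.num <- zoo(x, c(1, 2.5, 7))

# The index keeps whatever class was supplied
class(index(z.date))
class(index(z.ct))
class(index(z.num))
```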

zoo supports not only irregular series but also weakly regular series (the zooreg class), i.e. series that have an underlying regularity (e.g. hourly or monthly) even though not every hour, month, etc. is filled in.
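As a small sketch of what "weakly regular" means, the following creates a quarterly zooreg series and then drops one observation. The quarterly numeric index and the values are assumptions chosen only to keep the example short.

```r
library(zoo)

# Quarterly series with a numeric index starting at 2000, frequency 4
z <- zooreg(1:10, start = 2000, frequency = 4)

# Drop the third observation, leaving a gap in the quarterly grid
z.gap <- z[-3]

# Still regular in the weak sense (all times fit the quarterly grid)
is.regular(z.gap)

# But not strictly regular, because one quarter is missing
is.regular(z.gap, strict = TRUE)
```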

zoo has a PDF manual available via (in R):

   library(zoo)
   vignette("zoo")

zoo can work together with the 'its' class and 'ts' class via as.zoo, as.its and as.ts.
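A minimal sketch of the round trip between 'ts' and zoo, using a made-up quarterly series ('its' is left out here so that the block needs only zoo):

```r
library(zoo)

# Hypothetical regular quarterly series as a base 'ts' object
x.ts <- ts(c(5, 7, 6, 8), start = c(2005, 1), frequency = 4)

# Convert to zoo and back again
x.zoo <- as.zoo(x.ts)
x.back <- as.ts(x.zoo)

class(x.zoo)
frequency(x.back)
```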

>
> I've installed and skimmed the documentation on both packages. I was
> hoping to get a little guidance from the user community before
> proceeding further.
>
> In case anyone is interested in my particular problem: I'm looking
> at some (surface) temperature data from NOAA:
> http://cdo.ncdc.noaa.gov/ulcd/ULCD
> It is (irregular) time series format. The NOAA data reports year,
> month, date, hour, and minute. I want to group the data into hourly
> chunks. However, sometimes there are multiple observations per hour
> -- i.e. an observation at 3:45 and 3:56. Also, sometimes a particular
> hour may be missing altogether. I need to clean up the data so that
> each hour has one and only one data point.

Here is an example using the chron date/time class:

library(chron)
library(zoo)

set.seed(1)

# create zoo series with random dates/times between tt0 and tt1
# and also random values
n <- 25
tt0 <- chron("01/01/90")
tt1 <- chron("01/01/00")
tt <- sort(as.numeric(tt1-tt0)*runif(n)+tt0)
z <- zoo(rnorm(n), tt) # create zoo series from values and date/times

# aggregate by hour, choosing the first data point if there are multiples.
# The arguments are (1) the zoo series, (2) time rounded down to the hour,
# (3) aggregate function to use -- indexing in this case, (4) an
# argument to the indexing function -- in this case it's 1 since
# we want the first element.  See ?aggregate.zoo
z.hr <- aggregate(z, chron(floor(24*as.numeric(tt))/24), "[", 1)

# plot hourly series, see ?plot.zoo
plot(z.hr)
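The aggregation above leaves missing hours out of the result. One possible way to get exactly one value per hour is to merge the hourly series with a complete hourly grid and then fill the gaps. The sketch below is an assumption-laden variant, not the method above: it uses POSIXct instead of chron, the timestamps and temperatures are made up, and carrying the last observation forward with na.locf is only one of several possible fill rules.

```r
library(zoo)

# Hypothetical observations: two in hour 03, none in hour 04
tt <- as.POSIXct(c("2005-08-25 03:45", "2005-08-25 03:56",
                   "2005-08-25 05:10", "2005-08-25 06:02"), tz = "UTC")
z <- zoo(c(71.2, 70.9, 69.5, 68.8), tt)

# Keep the first observation within each hour
hr <- as.POSIXct(trunc(time(z), "hours"))
z.hr <- aggregate(z, hr, function(x) x[1])

# Merge with a complete hourly grid so missing hours show up as NA
grid <- seq(start(z.hr), end(z.hr), by = "hour")
z.full <- merge(z.hr, zoo(, grid))

# Fill the gaps by carrying the last observation forward
z.filled <- na.locf(z.full)
```

If interpolation suits the data better than carrying values forward, na.approx could be used in place of na.locf.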

Packages with explicit support for zoo are strucchange, dynlm and dyn. (dyn also supports ts and its.)

>
> I'm relatively new to R, but I think I'm getting a hold on it pretty
> well so far. I used to do a lot with MATLAB, and there seem to be

Check out

   http://cran.r-project.org/doc/contrib/R-and-octave-2.txt

> many parallels between it and R. I have background in public policy
> and econometrics.

Check out

   http://cran.r-project.org/src/contrib/Views/


