From: Gabor Grothendieck <ggrothendieck_at_gmail.com>

Date: Wed, 25 Jul 2007 11:01:34 -0400

On 7/25/07, Paul Gilbert <pgilbert_at_bank-banque-canada.ca> wrote:

> (moved from r-help)

>
> Achim Zeileis wrote:
>
> >On Wed, 25 Jul 2007, laimonis wrote:
> >
> >>Consider the following scrap of code:
> >>
> >
> >...slightly modified to
> >  x1 <- ts(1:24, start = c(2000, 10), freq = 12)
> >  x2 <- ts(1:24, start = c(2000, 11), freq = 12)
> >
> >and then
> >  y1 <- aggregate(x1, nfreq = 4)
> >gives the desired result while
> >  y2 <- aggregate(x2, nfreq = 4)
> >probably does not what you would like it to do.
> >
>
> I've been caught by this before, and complained before. It does not do
> what most people that work with economic time series would expect. (One
> might argue that not all time series are economic, but other time series
> don't usually fit with ts very well.) At the very least aggregate
> should issue a warning. Quarterly observations are for quarters of the
> year, so just arbitrarily grouping in 3 beginning with the first
> observation is *extremely* misleading, even if it is documented.
>
> [ BTW, there is a bug in the print method here (R-2.5.1 on Linux) :
> > y2 <- aggregate(x2, nfreq = 4)
> >
> > y2
> Error in rep.int("", start.pad) : invalid number of copies in rep.int()
> > traceback()
> 5: rep.int("", start.pad)
> 4: as.vector(data)
> 3: matrix(c(rep.int("", start.pad), format(x, ...), rep.int("",
>        end.pad)), nc = fr.x, byrow = TRUE, dimnames = list(dn1,
>        dn2))
> 2: print.ts(c(6L, 15L, 24L, 33L, 42L, 51L, 60L, 69L))
> 1: print(c(6L, 15L, 24L, 33L, 42L, 51L, 60L, 69L))
> ]
>
> ....
>
> >Currently, the "zoo" implementation allows this: Coercing back and forth
> >gives:
> >  library("zoo")
> >  z1 <- as.ts(aggregate(as.zoo(x1), as.yearqtr, sum))
> >  z2 <- as.ts(aggregate(as.zoo(x2), as.yearqtr, sum))
> >
> >
> This is better, but still potentially misleading. I would prefer a
> default NA when only some of the observations are available for a
> quarter (and the syntax is a bit cumbersome for something one needs to
> do fairly often).
>

That can be readily handled with a custom sum function:

sum.na <- function(x, width, ...) {
   # if a width is given and the group is incomplete, force the sum to NA
   if (!missing(width) && length(x) != width) x <- NA * x
   sum(x, ...)
}

> as.ts(aggregate(as.zoo(x2), as.yearqtr, sum.na, width = 3))
     Qtr1 Qtr2 Qtr3 Qtr4
2000                  NA
2001   12   21   30   39
2002   48   57   66   NA
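
For comparison, the same rule (sum by calendar quarter, NA unless all three months are present) can be sketched in base R alone. This is only an illustration: the name qsum_strict is made up here, and the sketch assumes a regular monthly, univariate ts with no gaps.

```r
# Hypothetical helper (not part of R or zoo): sum a monthly ts by calendar
# quarter, returning NA for any quarter that does not have all 3 months.
qsum_strict <- function(x) {
  stopifnot(is.ts(x), frequency(x) == 12)
  mon <- as.vector(cycle(x))                    # month within year, 1..12
  yr  <- round(as.vector(time(x)) - (mon - 1) / 12)  # calendar year
  qtr <- (mon - 1) %/% 3 + 1                    # calendar quarter, 1..4
  key <- yr * 4 + (qtr - 1)                     # one integer per quarter
  s <- tapply(as.vector(x), key, sum)           # quarterly sums
  n <- tapply(as.vector(x), key, length)        # months seen per quarter
  s[n != 3] <- NA                               # incomplete quarter -> NA
  first <- min(key)
  ts(as.vector(s), start = c(first %/% 4, first %% 4 + 1), frequency = 4)
}

x2 <- ts(1:24, start = c(2000, 11), frequency = 12)
y2 <- qsum_strict(x2)
```

Here y2 should start in 2000 Q4 and carry NA in the first and last quarters, since only two months of 2000 Q4 and one month of 2002 Q4 are available in x2.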

R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Wed 25 Jul 2007 - 15:52:23 GMT
