Re: [Rd] aggregate(empty data.frame) (PR#13167)

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Wed, 15 Oct 2008 13:15:29 +0100 (BST)

On Wed, 15 Oct 2008, prokaj_at_cs.elte.hu wrote:

> Full_Name: Vilmos Prokaj
> Version: R-2.7.1
> OS: Win XP
> Submission from: (NULL) (157.181.227.218)
>
>
> The 'aggregate' function on an empty data.frame generates an error; however,
> according to the documentation, it should return an empty data.frame.

Please explain that to me: I don't see it says so.

What I see is

      'aggregate.data.frame' is the data frame method.  If 'x' is not a
      data frame, it is coerced to one.  Then, each of the variables
      (columns) in 'x' is split into subsets of cases (rows) of
      identical combinations of the components of 'by', and 'FUN' is
      applied to each such subset with further arguments in '...' passed
      to it. (I.e., 'tapply(VAR, by, FUN, ..., simplify = FALSE)' is
      done for each variable 'VAR' in 'x', conveniently wrapped into one
      call to 'lapply()'.) Empty subsets are removed, and the result is
      reformatted into a data frame containing the variables in 'by' and
      'x'.

Since all the subsets are empty, there is no result to be reformatted. In particular, the second and third columns of your example have types that can only be determined by running sum(), and since all groups are empty, sum() is never run. We can't create a data frame that would be consistent with the one returned for one or more groups via the documented algorithm.

The error message could definitely be clearer, but I don't see an alternative to giving an error.
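
A small sketch of the type problem described above. The data frame 'zz' and the pasting function are made up for illustration and are not from the report; the point is only that the class of a result column comes from what FUN returns, so it cannot be known when FUN is never called.

```r
# Illustrative data, not from the original report
zz <- data.frame(a = c(1L, 1L, 2L), b = c(0.5, 1.5, 2.5))

# Same input column, two different FUNs: the result column's class
# is decided by FUN's return value, not by the input column
r_sum   <- aggregate(zz["b"], by = zz["a"], FUN = sum)
r_paste <- aggregate(zz["b"], by = zz["a"],
                     FUN = function(v) paste(v, collapse = "+"))
class(r_sum$b)
class(r_paste$b)

# With zero rows there are no groups, so FUN is never run and
# aggregate() signals an error (the message text varies by R version)
z <- data.frame(a = integer(0), b = numeric(0))
res <- tryCatch(aggregate(z, by = z[1], FUN = sum),
                error = function(e) e)
inherits(res, "error")
```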

> e.g.
> z<-data.frame(a=integer(0),b=numeric(0))
> aggregate(z,by=z[1],FUN=sum)
>
> In a more realistic situation 'z' is of the form z<-zz[cond,] where cond is a
> computed logical vector and zz is a non-empty data.frame.
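
For the z <- zz[cond,] situation, one possible caller-side workaround is to test for zero rows before calling aggregate(). This is only a sketch: 'aggregate_or_null' is a hypothetical helper (not part of R), and returning NULL for the empty case is an assumption about what the caller wants.

```r
# Hypothetical helper, not part of base R: skip aggregate() when
# there are no rows, since the result's column types are undefined
aggregate_or_null <- function(x, by, FUN, ...) {
  if (nrow(x) == 0L) return(NULL)
  aggregate(x, by = by, FUN = FUN, ...)
}

# Illustrative data, not from the original report
zz <- data.frame(a = c(1L, 1L, 2L), b = c(0.5, 1.5, 2.5))

cond <- zz$b > 10            # computed condition that selects no rows
z <- zz[cond, ]

res_empty <- aggregate_or_null(z["b"], by = z["a"], FUN = sum)
res_full  <- aggregate_or_null(zz["b"], by = zz["a"], FUN = sum)
```

The caller then has to handle the NULL case explicitly, which keeps the decision about what an "empty" result should look like out of aggregate() itself.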

-- 
Brian D. Ripley,                  ripley_at_stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Wed 15 Oct 2008 - 12:26:19 GMT
