Re: [Rd] (PR#9666) 'aggregate' should preserve level ordering of

From: <ripley_at_stats.ox.ac.uk>
Date: Mon, 14 May 2007 11:04:48 +0200 (CEST)


On Tue, 8 May 2007, prechelt_at_inf.fu-berlin.de wrote:

> Full_Name: Lutz Prechelt
> Version: 2.4.1
> OS: Windows XP
> Submission from: (NULL) (160.45.111.67)
>
>
> aggregate (from package stats) should preserve the
> ordering of levels of factors it works on and also their
> 'ordered' attribute if present.
> But it does not.

In fact it treats all grouping variables consistently: it reduces each one to its levels, and data.frame then applies as.factor to the resulting column.
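As a rough illustration of why that route loses the information (a sketch only, not the actual code inside aggregate; the variable names `lev` and `rebuilt` are made up for this example), reducing a factor to its levels and converting the result back to a factor discards both the level order and the 'ordered' class:

```r
# The ordered factor from the report: levels deliberately given as "b" < "a".
ff <- factor(c("a", "b", "a", "b"), levels = c("b", "a"), ordered = TRUE)

# Reducing the grouping variable to its levels yields a plain character vector.
lev <- levels(ff)

# Converting that character vector back to a factor sorts the levels
# alphabetically and knows nothing about the original 'ordered' class.
rebuilt <- as.factor(lev)

print(levels(rebuilt))      # alphabetical, no longer "b" "a"
print(is.ordered(rebuilt))  # FALSE
```

The same mechanism would explain why a logical grouping variable comes back as a factor.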

It is not at all clear that this is desirable. Take the example on the help page: 'Cold' is reported as a factor even though it is logical. It seems better not to coerce any of the grouping variables when putting them into the data frame, but rather to index the original variable. I have implemented that for R-devel, and as a side effect your example works as you would like. This does mean that grouping variables that are not factors and cannot be inserted into a data frame will no longer work.
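A minimal sketch of the indexing idea, under stated assumptions: this is not the R-devel implementation, the names `first`, `kept`, `sums`, `agg` and `cold` are hypothetical, and every level is assumed to occur at least once in the data. Picking one representative element per group from the original variable keeps its class and attributes intact:

```r
ff <- factor(c("a", "b", "a", "b"), levels = c("b", "a"), ordered = TRUE)
x  <- 1:4

# One summary value per group, arranged in level order ("b", then "a").
sums <- tapply(x, ff, sum)

# Position of the first observation of each level, also in level order.
# Assumes every level is present; an unused level would give NA here.
first <- match(levels(ff), ff)

# Index the original variable instead of rebuilding it from its levels,
# so the level order and the 'ordered' class carry over unchanged.
kept <- ff[first]
agg  <- data.frame(groups = kept, x = as.vector(sums))

print(levels(agg$groups))      # "b" "a"
print(is.ordered(agg$groups))  # TRUE

# The same indexing leaves a logical grouping variable logical.
cold <- c(TRUE, FALSE, TRUE, FALSE)
print(is.logical(cold[match(c(FALSE, TRUE), cold)]))  # TRUE
```

The trade-off is the one noted above: whatever is indexed this way has to be something a data frame can hold as a column.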

> Here is an example:
>
> ff = factor(c("a","b","a","b"),levels=c("b","a"),ordered=T)
> agg = aggregate(1:4, list(groups=ff), sum)
> print(levels(agg$groups)) # should be: "b" "a"
> [1] "a" "b"
> print(is.ordered(agg$groups)) # should be: TRUE
> [1] FALSE
>
> -----
>
> ?aggregate ignores the issue completely:
> - the terms 'order' or 'level' do not occur in the
> text at all
> - the term 'factor' is mentioned only once:
> "The elements of the list will be coerced to
> factors (if they are not already factors)."
>
> -----
>
> This issue made me write the following code used
> for preparing the data for a barchart:
>
> df.a = aggregate(df[,value.var],
>                  list(grouping=dfgrouping, other=dfsubbar.var),
>                  FUN=FUN)
> if (is.factor(dfsubbar.var)) { # R 2.4: this should be done by 'aggregate'
>   df.a$other = factor(df.a$other,
>                       levels=levels(dfsubbar.var),
>                       ordered=is.ordered(dfsubbar.var))
> }
>
> Cumbersome.
>
> R is great anyway. Thanks for your service building it!
>
> Lutz Prechelt
>
> ______________________________________________
> R-devel_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Brian D. Ripley,                  ripley_at_stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Received on Mon 14 May 2007 - 09:09:59 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 14 May 2007 - 09:33:37 GMT.
