Re: [Rd] c.factor

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Tue 14 Nov 2006 - 18:22:59 GMT

Well, R has managed without a factor method for c() for most of its decade of existence (not that it originally had factors as we know them).

I would argue that factors are best viewed as an enumeration type, and anything which silently changes their level set is a bad idea. I can see a case for a c() method for factors that combines factors with the same level sets, but I can also see that this is best done by users who know the level sets are the same (c.factor would have to expend considerable effort to check).
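
For illustration only, here is a minimal sketch of the strict variant described above: a combine function that refuses to proceed unless every argument is a factor with an identical level set. The name combine_same_levels is hypothetical and not part of base R. The sketch assumes "same level set" means the same levels in the same order, and it does not treat ordered factors specially.

```r
# Hypothetical helper (not part of base R): combine factors only when
# every argument is a factor with exactly the same levels, in the same order.
combine_same_levels <- function(...) {
  args <- list(...)
  if (length(args) == 0L)
    stop("at least one factor is required")
  if (!all(vapply(args, is.factor, logical(1))))
    stop("all arguments must be factors")

  # Use the first argument's levels as the reference level set.
  lev <- levels(args[[1]])
  same <- vapply(args, function(f) identical(levels(f), lev), logical(1))
  if (!all(same))
    stop("all factors must share the same level set")

  # The integer codes are directly comparable because the levels agree.
  codes <- unlist(lapply(args, as.integer), use.names = FALSE)
  factor(lev[codes], levels = lev)
}

a <- factor(c("lo", "hi"), levels = c("lo", "mid", "hi"))
b <- factor(c("mid", "mid"), levels = c("lo", "mid", "hi"))
ab <- combine_same_levels(a, b)
```

Because the level set is never changed, a mismatch is reported as an error rather than silently producing a factor with a different set of levels.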

You also need to consider the dispatch rules. c.factor will be called whenever the first argument is a factor, whatever the others are. S4 (I think, definitely S4-based versions of S-PLUS) has an alternative concat() that works differently (recursively) and seems a more natural model.
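
The dispatch point can be seen with a small sketch. The class "tagged" and its method c.tagged are hypothetical, chosen so that the example does not depend on whether a c() method for factors is defined in the running version of R. It assumes the code is run at top level, so that the method is visible to c().

```r
# Hypothetical class and method, used only to show which argument
# controls S3 dispatch for c().
c.tagged <- function(...) {
  "c.tagged was called"
}

x <- structure(1:3, class = "tagged")

# The first argument has class "tagged", so the method is used,
# whatever the remaining arguments are.
first <- c(x, 4:6)

# The first argument is a plain integer vector, so the default c() is
# used and the class attribute of x is dropped.
second <- c(4:6, x)
```

A c.factor method would be in the same position: it would have to cope with non-factor arguments after the first, and would not be consulted at all when a factor appears only in a later position.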

On Tue, 14 Nov 2006, Marc Schwartz wrote:

> On Tue, 2006-11-14 at 11:51 -0600, Marc Schwartz wrote:
>> On Tue, 2006-11-14 at 16:36 +0000, Matthew Dowle wrote:

>>> Hi,
>>>
>>> Given factors x and y, c(x,y) does not seem to return a useful result :
>>>> x
>>> [1] a b c d e
>>> Levels: a b c d e
>>>> y
>>> [1] d e f g h
>>> Levels: d e f g h
>>>> c(x,y)
>>> [1] 1 2 3 4 5 1 2 3 4 5
>>>>
>>>
>>> Is there a case for a new method c.factor as follows? Does something
>>> similar exist already? Is there a better way to write the function?
>>>
>>>> c.factor = function(x,y)
>>> {
>>> newlevels = union(levels(x),levels(y))
>>> m = match(levels(y), newlevels)
>>> ans = c(unclass(x),m[unclass(y)])
>>> levels(ans) = newlevels
>>> class(ans) = "factor"
>>> ans
>>> }
>>>> c(x,y)
>>> [1] a b c d e d e f g h
>>> Levels: a b c d e f g h
>>>> as.integer(c(x,y))
>>> [1] 1 2 3 4 5 4 5 6 7 8
>>>>
>>>
>>> Regards,
>>> Matthew
>>
>> I'll defer to others as to whether or not there is a basis for c.factor,
>> however:
>>
>> c.factor <- function(...)
>> {
>>   args <- list(...)
>>
>>   # this could be optional
>>   if (!all(sapply(args, is.factor)))
>>    stop("All arguments must be factors")
>>
>>   factor(unlist(lapply(args, function(x) as.character(x))))
>> }
>
>
> That last line can even be cleaned up, as I was doing something else
> initially:
>
> c.factor <- function(...)
> {
>  args <- list(...)
>
>  if (!all(sapply(args, is.factor)))
>   stop("All arguments must be factors")
>
>  factor(unlist(lapply(args, as.character)))
> }
>
>
> Marc
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Wed Nov 15 05:30:16 2006
