Re: [R] Basic question on concatenating factors

From: Stavros Macrakis <macrakis_at_alum.mit.edu>
Date: Sat, 22 Nov 2008 23:43:43 -0500

On Sat, Nov 22, 2008 at 10:20 AM, jim holtman <jholtman_at_gmail.com> wrote:
> c.Factor <-
> function (x, y)
> {
> newlevels = union(levels(x), levels(y))
> m = match(levels(y), newlevels)
> ans = c(unclass(x), m[unclass(y)])
> levels(ans) = newlevels
> class(ans) = "factor"
> ans
> }

This algorithm depends crucially on union preserving the order of the elements of its arguments: the integer codes of x are reused unchanged, so levels(x) must appear first in newlevels, in their original order. As far as I can tell, the spec of union does not require this. If union were to (for example) sort its arguments and then merge them (generally a more efficient algorithm), this function would no longer work.
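To make the failure mode concrete, here is a minimal sketch. It assumes a hypothetical sorted_union() standing in for a union that sorts its result (base R's union() does not behave this way), and a helper concat_with() that is the quoted function with the union step passed in as an argument. Both names are illustrative only.

```r
# Hypothetical stand-in for a union() that sorts its result.
# This is an assumption for illustration, not how base R's union() behaves.
sorted_union <- function(a, b) sort(unique(c(a, b)))

# Same logic as the quoted function, with the union step made pluggable.
concat_with <- function(x, y, union_fn) {
  newlevels <- union_fn(levels(x), levels(y))
  m <- match(levels(y), newlevels)      # remap y's codes into newlevels
  ans <- c(unclass(x), m[unclass(y)])   # x's codes are reused unchanged
  levels(ans) <- newlevels
  class(ans) <- "factor"
  ans
}

x <- factor(c("b", "d"))
y <- factor(c("a", "c"))

good <- concat_with(x, y, union)         # order-preserving union
bad  <- concat_with(x, y, sorted_union)  # sorting union

as.character(good)  # "b" "d" "a" "c"
as.character(bad)   # "a" "b" "a" "c": the values from x are now mislabeled
```

Only the codes of y are remapped through match(), so any reordering of levels(x) inside newlevels silently relabels the values that came from x.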

Fortunately, the fix is simple. Instead of union, use:

     newlevels <- c(levels(x), setdiff(levels(y), levels(x)))

which is guaranteed to preserve the order of levels(x).
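Put together, the quoted function with this change might look like the sketch below. The name concat_factors is illustrative and does not come from the thread; everything else follows the quoted code.

```r
# Sketch of the quoted function using the order-preserving level construction.
# concat_factors is a hypothetical name chosen for this example.
concat_factors <- function(x, y) {
  # Levels of x first, in their original order, then the levels found only in y.
  newlevels <- c(levels(x), setdiff(levels(y), levels(x)))
  m <- match(levels(y), newlevels)      # remap y's codes into newlevels
  ans <- c(unclass(x), m[unclass(y)])   # x's codes stay valid as they are
  levels(ans) <- newlevels
  class(ans) <- "factor"
  ans
}

x <- factor(c("b", "d", "b"))
y <- factor(c("a", "d"))
z <- concat_factors(x, y)

levels(z)        # "b" "d" "a"
as.character(z)  # "b" "d" "b" "a" "d"
```

The order in which setdiff returns the extra levels does not affect correctness, since the codes of y are looked up through match() rather than assumed.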

             -s



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Sun 23 Nov 2008 - 04:50:41 GMT
