Re: [R] Removing and restoring factor levels (TYPO CORRECTED)

From: Marc Schwartz (via MN) <mschwartz_at_mn.rr.com>
Date: Fri 14 Oct 2005 - 03:07:55 EST

On Thu, 2005-10-13 at 10:02 -0400, Duncan Murdoch wrote:
> Sorry, a typo in my previous message (parens in the wrong place in the
> conversion).
>
> Here it is corrected:
>
> I'm doing a big slow computation, and profiling shows that it is
> spending a lot of time in match(), apparently because I have code like
>
> x %in% listofxvals
>
> Both x and listofxvals are factors with the same levels, so I could
> probably speed this up by stripping off the levels and just treating
> them as integer vectors, then restoring the levels at the end.
>
> What is the safest way to do this? I am worried that at some point x
> and listofxvals will *not* have the same levels, and the optimization
> will give the wrong answer. So I need code that guarantees they have
> the same coding.
>
> I think this works, where "master" is a factor with the master list of
> levels (guaranteed to be a superset of the levels of x and listofxvals),
> but can anyone spot anything that might go wrong?
>
> # Strip the levels
> x <- as.integer( factor(x, levels = levels(master) ) )
>
> # Restore the levels
> x <- structure( x, levels = levels(master), class = "factor" )
>
> Thanks for any advice...
>
> Duncan Murdoch

Duncan,

Provided that 'master' really does contain the full superset of all possible factor levels, this seems a reasonable way to go.

This approach would also seem to eliminate the overhead incurred when match() coerces the factor 'x' to a character vector.
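
As a minimal sketch of the idea, assuming toy data (the values below are made up; only the names 'master', 'x' and 'listofxvals' come from your message), both factors can be recoded against the master levels once and then compared as plain integers. Note that factor() itself matches on character values, so any saving presumably comes from doing the recoding once and the integer %in% many times:

```r
# Toy stand-ins for the real data (hypothetical values).
master      <- factor(c("a", "b", "c", "d"))
x           <- factor(c("a", "c", "d", "a"), levels = c("a", "c", "d"))
listofxvals <- factor(c("c", "d"), levels = c("c", "d"))

# Recode both against the master levels so the integer codes are comparable,
# even though x and listofxvals started with different level sets.
xi <- as.integer(factor(x, levels = levels(master)))
li <- as.integer(factor(listofxvals, levels = levels(master)))

# Integer-based membership test.
res_int <- xi %in% li

# Factor-based membership test, for comparison.
res_fac <- x %in% listofxvals

# The two approaches should agree on this toy example.
stopifnot(identical(res_int, res_fac))
```

Here the two factors deliberately start with different level sets, which is the case you were worried about; recoding through 'master' is what keeps the integer codes consistent.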

One question I have is, what is the advantage of using structure() versus:

   x <- factor(x, levels = levels(master))

?
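
For what it is worth, a quick sketch on toy data (the values are made up, and 'codes', 'restored' and 'rebuilt' are names I introduced for illustration) suggests the two forms differ when 'x' already holds the stripped integer codes, since factor() matches the character form of its input against the levels:

```r
# Toy stand-ins (hypothetical values).
master <- factor(c("a", "b", "c"))
x      <- factor(c("c", "a"), levels = c("a", "c"))

# Strip the levels: integer codes relative to the master levels.
codes <- as.integer(factor(x, levels = levels(master)))

# structure() attaches the levels directly to the existing integer codes.
restored <- structure(codes, levels = levels(master), class = "factor")

# factor() instead matches as.character(codes) against the levels,
# so the codes are not found among "a", "b", "c" and become NA.
rebuilt <- factor(codes, levels = levels(master))

print(as.character(restored))
print(all(is.na(rebuilt)))
```

If 'x' is still a factor at that point rather than integer codes, factor(x, levels = levels(master)) should recode it correctly, so the difference only arises after the levels have been stripped.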

Thanks,

Marc



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Fri Oct 14 03:15:42 2005