Re: [R] Is this an artifact of using "which"?

From: <Richard.Cotton_at_hsl.gov.uk>
Date: Mon, 14 Apr 2008 15:37:55 +0100


> > The (imho) unintuitive behaviour is to do with the subsetting function
> > [.factor, not which. There are a couple of workarounds:
> >
> In that case, your intuition needs readjustment....
>
> There are other systems which (de facto) drop unused levels by default,
> and it is a real pain to work around, especially for subgroup analyses.
> E.g. there is no way to get PROC FREQ in SAS to include a count of zero,
> and barplots of ratings from 0 to 10 lose columns "randomly" in SPSS
> (this _can_ be worked around, though).
>
> Anyways, it is illogical: There's no reason that a tabulation of gender
> distribution for (say) tenured CS professors should suddenly pretend
> that the female gender does not exist!

I didn't mean to be a troll, and I can certainly see the virtue in preserving levels in the cases you described, but it was something that caught me out when I first learned R. Treating the levels of a factor as "the values that my categorical data takes", rather than "the _possible_ values that my categorical data takes", felt more natural to me. The important thing is that it is easy to keep or drop the unused levels as required.
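
To make the distinction concrete, here is a minimal sketch using a made-up factor (the variable names and data are purely illustrative, not taken from the original question). It shows the default level-preserving behaviour of subsetting and two ways of dropping unused levels on request:

```r
# Hypothetical data: a factor with two possible levels.
gender <- factor(c("male", "female", "male"), levels = c("female", "male"))

# Default subsetting keeps the unused level, so a tabulation
# still reports "female" with a count of zero.
males <- gender[gender == "male"]
levels(males)   # "female" "male"
table(males)    # female: 0, male: 2

# Drop unused levels at subsetting time via the drop argument of [.factor.
males_dropped <- gender[gender == "male", drop = TRUE]
levels(males_dropped)   # "male"

# Alternatively, re-create the factor; factor() keeps only the observed values.
males_refactored <- factor(males)
levels(males_refactored)   # "male"
```

In R 2.12.0 and later, droplevels() should give the same result as the last two approaches, but it did not exist when this thread was written.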

Btw, has the behaviour of the drop argument to '[' changed recently? I seem to remember that drop=TRUE didn't remove unused factor levels in older versions, though I may be misremembering.

Regards,
Richie.

Mathematical Sciences Unit
HSL




R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Mon 14 Apr 2008 - 15:08:25 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 14 Apr 2008 - 15:30:29 GMT.
