Re: [Rd] Inconsistency in as.data.frame.table for stringsAsFactors

From: Peter Dalgaard <p.dalgaard_at_biostat.ku.dk>
Date: Sat, 23 Jan 2010 12:12:54 +0100

Stavros Macrakis wrote:
> Martin,
>
> I agree that global options settings that affect computations are
> problematic.
>
> But that's not the issue I was addressing. If for some classes, func.CLASS
> has certain defaults for some arguments, it is surprising that for other
> classes, it has different defaults, whether these defaults are fixed or
> taken from global settings -- when there is no obvious reason for the
> default to vary by class.
>
> -s

"A foolish consistency is the hobgoblin of little minds..."

The thing is that if you are converting the classifying factors of a table to columns of a data frame, you will presumably prefer that they come out as factors, retaining level order. The alternative is like this:

 > (x <- as.table(c("Rare"=5, "Medium"=2, "Well-done"=6)))

      Rare    Medium Well-done
         5         2         6

 > df <- as.data.frame(x, stringsAsFactors=F)
 > xtabs(Freq~Var1, data=df)
Var1
    Medium      Rare Well-done
         2         5         6

This is completely different from other cases, where as.data.frame will auto-convert character variables to factors, e.g., on reading. Having a global option intended for read.table() interfere with the above kind of operation could be a really nasty surprise for the user. (Notice also that the option was introduced in 2.10.0; before then, no one would expect that classifying factors could come out as non-factors. Defaulting to the global option could easily break working code.)
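
For comparison, here is a small sketch of the two behaviours side by side. It reuses the table from the example above; the names df_default and df_char are just illustrative, and the alphabetical ordering in the second case assumes an ordinary collation order in which "Medium" sorts before "Rare".

```r
# Same one-way table as in the example above.
x <- as.table(c("Rare" = 5, "Medium" = 2, "Well-done" = 6))

# Default: the classifying variable becomes a factor whose levels
# follow the order of the table's dimnames.
df_default <- as.data.frame(x)
levels(df_default$Var1)

# With stringsAsFactors = FALSE the column is plain character, so
# xtabs() has to rebuild a factor and sorts the levels itself.
df_char <- as.data.frame(x, stringsAsFactors = FALSE)
is.character(df_char$Var1)

# Round-tripping through xtabs(): only the default keeps the table order.
names(xtabs(Freq ~ Var1, data = df_default))
names(xtabs(Freq ~ Var1, data = df_char))
```

If the character version is really wanted, the level order has to be restored by hand, e.g. with factor(df_char$Var1, levels = names(x)), before tabulating again.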

-- 
    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard_at_biostat.ku.dk)              FAX: (+45) 35327907

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Sat 23 Jan 2010 - 11:15:40 GMT
