Re: [Rd] Inconsistency in for stringsAsFactors

From: Terry Therneau
Date: Mon, 25 Jan 2010 09:36:19 -0600

Kudos to Peter for actually answering the question of why the inconsistency was there. It might be well to add a bit to the documentation.

  As to the larger discussion of global defaults let me offer two opinions:

  1. They are the salvation of those of us who do not agree with certain global defaults.
    • 'best practice' is not always a consensus
    • defaults are often informed too much by "the data we happened to be analysing when we decided the default". The long-standing contr.helmert default is one instance; a look at the white book shows that they were working on an orthogonal manufacturing design, the one case where Helmert contrasts make sense. The survival package contains several defaults with the same type of origin.
  2. People in these discussions play the "it might break something" card far too often. At Mayo, for instance, the table() command has been replaced by one which lists NA by default, for all data types. We've done this for as long as R and Splus have been used (10+ years), for all 150 people in the biostat group, and nothing has broken yet. A suggestion to allow this as a global default will immediately elicit the above argument, I guarantee it. Ditto for our experience with stringsAsFactors=FALSE; nothing's broken yet. Give a concrete example before crying wolf.
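
For readers who want to see what such a site-wide override might look like, here is a minimal sketch. It is not the actual Mayo code; the wrapper name table_na is hypothetical, and the sketch assumes only base R's table() with its useNA argument and data.frame() with an explicit stringsAsFactors argument.

```r
# Hypothetical wrapper (the name table_na is an assumption, not Mayo's code):
# behaves like base::table() but counts NA whenever any are present,
# unless the caller overrides useNA.
table_na <- function(..., useNA = "ifany") {
  base::table(..., useNA = useNA)
}

x <- c("a", "b", NA, "a")

t_base <- table(x)     # base default: the NA is silently dropped
t_na   <- table_na(x)  # NA gets its own cell

print(t_base)
print(t_na)

# The stringsAsFactors preference can be stated explicitly per call,
# which works regardless of what the global default happens to be.
df <- data.frame(g = c("a", "b"), stringsAsFactors = FALSE)
print(class(df$g))
```

A site that wanted this everywhere could mask table() in a local package or startup file; the explicit wrapper above simply makes the changed default visible at the call site.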

Terry T

Received on Mon 25 Jan 2010 - 15:42:24 GMT
