Re: [Rd] R-devel Digest, Vol 98, Issue 19

From: Terry Therneau
Date: Tue, 19 Apr 2011 07:21:12 -0500

 The replies so far have helped me see the issues more clearly. Further comments:

  1. This issue started with a bug report from a user:

fform <- as.formula(Surv(time, status) ~ age)
myfun <- function(dform, ddata) {
   predict(coxph(dform, data=ddata), newdata=ddata)
}

 Gabor's suggestion to change the call is a useful idea but not completely relevant: I'm trying to make their code work. If work-arounds are the solution, then adding model=TRUE to the coxph call is sufficient. (lm() saves the model frame by default, which is why the same code with lm() does work.)
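
 The difference that model=TRUE makes can be sketched with lm(), which needs no extra packages. This is only an illustration under stated assumptions: the data frame dat, the variables y and x, and the keep argument are made up for the example, lm() stands in for coxph(), and model.frame() stands in for the step inside predict() that needs the model frame back.

```r
# Hypothetical data and names (dat, y, x, keep); lm() stands in for coxph().
fform <- as.formula(y ~ x)

myfun <- function(dform, ddata, keep) {
  fit <- lm(dform, data = ddata, model = keep)
  # model.frame() returns the stored frame when there is one; otherwise it
  # re-evaluates the original call, which refers to the local name 'ddata'.
  tryCatch(model.frame(fit), error = function(e) e)
}

dat <- data.frame(x = 1:10,
                  y = c(2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1, 18.0, 20.2))

with_frame    <- myfun(fform, dat, keep = TRUE)   # stored model frame is reused
without_frame <- myfun(fform, dat, keep = FALSE)  # frame has to be rebuilt

is.data.frame(with_frame)          # stored frame comes back as a data frame
inherits(without_frame, "error")   # rebuild cannot find 'ddata'
```

 With keep = FALSE the rebuild is evaluated in the environment of the formula, where ddata does not exist, so the lookup should fail; with keep = TRUE nothing has to be looked up again.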

2. Duncan argues that one should not expect the construct to work. I respectfully disagree. Returning to my simpler lm example: when an expression is present in a context where all the variables are known, it is a surprise that it does not work. Maybe not a surprise to the inner circle of developers, but to most users. Looking at model.frame.lm, the final result is

        eval(fcall, environment of the formula, parent.frame())

 The terms we need are in the parent frame, so why does eval ignore the third argument? (I haven't looked at the R source to see if it is on purpose.) Is there a way to persuade it to use that arg?

  A careful reading of ?model.frame backs up Duncan's argument: it is documented not to work. I still don't like it -- too much a reprise of the old Unix argument "That's not a bug, it's a feature". (Minor note: the next-to-last sentence before Value has

   "containing the variables used in formula plus those specified ... Unlike model.frame.default, it returns the "

 Is the ... a reminder to finish the sentence? It doesn't quite parse as is.)

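 On the question in point 2 of why eval appears to ignore its third argument: as far as ?eval describes it, enclos only matters when envir is a list, pairlist or data frame, and is not consulted when envir is already an environment. A minimal sketch, with made-up environments (caller_env and formula_env) standing in for the parent frame and the environment of the formula:

```r
# Hypothetical stand-ins: caller_env plays the parent frame that holds the
# variable, formula_env plays the environment of the formula.
caller_env <- new.env(parent = baseenv())
assign("z", 42, envir = caller_env)

formula_env <- new.env(parent = baseenv())   # does not contain z

# envir is an environment: enclos is not used, so z is not found
res_env <- tryCatch(eval(quote(z), formula_env, caller_env),
                    error = function(e) e)

# envir is a list: enclos is where R looks for anything not in the list
res_list <- eval(quote(z), list(), caller_env)

inherits(res_env, "error")
res_list
```

 So in the eval call quoted from model.frame.lm, the second argument is an environment and the parent.frame() argument has no effect on the lookup.
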
3. Brian R notes that adding model=TRUE is safer. Agreed. The original S version of lm etc. did not do so, in order to keep objects smaller, and the survival code still contains that legacy --- how time changes our perceptions of "big". Should I take this as a formal suggestion to change the default in coxph and survreg? (If I further change the default to y=FALSE it will break at least 1 package (survey), and I'd guess several others.)

4. Peter D: thanks for agreeing that there is a problem. I spent a lot of time and energy fixing model frame evaluation issues in Splus, only to be told at the end that they didn't dare implement it "because it might break something". That made me turn my back on the whole debate and I haven't participated or kept up with the discussion. The heart of my fix then -- and it did fix a lot of problems without breaking anything I could find -- was that the data= argument should be an additional place to look rather than an alternate one.

Received on Tue 19 Apr 2011 - 12:28:08 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 19 Apr 2011 - 18:30:48 GMT.
