Re: [Rd] update.default: fall back on model.frame in case that the data frame is not in the parent environment

From: Duncan Murdoch
Date: Tue, 02 Aug 2011 15:06:16 -0400

On 02/08/2011 10:48 AM, Thaler,Thorn,LAUSANNE,Applied Mathematics wrote:
> > mm <- function(datf) {
> >   lm(y ~ x, data = datf)
> > }
> > mydatf <- data.frame(x = rep(1:2, 10), y = rnorm(20, rep(1:2, 10)),
> >                      z = rnorm(20))
> >
> > l <- mm(mydatf)
> > update(l, . ~ . + z) # This fails, z is not found
> Good point. So let me rephrase the initial problem:
> 1.) An lm object is fitted somewhere with some data, which resides
> somewhere in the memory.
> 2.) An ideal update function would know where the original data is
> (rather than assuming that it is stored
> a.) in the parent frame
> b.) under the name given in the call slot of the lm object)
> While from my point of view assumption a.) seems to be reasonable,
> assumption b.) is kind of awkward as pointed out, because it makes it
> kind of cumbersome to update models, which were created inside a
> function (which should not be a too rare use case).
> Thus, I have two questions:
> 1.) Is it somehow possible to retrieve the original data.frame with
> which an lm is fitted just from the knowledge of the fit? I fear that
> model.frame is the best I have.

I don't think so. You can get the environment in which the formula was created from the "terms" component of the result; that's the second place lm() will look. The first place it will look is in the explicitly specified data variable, and you can get its name, but I don't think the result object necessarily stores the full "data" argument or the environment in which to look it up. (In your example, you can look up "datf" in environment(l$terms) and get it, but that wouldn't work if the formula had also been specified as an argument to mm().)
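
A minimal sketch of that lookup, assuming the mm() and mydatf from the example quoted above. The names data_name, env, orig and mm2 are only illustrative, and set.seed() is added just to make the run reproducible:

```r
mm <- function(datf) {
  lm(y ~ x, data = datf)
}

set.seed(1)
mydatf <- data.frame(x = rep(1:2, 10), y = rnorm(20, rep(1:2, 10)),
                     z = rnorm(20))
l <- mm(mydatf)

# The call stores only the *name* of the data argument (the symbol datf),
# not the data frame itself.
data_name <- l$call$data

# Environment in which the formula was created; here that is the
# evaluation frame of mm(), because y ~ x was written inside mm().
env <- environment(l$terms)

# In this particular case datf is bound in that frame, so it can be recovered.
orig <- get(as.character(data_name), envir = env)

# Pass the recovered data explicitly so update() does not have to find "datf".
l2 <- update(l, . ~ . + z, data = orig)

# Counter-case: the formula is created by the caller, so its environment is
# the caller's, where no datf is bound unless the caller happens to define one.
mm2 <- function(form, datf) lm(form, data = datf)
l3 <- mm2(y ~ x, mydatf)
found <- exists("datf", envir = environment(l3$terms), inherits = FALSE)
```

The lookup only works because the formula and datf happen to live in the same frame, which is why it is not something update() could rely on in general.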

> 2.) Is there any other way of making update aware of where to look for
> the model building data?
> By the way, another work-around I was just thinking of is to use
> mm <- function(datf) {
>   l <- lm(y ~ x, data = datf)
>   call <- l$call
>   call$data <- substitute(datf)
>   l$call <- call
>   l
> }
> which solves my issue (and which I can very well live with), but I
> was wondering whether you see any chance that update could be made
> smarter? Thanks for your input.

I would suggest something simpler: return a list containing both l and datf, and pass datf to update. You can attach a class to that list to hide some of the ugliness if you like.
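
One way this could look, as a sketch only: the class name "fit_with_data", the list components fit and data, and the update method are all made up for illustration and are not part of stats.

```r
mm <- function(datf) {
  fit <- lm(y ~ x, data = datf)
  # Return the fit together with the data it was fitted on.
  structure(list(fit = fit, data = datf), class = "fit_with_data")
}

# Hypothetical S3 method: hides the step of passing the stored data to update().
update.fit_with_data <- function(object, formula., ...) {
  datf <- object$data
  # update.default() re-evaluates the call in this frame, where datf exists.
  new_fit <- update(object$fit, formula., data = datf, ...)
  structure(list(fit = new_fit, data = datf), class = "fit_with_data")
}

set.seed(1)
mydatf <- data.frame(x = rep(1:2, 10), y = rnorm(20, rep(1:2, 10)),
                     z = rnorm(20))

res  <- mm(mydatf)
res2 <- update(res, . ~ . + z)   # dispatches to update.fit_with_data
coef(res2$fit)
```

The cost is that the data frame is carried around with every fit, and other generics (print, summary, and so on) would need their own small methods or an explicit res$fit.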

Duncan Murdoch
