Re: [Rd] [R] difference in using with() and the "data" argument in glm (PR#9338)

From: <murdoch_at_stats.uwo.ca>
Date: Fri 03 Nov 2006 - 14:12:29 GMT


I've redirected this reply from r-help to the bugs list.

On 11/3/2006 8:25 AM, vito muggeo wrote:
> Dear all,
> I am dealing with the following (apparently simple) problem:
> For some reason I am interested in passing variables from a dataframe
> to a specific environment, and in fitting a standard glm:
>
> dati<-data.frame(y=rnorm(10),x1=runif(10),x2=runif(10))
> KK<-new.env()
> for(i in 1:ncol(dati)) assign(names(dati[i]),dati[[i]],envir=KK)
> #Now the following two lines work correctly:
> coef(glm(y~x1+x2,data=KK))
> with(KK,coef(glm(y~x1+x2)))
>
> #However, if I write the above code inside a function, with() does not appear to work:
>
> ff<-function(Formula,Data,method=1){
> KK<-new.env()
> for(i in 1:ncol(Data)) assign(names(Data[i]),Data[[i]],envir=KK)
> o<-if(method==1) glm(Formula,data=KK) else with(KK,glm(Formula))
> o}
>
> > ff(y~x1+x2,dati,1) #it works
> Call: glm(formula = Formula, data = KK)
> ..[SNIP]..
> > ff(y~x1+x2,dati,2) #it does not
> Error in eval(expr, envir, enclos) : object "y" not found
> >
>
> Could anyone explain this difference? I believed that
> "with(data,glm(formula))" and "glm(formula,data)" were equivalent.

I think this is a bug in terms.formula. Near the end it has

     environment(terms) <- environment(x)

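where x is the formula. In your function, that is the environment where the formula was created (the caller of ff()), not KK. Since "y" isn't defined in that environment, it fails. It would work for you with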

     environment(terms) <- data

but see below.
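
To make the lookup rule concrete, here is a minimal sketch (not part of the original report; the names here, f_outside, f_inside, ok_outside and ok_inside are made up for illustration). It assumes no objects called y or x1 are visible from wherever the code is run. It shows that a formula keeps the environment it was created in, and that model.frame() without a data argument looks there rather than in the environment with() evaluates in.

```r
# Illustration only: a formula remembers the environment it was created in.
KK <- new.env()
assign("y",  rnorm(10), envir = KK)
assign("x1", runif(10), envir = KK)

here <- environment()            # the environment this code is being evaluated in

f_outside <- y ~ x1              # formula created outside KK
f_inside  <- with(KK, y ~ x1)    # formula created while evaluating inside KK

# Compare each formula's environment with the place it was created
same_outside <- identical(environment(f_outside), here)
same_inside  <- identical(environment(f_inside),  KK)

# With no data argument, model.frame() looks variables up in the
# formula's environment, whatever environment with() evaluates in.
ok_inside  <- !inherits(try(with(KK, model.frame(f_inside)),  silent = TRUE), "try-error")
ok_outside <- !inherits(try(with(KK, model.frame(f_outside)), silent = TRUE), "try-error")
```

This mirrors the report: at top level the formula is typed inside with(), so it is created in KK; inside ff() it arrives as an argument that was created elsewhere.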

A workaround that should work for you is to put

     environment(Formula) <- KK

before the call to glm.
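
As a sketch of how that looks in the function from the report (ff2 is a hypothetical name, and set.seed() is added only to make the illustration repeatable), assuming the same dati layout as above:

```r
# Illustration only: the workaround applied inside the reporter's function.
ff2 <- function(Formula, Data) {
  KK <- new.env()
  # copy every column of Data into KK
  for (nm in names(Data)) assign(nm, Data[[nm]], envir = KK)
  environment(Formula) <- KK     # the workaround: point the formula at KK
  with(KK, glm(Formula))         # variables are now looked up in KK
}

set.seed(1)
dati <- data.frame(y = rnorm(10), x1 = runif(10), x2 = runif(10))

fit_with <- ff2(y ~ x1 + x2, dati)          # with() version, after the workaround
fit_data <- glm(y ~ x1 + x2, data = dati)   # ordinary data= version, for comparison

coefs_agree <- isTRUE(all.equal(coef(fit_with), coef(fit_data)))
```

The assignment to environment(Formula) changes only the local copy inside ff2(); the caller's formula is left alone.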

I'm not going to make the patch I suggested above, because I don't think it's consistent with the expected behaviour of glm() when some of the terms in the formula are supposed to come from environment(x) and some from "data".

I don't know how to handle that case properly: I think it requires a different search strategy than R employs (but I might be wrong). This isn't a problem with the workaround I suggested to you, because there the enclosing environments of KK lead back to environment(x), but that wouldn't be true in general.
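
A rough sketch of the mixed case, under made-up names (mixed_demo, w, d, KK2): one variable lives only in the data, the other only in the formula's environment. It compares the ordinary data= call, the workaround when KK is created with the default parent, and the workaround when the environment is deliberately created with no link back to the formula's environment.

```r
# Illustration only: terms coming partly from data, partly from environment(x).
mixed_demo <- function() {
  w <- runif(10)                     # defined only in this function's frame
  d <- data.frame(y = rnorm(10))     # y is defined only in the data
  f <- y ~ w                         # environment(f) is this function's frame

  # data= form: y is found in d, w in environment(f)
  fit_data <- glm(f, data = d)

  # Workaround with the default parent: KK encloses back to this frame
  KK <- new.env()
  assign("y", d$y, envir = KK)
  g <- f
  environment(g) <- KK
  fit_kk <- with(KK, glm(g))

  # Same workaround, but KK2 has no path back to environment(f)
  KK2 <- new.env(parent = baseenv())
  assign("y", d$y, envir = KK2)
  h <- f
  environment(h) <- KK2
  fit_kk2 <- try(glm(h), silent = TRUE)

  list(same_coefs     = isTRUE(all.equal(coef(fit_data), coef(fit_kk))),
       detached_fails = inherits(fit_kk2, "try-error"))
}

res <- mixed_demo()
```

So the workaround relies on how KK was created: once the replacement environment does not enclose back to the original one, w can no longer be reached.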

Duncan Murdoch



R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Sat Nov 04 01:30:34 2006
