Re: [Rd] [R] difference in using with() and the "data" argument in glm (PR#9338)

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Fri 03 Nov 2006 - 15:34:34 GMT

One thing I noticed is that ?glm does not really specify what happens if you do not give a value for data. Is data then just skipped, so that the search takes place in environment(formula) only, or is it supposed to default to something? Some clarification in ?glm would be helpful.
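
For what it is worth, here is a minimal sketch of what appears to happen when data is omitted. It assumes the behaviour of model.frame.default as I understand it, namely that a missing data falls back to environment(formula). The names e and f are made up for illustration and do not come from the original report.

```r
# Hypothetical illustration: the variables exist only in environment e.
set.seed(1)
e <- new.env()
assign("y", rnorm(10), envir = e)
assign("x1", runif(10), envir = e)

f <- y ~ x1              # environment(f) is wherever the formula was created
environment(f) <- e      # point the formula at e instead

# No data argument is given, so y and x1 are looked up via environment(f).
fit <- glm(f)
coef(fit)
```

If this reading is right, "skipped" and "defaults to environment(formula)" amount to the same thing in practice, but ?glm could still say so explicitly.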

On 11/3/06, murdoch@stats.uwo.ca <murdoch@stats.uwo.ca> wrote:
> I've redirected this reply from r-help to the bugs list.
>
> On 11/3/2006 8:25 AM, vito muggeo wrote:
> > Dear all,
> > I am dealing with the following (apparently simple) problem:
> > For some reason I am interested in passing variables from a dataframe
> > to a specific environment, and in fitting a standard glm:
> >
> > dati<-data.frame(y=rnorm(10),x1=runif(10),x2=runif(10))
> > KK<-new.env()
> > for(i in 1:ncol(dati)) assign(names(dati[i]),dati[[i]],envir=KK)
> > #Now the following two lines work correctly:
> > coef(glm(y~x1+x2,data=KK))
> > with(KK,coef(glm(y~x1+x2)))
> >
> > #However, if I write the above code inside a function, with() does not
> > #appear to work.
> >
> > ff<-function(Formula,Data,method=1){
> > KK<-new.env()
> > for(i in 1:ncol(Data)) assign(names(Data[i]),Data[[i]],envir=KK)
> > o<-if(method==1) glm(Formula,data=KK) else with(KK,glm(Formula))
> > o}
> >
> > > ff(y~x1+x2,dati,1) #it works
> > Call: glm(formula = Formula, data = KK)
> > ..[SNIP]..
> > > ff(y~x1+x2,dati,2) #it does not
> > Error in eval(expr, envir, enclos) : object "y" not found
> > >
> >
> > Could anyone explain this difference? I believed that
> > "with(data,glm(formula))" and "glm(formula,data)" were equivalent.
>
> I think this is a bug in terms.formula. Near the end it has
>
> environment(terms) <- environment(x)
>
> where x is the formula. Since "y" isn't defined in that environment, it
> fails. It would work for you with
>
> environment(terms) <- data
>
> but see below.
>
> A workaround that should work for you is to put
>
> environment(Formula) <- KK
>
> before the call to glm.
>
> I'm not going to make the patch I suggest above, because I don't think
> it's consistent with the expected behaviour of glm() in the case where
> some of the terms in the formula are supposed to come from
> environment(x), and some from "data".
>
> I don't know how to handle that case properly: I think it requires a
> different search strategy than R employs (but I might be wrong). This
> isn't a problem with the workaround I suggested to you, because there
> the parent of KK is environment(x), but that wouldn't be true in general.
>
> Duncan Murdoch
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
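
The workaround quoted above can be sketched as follows. This is only an illustration under the assumption that resetting environment(Formula) inside the function is acceptable. The name ff2 is hypothetical, and the loop over names(Data) is a small restyling of the original assign() loop.

```r
set.seed(1)
dati <- data.frame(y = rnorm(10), x1 = runif(10), x2 = runif(10))

# ff2 is a hypothetical variant of ff that applies the suggested workaround.
ff2 <- function(Formula, Data) {
  KK <- new.env()
  for (nm in names(Data)) assign(nm, Data[[nm]], envir = KK)
  # Workaround: make the formula look up its variables in KK.
  environment(Formula) <- KK
  with(KK, glm(Formula))
}

fit_with <- ff2(y ~ x1 + x2, dati)
fit_data <- glm(y ~ x1 + x2, data = dati)

# Both routes should give the same coefficients.
all.equal(coef(fit_with), coef(fit_data))
```

Only the local copy of Formula is modified, so the caller's formula keeps its original environment.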

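The mixed case mentioned in the quoted message, where some terms come from data and others from environment(formula), can be illustrated with a data frame. This is a sketch under the assumption that data is a data frame. The names d and make_formula are hypothetical.

```r
set.seed(1)
d <- data.frame(y = rnorm(10), x1 = runif(10))

# Hypothetical helper: the returned formula carries this function's frame
# as its environment, and x2 exists only there.
make_formula <- function() {
  x2 <- runif(10)
  y ~ x1 + x2
}

f <- make_formula()

# y and x1 are found in d; x2 is found via environment(f).
fit <- glm(f, data = d)
names(coef(fit))
```

As far as I can tell, when data is an environment rather than a data frame, the lookup follows the parent chain of that environment instead, which would explain why the quoted workaround only behaves well when the parent of KK is the right place to look.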


Received on Sat Nov 04 02:38:10 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Fri 03 Nov 2006 - 16:30:37 GMT.
