After reading that email four times, I think I see what you mean.

I think this is a new problem, and I'll switch to the lengthier data.frame[,'x'] syntax in gam for now.
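
Here is a quick base-R check of why the bracket form should sidestep the problem. It is only a sketch: it calls all.vars() directly on the two formula styles instead of running gam(), and dat1 is just a placeholder name that never has to exist for this check.

```r
# all.vars() treats the name after `$` as a variable, but a quoted
# column name inside `[` is a string constant and is ignored.
f_dollar  <- dat1$y ~ s(dat1$x)
f_bracket <- dat1[, "y"] ~ s(dat1[, "x"])

# gam() drops the response with [-2] before collecting names,
# so do the same here.
vars_dollar  <- all.vars(f_dollar[-2])
vars_bracket <- all.vars(f_bracket[-2])

print(vars_dollar)   # "dat1" "x"
print(vars_bracket)  # "dat1"
```

With the bracket form, the only name gam() would have to look up is the data frame itself, which is why it does not go hunting for a stray "x".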

By the way, about the $ accessors. I can see why some people don't like them, but they are a part of the language. And, I think they're a good part. They make the code much more readable, and I have yet to make a mistake using them (which makes me think that they can be used responsibly). Making code harder to read is its own source of error, and is not something that I take lightly!

Thanks for the replies. And, thank you Rolf for the detailed analysis. Do you think that you or I should submit a bug report to the package maintainer? I'm not sure how that works. Very few of my challenges turn out to be actual bugs, but I think this one is.

On Fri, May 6, 2011 at 4:53 AM, Berwin A Turlach <Berwin.Turlach_at_gmail.com> wrote:

> G'day Rolf,

>
> On Fri, 06 May 2011 09:58:50 +1200
> Rolf Turner <rolf.turner_at_xtra.co.nz> wrote:
>
> > but it's strange that the dodgey code throws an error with gam(dat1$y
> > ~ s(dat1$x)) but not with gam(dat2$cf ~ s(dat2$s))
>
> > Something a bit subtle is going on; it would be nice to be able to
> > understand it.
>
> Well,
>
> R> traceback()
> 3: eval(expr, envir, enclos)
> 2: eval(inp, data, parent.frame())
> 1: gam(dat$y ~ s(dat$x))
>
> So the lines leading up to the problem seem to be the following from
> the gam() function:
>
>     vars <- all.vars(gp$fake.formula[-2])
>     inp <- parse(text = paste("list(", paste(vars, collapse = ","),
>         ")"))
>     if (!is.list(data) && !is.data.frame(data))
>         data <- as.data.frame(data)
>
> Setting
>
> R> options(error=recover)
>
> running the code until the error occurs, and then examining the frame
> number for the gam() call shows that "inp" is
> "expression(list( dat1,x ))" in your first example and
> "expression(list( dat2,s ))" in your second example. In both
> examples, "data" is "list()" (not surprisingly). When
>
> dl <- eval(inp, data, parent.frame())
>
> is executed, it tries to eval "inp". In both cases "dat1" and "dat2"
> are found, obviously, in the parent frame. In your first example "x" is
> (typically) not found and an error is thrown; in your second example an
> object with name "s" is found in "package:mgcv" and the call to eval
> succeeds. "dl" becomes a list with two components, the first being,
> respectively, "dat1" or "dat2", and the second the body of the function
> "s". (To verify that, you should probably issue the command
> "debug(gam)" and step through those first few lines of the function
> until you reach the above command.)
>
> The corollary is that you can use the name of any object that R will
> find from the parent frame. If it is another data set, then that data
> set will become the second component of "dl". E.g.:
>
> R> dat=data.frame(min=1:100,cf=sin(1:100/50)+rnorm(100,0,.05))
> R> gam(dat$cf ~ s(dat$min))
>
> Family: gaussian
> Link function: identity
>
> Formula:
> dat$cf ~ s(dat$min)
>
> Estimated degrees of freedom:
> 3.8925 total = 4.892488
>
> GCV score: 0.002704789
>
> Or
>
> R> dat=data.frame(BOD=1:100,cf=sin(1:100/50)+rnorm(100,0,.05))
> R> gam(dat$cf ~ s(dat$BOD))
>
> Family: gaussian
> Link function: identity
>
> Formula:
> dat$cf ~ s(dat$BOD)
>
> Estimated degrees of freedom:
> 3.9393 total = 4.939297
>
> GCV score: 0.002666985
>
> > Just out of pure academic interest. :-)
>
> Hope your academic curiosity is now satisfied. :)
>
> HTH.
>
> Cheers,
>
> Berwin
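
For anyone following along without mgcv installed, the lookup behaviour described in the quoted message can be imitated in base R. This is a minimal sketch under some assumptions: build_inp() is a hypothetical helper that copies only the two quoted lines from gam(), env stands in for the calling frame, and the dummy s() stands in for the one exported by mgcv. It is not the package's actual code path.

```r
# Hypothetical helper: mimics the quoted lines of gam() that collect
# every name in the formula (minus the response) into a list(...) call.
build_inp <- function(formula) {
  vars <- all.vars(formula[-2])
  parse(text = paste("list(", paste(vars, collapse = ","), ")"))
}

# A stand-in for the calling frame, holding the two data frames.
env <- new.env(parent = baseenv())
env$dat1 <- data.frame(x = 1:5, y = (1:5)^2)
env$dat2 <- data.frame(s = 1:5, cf = (1:5)^2)
env$s <- function(...) NULL  # dummy stand-in for mgcv's s()

inp1 <- build_inp(dat1$y ~ s(dat1$x))   # expression(list( dat1,x ))
inp2 <- build_inp(dat2$cf ~ s(dat2$s))  # expression(list( dat2,s ))

# "data" is an empty list, so every name is looked up via env.
# In a fresh session there is no object called x, so this one fails.
res1 <- tryCatch(eval(inp1, list(), env),
                 error = function(e) conditionMessage(e))

# Here "s" resolves to the function s(), so eval() quietly succeeds.
res2 <- eval(inp2, list(), env)

print(res1)
print(class(res2[[2]]))
```

The first call fails because nothing named x is visible, while the second silently picks up the function s. That matches the asymmetry between the dat1 and dat2 examples.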
