Re: [R] unexpected GAM result - at least for me!

From: Duncan Murdoch <murdoch_at_stats.uwo.ca>
Date: Mon, 31 Mar 2008 09:30:01 -0400

On 3/31/2008 9:01 AM, Monica Pisica wrote:
> Thanks Duncan.
>
> Yes, I do have variation in the lidar metrics (be, ch, crr, and home),
> although I have quite a high correlation between ch and home. But even
> if I eliminate one metric (either ch or home) I end up with a deviance
> explained of 99.99%. The species has values of 0 and 1 since I am trying
> to predict presence / absence.
>
> Do you think it is still a valid result?

I repeat: look at the data. Compare the observed values with the predicted ones. That's the only way to know whether the fit is reasonable or not.
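
A minimal sketch of that comparison, in case it helps. The original lidar data are not available here, so the data frame `dat`, the response `present`, and the 0.5 cutoff are all made-up stand-ins, and a plain `glm()` is used only to keep the example self-contained:

```r
# Hypothetical stand-in data: the original lidar data set is not available.
set.seed(1)
n   <- 148
dat <- data.frame(be = runif(n), ch = runif(n))
dat$present <- rbinom(n, 1, plogis(-2 + 4 * dat$ch))

fit <- glm(present ~ be + ch, family = binomial, data = dat)

# Fitted probabilities, then a 0/1 call at an (arbitrary) 0.5 cutoff.
p_hat     <- predict(fit, type = "response")
predicted <- as.integer(p_hat > 0.5)

# Cross-tabulate observed against predicted.
tab <- table(observed = dat$present, predicted = predicted)
print(tab)

# Fitted probabilities piled up at exactly 0 and 1 would hint at a perfect fit.
print(range(p_hat))
```

The same `predict(..., type = "response")` call should work on a fitted mgcv `gam` object such as `can3.gam`, with the observed `can > 0` in place of `dat$present`.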

If you're getting reasonable predictions, then it's a valid fit. (The tests and approximations used in the reported p-values may not be at all valid. I don't know what the requirements are for those in a GAM, but if you're getting a perfect fit, then they probably aren't being met.)
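
To illustrate how a perfect fit and unhelpful p-values can go together, here is a small sketch under stated assumptions: the toy vectors `x` and `y` are invented, and an ordinary logistic regression via `glm()` stands in for the GAM, so this shows the general phenomenon rather than what mgcv does internally.

```r
# Hypothetical toy data in which x separates the two classes perfectly.
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
y <- as.integer(x > 5.5)

# glm() warns here (fitted probabilities numerically 0 or 1); the warning is
# itself a symptom of separation, so it is only muffled for the demonstration.
fit <- suppressWarnings(glm(y ~ x, family = binomial))

coefs <- summary(fit)$coefficients
print(coefs)           # very large estimates with even larger standard errors
print(deviance(fit))   # residual deviance is essentially zero

# The fit reproduces the data exactly ...
perfect <- all((fitted(fit) > 0.5) == (y == 1))
print(perfect)

# ... yet the Wald z statistics are tiny, so no term looks "significant".
print(abs(coefs[, "z value"]))
```

When the classes are separated like this, the likelihood keeps improving as the coefficients grow, so the estimates and their standard errors blow up and the usual tests stop being informative.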

Duncan Murdoch

>
> Thanks again,
>
> Monica
>
> > Date: Mon, 31 Mar 2008 08:47:48 -0400
> > From: murdoch_at_stats.uwo.ca
> > To: pisicandru_at_hotmail.com
> > CC: r-help_at_r-project.org
> > Subject: Re: [R] unexpected GAM result - at least for me!
> >
> > On 3/31/2008 8:34 AM, Monica Pisica wrote:
> > >
> > > Hi
> > >
> > >
> > > I am afraid I am not understanding something very fundamental, and
> > > no matter how much I look into the book "Generalized Additive
> > > Models" by S. Wood I still don't understand my result.
> > >
> > > I am trying to model presence / absence (presence = 1, absence = 0)
> > > of a species using some lidar metrics (I have 4 of these). I am
> > > trying different models, and when I used gam I got this very weird
> > > (to me) result, which I thought was not possible, or at least I have
> > > no idea how to interpret it.
> > >
> > >> can3.gam <- gam(can>0~s(be)+s(crr)+s(ch)+s(home), family = 'binomial')
> > >> summary(can3.gam)
> > > Family: binomial
> > > Link function: logit
> > >
> > > Formula:
> > > can > 0 ~ s(be) + s(crr) + s(ch) + s(home)
> > >
> > > Parametric coefficients:
> > >             Estimate Std. Error z value Pr(>|z|)
> > > (Intercept)    85.39     162.88   0.524      0.6
> > >
> > > Approximate significance of smooth terms:
> > >           edf Est.rank Chi.sq p-value
> > > s(be)   1.000        1  0.100   0.751
> > > s(crr)  3.929        8  0.380   1.000
> > > s(ch)   6.820        9  0.396   1.000
> > > s(home) 1.000        1  0.314   0.575
> > >
> > > R-sq.(adj) = 1   Deviance explained = 100%
> > > UBRE score = -0.81413   Scale est. = 1   n = 148
> > >
> > > Is this a perfect fit with no statistical significance,
> > > overfitting, or what? It seems that the significance of the smooth
> > > terms is "null". Of course, with such a model I predict the
> > > presence / absence of the species perfectly.
> > >
> > > Again, I hope you don't mind my asking you this. Any explanation
> > > will be very much appreciated.
> >
> > Look at the data. You can get a perfect fit to a logistic regression
> > model fairly easily, and it looks as though you've got one. (In fact,
> > the huge intercept suggests that all predictions will be 1. Do you
> > actually have any variation in the data?)
> >
> > Duncan Murdoch
>



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Mon 31 Mar 2008 - 13:59:16 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 01 Apr 2008 - 19:30:25 GMT.
