Re: [R] unexpected GAM result - at least for me!

From: Daniel Malter <daniel_at_umd.edu>
Date: Wed, 02 Apr 2008 07:30:14 -0400


You may want to plot your smooth terms:

plot(can3.gam, residuals=TRUE, pch=1)

The roughly 4 and 7 estimated degrees of freedom on the two middle terms, s(crr) and s(ch), can give you quite curvy smooths, and you might be overfitting the data (as somebody else mentioned before). Also, you may want to look at the correlation between the smoothing variables. Compute the correlation matrix as a first step, then plot each variable against the others, which makes it easier to identify nonlinear dependencies. If one of these relationships is nearly perfect, you may face serious issues due to multicollinearity.
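
A minimal sketch of that two-step check, in base R only. The data below are simulated stand-ins for the four lidar metrics (the real data were not posted), and the strong ch/home relationship is an assumption made purely for illustration:

```r
# Simulated stand-ins for the lidar metrics; NOT the poster's data.
set.seed(1)
n    <- 148
ch   <- runif(n, 0, 30)
home <- 0.6 * ch + rnorm(n, sd = 1)  # assumed: home tracks ch closely
be   <- rnorm(n)
crr  <- runif(n)
X    <- data.frame(be, crr, ch, home)

# Step 1: linear (Pearson) correlation matrix of the smoothing variables
cmat <- cor(X)
print(round(cmat, 2))

# Rank correlation also reflects monotone but nonlinear dependence
smat <- cor(X, method = "spearman")
print(round(smat, 2))

# Step 2: scatterplot matrix, to spot nonlinear dependencies by eye
pairs(X)
```

With the real data, X would simply be a data frame holding be, crr, ch and home.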

I am sorry if I am duplicating somebody's earlier response.

Cheers,
Daniel



cuncta stricte discussurus

-----Original Message-----
From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org] On Behalf Of Monica Pisica
Sent: Tuesday, April 01, 2008 2:44 PM
To: Duncan Murdoch
Cc: r-help_at_r-project.org
Subject: Re: [R] unexpected GAM result - at least for me!

Hi,

I've compared observed and predicted and they match 100%.

For 90% probability of occurrence:

table(can>0,fitted(can3.gam)>0.9)

        FALSE TRUE
  FALSE    23    0
  TRUE      0  125

So i guess it is a valid result ..... but very unexpected for me.
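
For comparison, here is a small sketch of how a perfectly separated response produces the same kind of table in an ordinary logistic regression. The predictor x and the 23/125 split are made up to mirror the counts above; this is an illustration with glm(), not a reconstruction of the gam() fit:

```r
# Hypothetical, perfectly separable data: presence exactly when x > 0.
set.seed(42)
x <- c(runif(23, -2, -0.5), runif(125, 0.5, 2))
y <- x > 0

# On separable data glm() normally warns that fitted probabilities
# numerically 0 or 1 occurred; the warning is suppressed here.
fit <- suppressWarnings(glm(y ~ x, family = binomial))

# Coefficients and standard errors are typically very large, so the
# Wald tests look "non-significant" despite the perfect classification.
print(summary(fit)$coefficients)

# Same comparison as in the message: observed vs. fitted > 0.9
tab <- table(observed = y, predicted = fitted(fit) > 0.9)
print(tab)
```

The point of the sketch is only that a perfect observed/predicted match and uninformative p-values can occur together when the classes are separable.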

Thank you again for all the help,

Monica

> Date: Mon, 31 Mar 2008 09:30:01 -0400
> From: murdoch_at_stats.uwo.ca
> To: pisicandru_at_hotmail.com
> CC: r-help_at_r-project.org
> Subject: Re: [R] unexpected GAM result - at least for me!
>
> On 3/31/2008 9:01 AM, Monica Pisica wrote:
>> Thanks Duncan.
>>
>> Yes i do have variation in the lidar metrics (be, ch, crr, and home)
>> although i have a quite high correlation between ch and home. But
>> even if i eliminate one metric (either ch or home) i end up with a
>> deviation of 99.99. The species has values of 0 and 1 since i try to
>> predict presence / absence.
>>
>> Do you think it is still a valid result?
>
> I repeat: look at the data. Compare the observed and predicted. That's
> the only way to know whether this is reasonable or not.
>
> If you're getting reasonable predictions, then it's a valid fit. (The
> tests and approximations used in the reported p-values may not be at
> all valid. I don't know what the requirements are for those in a GAM,
> but if you're getting a perfect fit, then they probably aren't being
> met.)
>
> Duncan Murdoch
>
>
>>
>> Thanks again,
>>
>> Monica
>>
>>> Date: Mon, 31 Mar 2008 08:47:48 -0400
>>> From: murdoch_at_stats.uwo.ca
>>> To: pisicandru_at_hotmail.com
>>> CC: r-help_at_r-project.org
>>> Subject: Re: [R] unexpected GAM result - at least for me!
>>>
>>> On 3/31/2008 8:34 AM, Monica Pisica wrote:
>>>>
>>>> Hi
>>>>
>>>>
>>>> I am afraid i am not understanding something very fundamental....
>>>> and does not matter how much i am looking into the book "Generalized
>>>> Additive Models" of S. Wood i still don't understand my result.
>>>>
>>>> I am trying to model presence / absence (presence = 1, absence = 0)
>>>> of a species using some lidar metrics (i have 4 of these). I am using
>>>> different models and such .... and when i used gam i got this very
>>>> weird (for me) result which i thought it is not possible - or i have
>>>> no idea how to interpret it.
>>>>
>>>>> can3.gam <- gam(can>0~s(be)+s(crr)+s(ch)+s(home), family =
>>>>> 'binomial')
>>>>> summary(can3.gam)
>>>> Family: binomial
>>>> Link function: logit
>>>>
>>>> Formula:
>>>> can > 0 ~ s(be) + s(crr) + s(ch) + s(home)
>>>>
>>>> Parametric coefficients:
>>>>             Estimate Std. Error z value Pr(>|z|)
>>>> (Intercept)    85.39     162.88   0.524      0.6
>>>>
>>>> Approximate significance of smooth terms:
>>>>           edf Est.rank Chi.sq p-value
>>>> s(be)   1.000        1  0.100   0.751
>>>> s(crr)  3.929        8  0.380   1.000
>>>> s(ch)   6.820        9  0.396   1.000
>>>> s(home) 1.000        1  0.314   0.575
>>>>
>>>> R-sq.(adj) = 1   Deviance explained = 100%
>>>> UBRE score = -0.81413  Scale est. = 1  n = 148
>>>>
>>>> Is this a perfect fit with no statistical significance, an
>>>> over-estimating or what???? It seems that the significance of the
>>>> smooths terms is "null". Of course with such a model i predict
>>>> perfectly presence / absence of species.
>>>>
>>>> Again, i hope you don't mind i'm asking you this. Any explanation
>>>> will be very much appreciated.
>>>
>>> Look at the data. You can get a perfect fit to a logistic regression
>>> model fairly easily, and it looks as though you've got one. (In
>>> fact, the huge intercept suggests that all predictions will be 1. Do
>>> you actually have any variation in the data?)
>>>
>>> Duncan Murdoch



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Wed 02 Apr 2008 - 11:33:37 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 02 Apr 2008 - 12:30:26 GMT.
