# Re: [R] dreaded p-val for d^2 of a glm / gam

From: Monica Pisica <pisicandru_at_hotmail.com>
Date: Fri, 28 Mar 2008 12:59:07 +0000

Thanks for your answers. First: yes, I meant deviance. I had just been describing it as the equivalent of R-squared from "normal" regression.
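For readers following along, here is a minimal sketch of the D^2 quantity discussed in this thread, computed as the proportion of null deviance explained. It uses the example data from the 'glm' help page rather than the lidar data, and the object names `fit` and `d2` are only illustrative.

```r
# Example data from the 'glm' help page (not the vegetation/lidar data)
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)

fit <- glm(counts ~ outcome + treatment, family = poisson())

# D^2: share of the null deviance explained by the model,
# the deviance analogue of R-squared
d2 <- (fit$null.deviance - fit$deviance) / fit$null.deviance
round(d2, 4)
```

Accessing the components by name (`fit$null.deviance`, `fit$deviance`) is less fragile than positional indexing into the fitted object.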

I hope I can do the comparison and show that the model is significant, and then hopefully I am off the hook. Honestly, I try to avoid all this business with p-values, but some people are certainly quite fond of them. The problem is that you get a p-value from an F test almost by default if you use lm, for example, so quite a few times I have been asked to provide a similar thing for quite different models.
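One way such a comparison might look, as a rough sketch: test the fitted model against an intercept-only model with a likelihood-ratio (chi-square) test, which plays a role loosely similar to the overall F test printed for lm. The data are again the 'glm' help page example, and the names `full`, `null`, `p_anova` and `p_manual` are illustrative. The chi-square approximation is assumed to be adequate here, which it may not be for small samples.

```r
# Example data from the 'glm' help page (not the vegetation/lidar data)
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)

full <- glm(counts ~ outcome + treatment, family = poisson())
null <- glm(counts ~ 1, family = poisson())

# Likelihood-ratio test of the full model against the intercept-only model
lrt <- anova(null, full, test = "Chisq")
p_anova <- lrt[2, ncol(lrt)]   # p-value is the last column of the table

# The same p-value by hand: drop in deviance referred to a chi-square
dev_drop <- full$null.deviance - full$deviance
df_drop  <- full$df.null - full$df.residual
p_manual <- pchisq(dev_drop, df = df_drop, lower.tail = FALSE)

c(anova = p_anova, manual = p_manual)
```

The two values agree for a Poisson model because its dispersion is fixed at 1. For families with an estimated dispersion, an F test would be the more usual choice.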

Thanks again,

Monica

```
> Date: Thu, 27 Mar 2008 16:38:12 -0700
> From: spencer.graves@pdf.com
> To: pisicandru@hotmail.com
> CC: r-help@r-project.org
> Subject: Re: [R] dreaded p-val for d^2 of a glm / gam
>
> I assume you mean 'deviance', not 'squared deviance'; if the
> latter, then I have no idea.
>
> If the former, then a short and fairly quick answer to your
> question is that 2*log(likelihood ratio) for nested hypotheses is
> approximately chi-square with numbers of degrees of freedom = the number
> of parameters in the larger model fixed to get the smaller model, under
> standard regularity conditions, the most important of which is that the
> maximum likelihood is not at a boundary.
>
> For specificity, consider the following modification of the first
> example in the 'glm' help page:
>
> counts <- c(18,17,15,20,10,20,25,13,12)
> outcome <- gl(3,1,9)
> treatment <- gl(3,3)
> glm.D93 <- glm(counts ~ outcome + treatment, family=poisson())
> glm.D93t <- glm(counts ~ treatment, family=poisson())
> anova(glm.D93t, glm.D93, test="Chisq")
>
> The p-value is not printed by default, because some people would
> rather NOT give an answer than give an answer that might not be very
> accurate in the cases where this chi-square approximation is not very
> good. To check that, you could do a Monte Carlo, refit the model with,
> say, 1000 random permutations of your response variable, collect
> anova(glm.D93t, glm.D93)[2, "Deviance"] in a vector, and then find out
> how extreme the deviance you actually got is relative to this
> permutation distribution.
>
> Hope this helps.
> Spencer Graves
> p.s. Regarding your 'dread', please see fortune("children")
>
> Monica Pisica wrote:
> > OK,
> >
> > I really dread to ask that .... much more that I know some discussion about p-values and if they are relevant for regressions were already on the list. I know to get p-val of regression coefficients - this is not a problem. But unfortunately one editor of a journal where i would like to publish some results insists in giving p-values for the squared deviance i get out from different glm and gam models. I came up with this solution, but sincerely i would like to get yours'all opinion on the matter.
> >
> > p1.glm <- glm(count ~be+ch+crr+home, family = 'poisson')
> >
> > # count - is count of species (vegetation)
> > # be, ch, crr, home - different lidar metrics
> >
> > # calculating d^2
> > d2.p1 <- round((p1.glm[[12]]-p1.glm[[10]])/p1.glm[[12]],4)
> > d2.p1
> > 0.6705
> >
> > # calculating f statistics with N = 148 and n=4; f = (N-n-1)/(N-1)(1-d^2)
> > f <- (148-4-1)/(147*(1-0.6705))
> > f
> > [1] 2.952319
> >
> > #calculating p-value
> > pval.glm <- 1-pf(f, 147,143)
> > pval.glm
> > [1] 1.135693e-10
> >
> > So, what do you think? Is this acceptable if i really have to give a p-value for the deviance squared? If it is i think i will transform everything in a fuction ....
> >
> > Thanks,
> >
> > Monica
```
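The Monte Carlo check suggested in the quoted reply could be sketched roughly as follows. This is one reading of that description, not necessarily what was intended: the whole response is permuted, both nested models are refit, and the observed drop in deviance is compared with the permutation distribution. The names `obs_dev`, `perm_dev`, `n_perm` and `p_perm` and the fixed seed are assumptions made for illustration.

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable

# Example data from the 'glm' help page, as in the quoted reply
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)

glm.D93  <- glm(counts ~ outcome + treatment, family = poisson())
glm.D93t <- glm(counts ~ treatment, family = poisson())

# Observed drop in deviance between the nested models
obs_dev <- anova(glm.D93t, glm.D93)[2, "Deviance"]

# Refit both models on permuted responses and collect the deviance drops
n_perm <- 1000
perm_dev <- replicate(n_perm, {
  y     <- sample(counts)  # random permutation of the response
  small <- glm(y ~ treatment, family = poisson())
  large <- glm(y ~ outcome + treatment, family = poisson())
  anova(small, large)[2, "Deviance"]
})

# Permutation p-value: share of permuted drops at least as large as observed
# (the +1 terms count the observed value itself and avoid a p-value of zero)
p_perm <- (sum(perm_dev >= obs_dev) + 1) / (n_perm + 1)
p_perm
```

Permuting the whole response also breaks any relationship with `treatment`, so this is a simple scheme rather than a careful conditional test. If the permutation p-value is close to the chi-square one, the approximation is probably adequate for these data.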