Re: [R] dreaded p-val for d^2 of a glm / gam

From: Spencer Graves <spencer.graves_at_pdf.com>
Date: Fri, 28 Mar 2008 07:49:21 -0700

      Do you think p values are bad? That's not my understanding. P values may not be reported by some software, because the algorithm developers didn't know how to efficiently compute a reasonably accurate p value. And if you do a thousand or a million statistical tests and report only the results with the smallest p value, that's ultimately fraudulent. (http://en.wikipedia.org/wiki/Multiple_comparisons).

      However, the concept of a significance probability or p value is quite valuable (http://en.wikipedia.org/wiki/P-value), though like any tool or concept, it can be misused.

      Hope this helps. 
      Spencer Graves

Monica Pisica wrote:
> Hi Spencer and David,
>
> Thanks for your answers .... first - yes it is deviance but just
> before i just spoke and explain that it is the equivalent of r square
> from the "normal" regression.....
>
> I hope i can do the comparison and show that the model is significant
> and hopefully i am off the hook. Sincerely i try to avoid all this
> business with p-values but certainly some are quite found of it. The
> problem is that you get almost by default a p-value from an F test if
> you use lm for example, so ..... quite few times i was asked to
> provide a similar thing for quite different models.
>
> Thanks again,
>
> Monica
>
> > Date: Thu, 27 Mar 2008 16:38:12 -0700
> > From: spencer.graves_at_pdf.com
> > To: pisicandru_at_hotmail.com
> > CC: r-help_at_r-project.org
> > Subject: Re: [R] dreaded p-val for d^2 of a glm / gam
> >
> > I assume you mean 'deviance', not 'squared deviance'; if the
> > latter, then I have no idea.
> >
> > If the former, then a short and fairly quick answer to your
> > question is that 2*log(likelihood ratio) for nested hypotheses is
> > approximately chi-square with numbers of degrees of freedom = the
> number
> > of parameters in the larger model fixed to get the smaller model, under
> > standard regularity conditions, the most important of which is that the
> > maximum likelihood is not at a boundary.
> >
> > For specificity, consider the following modification of the first
> > example in the 'glm' help page:
> >
> > counts <- c(18,17,15,20,10,20,25,13,12)
> > outcome <- gl(3,1,9)
> > treatment <- gl(3,3)
> > glm.D93 <- glm(counts ~ outcome + treatment, family=poisson())
> > glm.D93t <- glm(counts ~ treatment, family=poisson())
> > anova(glm.D93t, glm.D93, test="Chisq")
> >
> > The p-value is not printed by default, because some people would
> > rather NOT give an answer than give an answer that might not be very
> > accurate in the cases where this chi-square approximation is not very
> > good. To check that, you could do a Monte Carlo, refit the model with,
> > say, 1000 random permutations of your response variable, collect
> > anova(glm.D93t, glm.D93)[2, "Deviance"] in a vector, and then find out
> > how extreme the deviance you actually got is relative to this
> > permutation distribution.
> >
> > Hope this helps.
> > Spencer Graves
> > p.s. Regarding your 'dread', please see fortune("children")
> >
> > Monica Pisica wrote:
> > > OK,
> > >
> > > I really dread to ask that .... much more that I know some
> discussion about p-values and if they are relevant for regressions
> were already on the list. I know to get p-val of regression
> coefficients - this is not a problem. But unfortunately one editor of
> a journal where i would like to publish some results insists in giving
> p-values for the squared deviance i get out from different glm and gam
> models. I came up with this solution, but sincerely i would like to
> get yours'all opinion on the matter.
> > >
> > > p1.glm <- glm(count ~be+ch+crr+home, family = 'poisson')
> > >
> > > # count - is count of species (vegetation)
> > > # be, ch, crr, home - different lidar metrics
> > >
> > > # calculating d^2
> > > d2.p1 <- round((p1.glm[[12]]-p1.glm[[10]])/p1.glm[[12]],4)
> > > d2.p1
> > > 0.6705
> > >
> > > # calculating f statistics with N = 148 and n=4; f =
> (N-n-1)/(N-1)(1-d^2)
> > > f <- (148-4-1)/(147*(1-0.6705))
> > > f
> > > [1] 2.952319
> > >
> > > #calculating p-value
> > > pval.glm <- 1-pf(f, 147,143)
> > > pval.glm
> > > [1] 1.135693e-10
> > >
> > > So, what do you think? Is this acceptable if i really have to give
> a p-value for the deviance squared? If it is i think i will transform
> everything in a fuction ....
> > >
> > > Thanks,
> > >
> > > Monica
> > > _________________________________________________________________
> > > Windows Live Hotmail is giving away Zunes.
> > >
> > > M_Mobile_Zune_V3
> > > ______________________________________________
> > > R-help_at_r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
>
>
> ------------------------------------------------------------------------
> Watch “Cause Effect,” a show about real people making a real
> difference. Learn more.
> <http://im.live.com/Messenger/IM/MTV/?source=text_watchcause>



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Fri 28 Mar 2008 - 14:51:43 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 28 Mar 2008 - 15:30:25 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.

list of date sections of archive