Re: [R] glm, poisson and negative binomial distribution and confidence interval

From: Gavin Simpson
Date: Tue, 22 Jun 2010 12:50:35 +0100

On Mon, 2010-06-21 at 14:46 +0200, Stéphanie D'agata wrote:
> Dear list,
> I am using glm's to predict count data for a fish species inside and outside
> a marine reserve for three different methods of monitoring.
> I ran glms and selected the best model using the step() function for each
> method used.
> I predicted two values for my fish counts inside and outside the reserve
> using means of each of the covariates (using predict() )
> therefore I have only one value for each protection effect (inside/outside),
> considered as my mean count.
> I used either the Poisson or the negative binomial distribution for the
> models, as for each technique the distribution of the counts for the same
> species can be quite different.
> I now need to get a confidence interval for my predicted count and I want to
> compute the coefficient of variation.

I haven't seen a reply to this so I'll give it a go...

If you do predict(....., type = "link", se.fit = TRUE) you will get the predicted values on the scale of the link function plus their standard errors. You can then compute the usual confidence interval on this scale and then apply the inverse of the link function to map the confidence interval and fitted values back on to the response scale. Something like:

## assuming 'mod' is your fitted model and 'pdata' is the data frame
## containing the two rows of new values you wanted predictions for
p <- predict(mod, newdata = pdata, type = "link", se.fit = TRUE)
## approx 95% CI on the link scale
upr <- with(p, fit + (2 * se.fit))
lwr <- with(p, fit - (2 * se.fit))
## inverse link fun
invLink <- family(mod)$linkinv
## map these on to the scale of response
fit <- with(p, invLink(fit))
upr <- invLink(upr)
lwr <- invLink(lwr)

This assumes you did the fitting using glm(). Depending on how you fitted the NB model this may or may not i) work, or ii) be appropriate. If the NB parameter (theta) was estimated rather than stated a priori by you, then the confidence interval will, IIRC, be conditional upon the estimated value; it will not reflect the uncertainty in theta itself.
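
For anyone wanting to try the link-scale approach end to end, here is a minimal self-contained sketch on simulated data. The data frame 'dat', the covariates 'reserve' and 'depth', and the simulated coefficients are all made up for illustration and are not from the original question; only base R (stats) is used, and qnorm(0.975) replaces the rough multiplier of 2.

```r
## Simulated example: everything here ('dat', 'reserve', 'depth') is
## hypothetical and only serves to make the snippet runnable.
set.seed(42)
n <- 200
dat <- data.frame(
    reserve = factor(rep(c("inside", "outside"), each = n / 2)),
    depth   = runif(n, 5, 30)
)
## simulate Poisson counts with a log-linear mean
mu <- exp(1 + 0.5 * (dat$reserve == "inside") - 0.03 * dat$depth)
dat$count <- rpois(n, mu)

mod <- glm(count ~ reserve + depth, family = poisson, data = dat)

## one row per protection level, covariate held at its mean
pdata <- data.frame(
    reserve = factor(c("inside", "outside"), levels = levels(dat$reserve)),
    depth   = mean(dat$depth)
)

## predictions and standard errors on the link (log) scale
p <- predict(mod, newdata = pdata, type = "link", se.fit = TRUE)

## approximate 95% interval, still on the link scale
crit <- qnorm(0.975)
upr <- p$fit + crit * p$se.fit
lwr <- p$fit - crit * p$se.fit

## back-transform fitted values and interval to the count scale
invLink <- family(mod)$linkinv
res <- data.frame(pdata,
                  fit = invLink(p$fit),
                  lwr = invLink(lwr),
                  upr = invLink(upr))
print(res)
```

Because the interval is built on the link scale and then back-transformed, it is asymmetric around the fitted count and cannot go below zero. The same steps should carry over to a model fitted with glm.nb() from MASS, since that object also inherits from "glm", but the resulting interval is then conditional on the estimated theta as noted above.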

I'm not familiar with the functions you mention below so I can't comment on those.


> It looks like the function (package NCstats) gives a confidence
> interval for a single given value
> or bspln() and bsnb() ("degreenet") also give a CI but using the bootstrap. I
> therefore need a vector of counts to use those latter functions, which I don't
> have since I have only one predicted value from my glms.
> My questions are the following:
> - can I use easily to get my confidence interval using my predicted
> mean? but It looks like the similar function doesn't exist for the negative
> binomial distribution?
> - in order to use the bspln and the bsnb function, can I use the covariates
> values used to get the parameters of my model to create a "predicted vector"
> and then be able to apply those functions on this vector?
> I am also not sure about the meaning of the outputs of these two functions.
> Which outputs give the CI???
> About the coefficient of variation, is it equal to the standard deviation/
> mean for all the distributions???
> Can I say that for a poisson distribution, it is therefore equal to
> 1/sqrt(mean) and for a negative binomial distribution, variance = mean +
> mean^2/theta (theta the canonical parameter given in glm.nb summary). I can
> then calculate my St. Dev and then CV?
> I really appreciate your help.
> Regards,
> ______________________________________________
> mailing list
> PLEASE do read the posting guide
> and provide commented, minimal, self-contained, reproducible code.
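
On the coefficient of variation question quoted above, a small sketch may help. It assumes the CV wanted is that of the count distribution itself (standard deviation / mean), not of the estimated mean, and that the negative binomial is parameterised as in glm.nb, with variance = mu + mu^2/theta. The function names cv_poisson() and cv_negbin() are made up for illustration.

```r
## CV of the count distribution (sd / mean) for a given mean 'mu'.
## Poisson: variance = mu, so CV = sqrt(mu) / mu = 1 / sqrt(mu)
cv_poisson <- function(mu) 1 / sqrt(mu)

## Negative binomial with variance = mu + mu^2 / theta
cv_negbin <- function(mu, theta) sqrt(mu + mu^2 / theta) / mu

## illustrative values only
mu    <- 4
theta <- 2
cv_poisson(mu)        # 0.5
cv_negbin(mu, theta)  # sqrt(12) / 4, about 0.866

## rough check against simulated counts
set.seed(1)
y <- rnbinom(1e5, size = theta, mu = mu)
sd(y) / mean(y)
```

As theta grows the extra mu^2/theta term vanishes and the negative binomial CV approaches the Poisson value. In practice mu would be the predicted count and theta the estimate reported by the fitted model, so the CV inherits the uncertainty in both.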

 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e]
 Gower Street, London          [w]
 UK. WC1E 6BT.                 [w]

Received on Tue 22 Jun 2010 - 11:47:37 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 22 Jun 2010 - 12:20:34 GMT.
