Re: [R] glm and percentage data with many zero values

From: Tony Plate <>
Date: Wed 09 Mar 2005 - 09:18:16 EST

A very quick and easy thing to do with count data is to add 1 (or 0.5) to all your counts (I'm sure you can work backwards from abundance data to counts and then forward again). This gets rid of zero problems. In some cases this approximates a Bayesian approach with a low-information prior (though I'm not at all sure whether this is the case with a glm with Poisson errors).
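
As a rough illustration of the add-0.5 idea, here is a minimal sketch in R. The data are made up: `hits`, `total`, and `temp` are hypothetical names standing in for the bacterial counts, the number of cells counted per sample, and one continuous environmental variable. It assumes the original counts can be recovered from the percentages. It only shows that the logit becomes finite once the zeros are shifted; it is not a recommendation of this model over a quasi-likelihood fit.

```r
# Hypothetical data: 'hits' bacteria counted out of 'total' cells per sample.
hits  <- c(0, 0, 3, 12, 0, 1, 7, 0)
total <- c(300, 250, 310, 290, 305, 280, 300, 295)
temp  <- c(4.1, 5.0, 9.2, 14.8, 3.7, 7.5, 12.0, 4.4)  # made-up covariate

# Add 0.5 to every count so that no proportion is exactly zero.
# Adding 1 to the total keeps each adjusted proportion strictly below 1.
p_adj <- (hits + 0.5) / (total + 1)

# The logit is now finite for every sample, including the former zeros.
logit_p <- log(p_adj / (1 - p_adj))

# Ordinary linear model on the adjusted logit scale.
fit <- lm(logit_p ~ temp)
coef(fit)

# Work forward again: fitted values back on the percentage scale.
fitted_pct <- 100 * plogis(fitted(fit))
```

The choice of 0.5 rather than 1 matters most for samples with small totals, so it is worth refitting with both and checking that the conclusions do not change.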

At Wednesday 08:02 AM 4/20/2005, Christian Kamenik wrote:
>Dear all,
>I am interested in correctly testing effects of continuous environmental
>variables and ordered factors on bacterial abundance. Bacterial abundance
>is derived from counts and expressed as percentage. My problem is that the
>abundance data contain many zero values:
>Bacteria <-

>First I tried transforming the data (e.g., logit) but because of the zeros
>I was not satisfied. Next I converted the percentages into integer values
>by round(Bacteria*10) or ceiling(Bacteria*10) and calculated a glm with a
>Poisson error structure; however, I am not very happy with this approach
>because it changes the original percentage data substantially (e.g., 0.03
>becomes either 0 or 1). The same is true for converting the percentages
>into factors and calculating a multinomial or proportional-odds model
>(anyway, I do not know if this would be a meaningful approach).
>I was searching the web and the best answer I could get was
> in
>which several persons suggested quasi-likelihood. Would it be reasonable
>to use a glm with quasipoisson? If yes, how can I find the appropriate
>variance function? Any other suggestions?
>Many thanks in advance, Christian
>Christian Kamenik
>Institute of Plant Sciences
>University of Bern
>Altenbergrain 21
>3013 Bern
> mailing list
>PLEASE do read the posting guide!

Received on Wed Mar 09 09:22:48 2005
