Re: [R] Overdispersion in count data

From: Gavin Simpson <gavin.simpson_at_ucl.ac.uk>
Date: Thu, 03 Apr 2008 14:11:12 +0100

On Thu, 2008-04-03 at 01:24 +0000, David Winsemius wrote:
> "Wade Wall" <wade.wall@gmail.com> wrote in
> news:e23082be0804021244q57e359e8ic724b6f619f90153_at_mail.gmail.com:
>
> > Thanks for the recommendations, insights. I tried using glm.nb, but
> > it didn't seem to like my data. I received the message (subscript)
> > logical subscript too long. I am using the same dataframe as my
> > previous glm. Do you know if I need to put the data in a different
> > format?
>
> I was wondering about your data layout. You said you had the flower/no-
> flower data in two different columns. That is not the way I usually
> offer data to glm(). I would have imagined that log(burn_time) would
> have been an offset. It might help if you at least offered the audience
> a sample of ten rows, the results of str() for the data.frame, and the
> call to the glm function.
>

David, Wade,

You can supply a two-column matrix as the response to glm() with the binomial or quasibinomial families; the two columns give the numbers of successes and failures, respectively. See the Details section of ?glm.
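
As a minimal sketch of the two-column response, assuming made-up data (the data frame 'dat' and the column names burnTime, flowering and notFlowering are hypothetical, not Wade's actual variables), something along these lines should work:

```r
## Hypothetical data: six sites, with counts of flowering and
## non-flowering plants at each site. All names and values are invented
## purely for illustration.
dat <- data.frame(burnTime     = c(1, 2, 3, 5, 8, 13),
                  flowering    = c(2, 4, 5, 9, 12, 15),
                  notFlowering = c(18, 16, 15, 11, 8, 5))

## Two-column response: successes first, failures second.
## quasibinomial estimates a dispersion parameter rather than fixing it at 1.
mod <- glm(cbind(flowering, notFlowering) ~ burnTime,
           family = quasibinomial, data = dat)
summary(mod)
```

The quasibinomial family is used here only because the thread is about overdispersion; family = binomial accepts the same two-column response.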

However, this is not possible with a (quasi)poisson family or with glm.nb, which is one reason why Wade might have been getting the error. My fault here - I didn't grep exactly what Wade had written. Even if this isn't causing the error, Wade won't be fitting the model he thought he was with a two-column response in glm.nb.

In a similar vein to David's suggestion, could one not use an offset, but on the total number of plants sampled in each location? For example, include offset(log(totalPlants)) in the formula, where totalPlants is the variable containing the total number of plants encountered. You need to include this in the formula for glm.nb as that function does not have an 'offset' argument. Wade's y is then just a vector of counts of flowering individuals. In this way one would account for differences in the number of plants encountered between sites.
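
A rough sketch of the offset approach, using simulated data because I don't have Wade's (the names dat, burnTime, totalPlants and flowering, and the simulation settings, are all assumptions made for illustration):

```r
library(MASS)  # provides glm.nb()

set.seed(42)
n <- 100

## Hypothetical sites: a burn-time covariate and the total number of
## plants encountered at each site.
dat <- data.frame(burnTime    = runif(n, 1, 10),
                  totalPlants = sample(20:60, n, replace = TRUE))

## Simulate overdispersed counts of flowering plants whose mean is
## proportional to totalPlants (the coefficients here are arbitrary).
mu <- dat$totalPlants * exp(-3 + 0.15 * dat$burnTime)
dat$flowering <- rnbinom(n, mu = mu, size = 2)

## The offset goes in the formula; its coefficient is fixed at 1, so the
## model describes flowering plants per plant encountered.
mod.nb <- glm.nb(flowering ~ burnTime + offset(log(totalPlants)),
                 data = dat)
summary(mod.nb)
```

The response is a single vector of counts, and no coefficient is estimated for the offset term, so coef(mod.nb) contains only the intercept and the burnTime slope.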

HTH

G

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

Received on Thu 03 Apr 2008 - 13:14:35 GMT
