[R] negative binomial lmer

From: Tracy Feldman <tracysfeldman_at_yahoo.com>
Date: Fri 28 Jul 2006 - 12:00:20 EST


To whom it may concern:    

  I have a question about how to appropriately conduct an lmer analysis for negative binomially distributed data. I am using R 2.2.1 on a Windows machine.

  I am trying to fit a model with lmer (for non-normally distributed data, with both random and fixed effects) to negative binomially distributed data. To do this, I have been using maximum likelihood, comparing the full model to reduced models (each containing all but one effect, for every effect in turn). However, for negative binomially distributed data, I need to estimate the parameter theta. I have been doing this by fitting a negative binomial glm of the same model (except that all the effects are fixed) and taking mu to be its fitted values, like so:

  library(MASS)  # provides glm.nb and theta.ml
  model_1 <- glm.nb(y ~ x1 + x2 + x3, data = datafilename)
  mu_1 <- fitted(model_1)
  theta_1 <- theta.ml(datafilename$y, mu_1, n = length(mu_1), limit = 10, eps = .Machine$double.eps^0.25, trace = FALSE)
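
  For comparison, a fitted glm.nb object already stores its own estimate of theta, so the separate theta.ml step can be checked against it. The sketch below uses simulated data; the data frame `sim`, its columns, and the chosen coefficients are hypothetical stand-ins, not the data from this post.

```r
library(MASS)  # glm.nb, theta.ml, rnegbin

set.seed(1)
# Simulated stand-in data; `sim` and its columns are hypothetical.
n <- 200
sim <- data.frame(x1 = rnorm(n))
sim$y <- rnegbin(n, mu = exp(0.5 + 0.4 * sim$x1), theta = 2)

fit <- glm.nb(y ~ x1, data = sim)

# theta as estimated inside glm.nb, with its standard error
theta_fit <- fit$theta
se_fit <- fit$SE.theta

# theta re-estimated from the fitted means; n is the number of observations
theta_sep <- as.numeric(theta.ml(sim$y, fitted(fit), n = length(sim$y)))

c(glm.nb = theta_fit, theta.ml = theta_sep)
```

  The two values should agree closely, since glm.nb alternates between fitting the means and updating theta by maximum likelihood.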

  Then, I conduct the lmer, using the estimated theta:    

  model_11 <- lmer(y ~ x1 + x2 + (1 | x3), data = datafilename, family = negative.binomial(theta = theta_1, link = "log"), method = "Laplace")

  First, I wondered if this sounds like a reasonable method to accomplish my goals.    

  Second, I wondered if the theta I use for reduced models (nested within model_11) should be estimated using a glm.nb with the same combination of variables. For example, should a glm.nb with x1 and x3 only be used to estimate theta for an lmer using x1 and x3?    
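
  To make the second question concrete, here is a minimal sketch of one option: each nested model gets its own theta from its own glm.nb fit. It illustrates the option rather than settling which choice is correct, and it uses fixed effects only. The simulated data frame `sim` and its columns are hypothetical.

```r
library(MASS)  # glm.nb, rnegbin

set.seed(1)
# Simulated stand-in data; all names here are hypothetical.
n <- 200
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim$y <- rnegbin(n, mu = exp(0.5 + 0.4 * sim$x1), theta = 2)

full    <- glm.nb(y ~ x1 + x2, data = sim)
reduced <- glm.nb(y ~ x1, data = sim)

# Each fit carries its own maximum-likelihood estimate of theta.
thetas <- c(full = full$theta, reduced = reduced$theta)

# Likelihood ratio statistic, each model evaluated at its own theta;
# the models differ by one coefficient, hence df = 1.
lr <- as.numeric(2 * (logLik(full) - logLik(reduced)))
p  <- pchisq(lr, df = 1, lower.tail = FALSE)
```

  The alternative would be to hold theta fixed at the full-model estimate for every reduced model, so that only the mean structure changes between the models being compared.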

  Third, I wish to test for random effects of one categorical variable with 122 categories (effects of individual). For this variable, the glm.nb (for estimating theta) does not work--it gives this error message:

  Error in get(ctr, mode = "function", envir = parent.frame())(levels(x), :
        orthogonal polynomials cannot be represented accurately enough for 122 degrees of freedom

  Is there any way that will allow me to accurately estimate theta using this particular variable (or without it)? Or should I be using a Poisson distribution (lognormal?) instead, given these difficulties?
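
  The "orthogonal polynomials" message is raised by contr.poly, the contrast function R applies by default to ordered factors. The data are not shown here, so it is only an assumption that the individual variable is stored as an ordered factor. If it is, a sketch of one possible workaround is to drop the ordering so that treatment contrasts are used instead. The variable names below are hypothetical.

```r
# Hypothetical stand-in for the many-level "individual" variable,
# stored (by assumption) as an ordered factor.
id_ord <- factor(rep(1:122, each = 2), ordered = TRUE)

# Default contrasts: the second entry applies to ordered factors
# and is normally "contr.poly".
default_contrasts <- getOption("contrasts")

# Dropping the ordering keeps the same levels but makes R use the
# unordered default (normally treatment contrasts).
id <- factor(id_ord, ordered = FALSE)

# The design matrix can now be built: one intercept column plus
# one indicator column per non-reference level.
mm <- model.matrix(~ id)
dim(mm)
```

  Whether 122 fixed-effect levels give a sensible theta estimate is a separate question that this sketch does not address.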

  If anyone has advice on how to properly conduct this test (or any references that might tell me in a clear way), I would be very grateful. Also, please let me know if I should provide additional information to make my question clearer.    

  Please respond to me directly, as I am not subscribed to this list.    

  Thank you very much,    

  Tracy S. Feldman    

  Postdoctoral Associate, the Noble Foundation, Ardmore, OK.



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Fri Jul 28 16:09:48 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sat 29 Jul 2006 - 02:16:09 EST.