[R] RE: glmmPQL questions

From: Dan Bebber <danbebber_at_forestecology.co.uk>
Date: Tue 29 Mar 2005 - 21:45:03 EST


It looks like farm is your level of replication, so you don't need to specify farm as a random factor. A generalized linear model ('glm', not 'lm') with binomial errors (a.k.a. logistic regression) is enough. You only need to specify different error strata if, say, you had sampled each farm several times. Is that what you mean by 'sampling cluster'?
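As a rough sketch of what that looks like: the variable names farmstatus and cattlenumber are taken from your example call, but the data frame 'farms' and its simulated contents are invented purely for illustration.

```r
# Hypothetical data: one row per farm, simulated only so the example runs.
set.seed(1)
farms <- data.frame(cattlenumber = rpois(200, lambda = 80))
farms$farmstatus <- rbinom(200, size = 1,
                           prob = plogis(-2 + 0.02 * farms$cattlenumber))

# Logistic regression: a GLM with binomial errors and the default logit link.
fit <- glm(farmstatus ~ cattlenumber, family = binomial, data = farms)
summary(fit)
```

The coefficients are on the log-odds scale; exp(coef(fit)) gives odds ratios.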
BUT, there is very likely some spatial dependence among farms, so you will also need to model this.
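If you want to constrain the analysis to certain factor levels or months, check out 'subset' (there is a subset() function, and model-fitting functions such as glm also take a 'subset' argument). Missing values: farms with missing values in any of the model variables have to be removed from the analysis. Look up 'na.omit'.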
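Something along these lines might do for both points. The data frame 'farms' and the columns 'sizeclass' and 'rainfall' are made-up stand-ins for your own variables, and the data are simulated.

```r
# Hypothetical data: names and values are illustrative only.
set.seed(2)
farms <- data.frame(
  farmstatus = rbinom(120, size = 1, prob = 0.4),
  sizeclass  = factor(sample(c("A", "B", "C", "D"), 120, replace = TRUE)),
  rainfall   = rnorm(120, mean = 100, sd = 15)
)
farms$rainfall[c(5, 17, 42)] <- NA   # introduce a few missing values

# Keep only size classes A, B and C; droplevels() removes the empty level "D".
abc <- droplevels(subset(farms, sizeclass %in% c("A", "B", "C")))

# Drop every farm (row) that has a missing value in any column.
complete <- na.omit(abc)

fit <- glm(farmstatus ~ sizeclass + rainfall, family = binomial, data = complete)
```

With the default setting options(na.action = "na.omit"), glm drops incomplete rows itself, but calling na.omit explicitly makes it obvious which farms were excluded.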
I think perhaps you need to consult a statistician at the Edinburgh stats department to get info on the appropriate analyses, as the R-help list is usually restricted to R-specific questions. There is a massive amount of literature on agricultural epidemiology (esp. following foot & mouth), so read up to see what has been done before.

Dan Bebber

Department of Plant Sciences
University of Oxford
South Parks Road
Oxford OX1 3RB

> Message: 4
> Date: Mon, 28 Mar 2005 12:06:25 +0100
> From: JEB Halliday <s0454869@sms.ed.ac.uk>
> Subject: [R] glmmPQL questions
> To: r-help@stat.math.ethz.ch
> I am looking at risk factors for disease in cattle and am interested in
> modelling farm and sampling cluster as random effects (my outcome is
> positive or negative at the level of the farm). I am using R version 2.0.1
> on a Mac and have identified glmmPQL as hopefully the correct function to
> use. I have run a couple of models using this but was hoping that you might
> be able to answer a few questions.
> e.g. model <- glmmPQL(farmstatus ~ cattlenumber, random = ~ 1 | farm, family = binomial)
> I am pretty new to both R and stats so if these questions are very simple
> and I am just missing something, suggestions about good texts on GLMM in R
> would be great.
> First up, what is the best way to constrain the model to only look at
> certain levels of a multi-level factor, e.g. a categorisation of cattle
> number where all points of high influence (as determined using
> summary(influence.measures(model))) are confined to the largest class (D)
> and I want to run a model which looks only at levels A, B and C (or only
> months May-September)?
> I was also wondering about the best way to force specified variables to
> remain in the model when running e.g. stepwise selection of interaction
> terms?
> Finally, is there a recognised method for dealing with missing values in
> these models?
> And as a minor point, the models do not run unless I specify the data=
> part of the syntax. As this is apparently an optional piece of information,
> I was wondering why it is required when all of my variables are in the same
> data frame (and even when this data frame is attached)?
> Any help would be greatly appreciated
> Jo Halliday
> MSc student
> University of Edinburgh
> s0454869@sms.ed.ac.uk

R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Tue Mar 29 21:49:10 2005
