Re: [R] package for repeated measures ANOVA with various link functions REDUX

From: Douglas Bates <bates_at_stat.wisc.edu>
Date: Wed, 05 Mar 2008 10:48:05 -0600

On Tue, Mar 4, 2008 at 9:48 PM, John Sorkin <jsorkin_at_grecc.umaryland.edu> wrote:
> Prof. Bates was correct to point out the lack of specifics in my original posting. I am looking for a package that will allow me to choose among link functions and account for repeated measures in a repeated measures ANOVA.
>
> My question is which package I should use to estimate rates of illegal drug use at three centers and the effect two interventions have on usage. At each center, data describing the rate of drug use were obtained once a month. For the first six months, there was no intervention at any of the three centers. For months seven through 13, intervention one was applied at each of the three centers. For months 14 through 24, intervention two was applied at each center. The question I am trying to answer is whether intervention one or two changed drug usage at any of the three centers. I am treating center as a repeated measure, i.e. the rate of drug use at center one at month one will be correlated with the rate of drug use at center one at months two, three, etc.

> I have accounted for repeated measures several ways in the past.

> (1) I have used SAS proc MIXED with a REPEATED statement. The REPEATED statement allows for the specification of the within-subject correlation of repeated measures by specifying the structure of the within-subject variance-covariance matrix of the repeated measures. The matrix is block diagonal with one block for each subject.

But does such a structure extend to models with binary or count responses? You have mentioned that you want to use an arbitrary link function such as quasibinomial. What I understand the effect of the REPEATED statement to be is to specify a parameterized form of the marginal variance-covariance matrix of the responses. If the response variable has a multivariate normal distribution it is possible to independently specify the mean (determined by the fixed-effects parameters) and the marginal variance-covariance.
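As a minimal sketch of what a parameterized marginal variance-covariance matrix looks like in the Gaussian case, the base R code below builds a block-diagonal matrix with one compound-symmetry block per subject. The dimensions and the values of sigma2 and rho are assumptions chosen only for illustration; they do not come from the data described above.

```r
# Illustrative sizes and parameters (assumed, not taken from the original data).
n_subjects <- 3     # e.g. three centers
n_times    <- 4     # a few repeated measurements per center
sigma2     <- 2     # common marginal variance
rho        <- 0.5   # common within-subject correlation

# One compound-symmetry block: sigma2 on the diagonal, sigma2 * rho elsewhere.
block <- sigma2 * ((1 - rho) * diag(n_times) + rho * matrix(1, n_times, n_times))

# Block-diagonal marginal covariance: responses from different subjects
# are uncorrelated, so the off-diagonal blocks are zero.
V <- kronecker(diag(n_subjects), block)

print(block)
print(dim(V))
```

With a multivariate normal response, a matrix like V can be chosen separately from the mean vector, which is what makes this kind of specification workable in the linear case.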

However, in the case of generalized linear models the mean response is determined by a linear predictor and a link function while the variance-covariance of the response is determined by prior weights and a variance function. The same is true for generalized linear mixed models except that this description applies to the conditional distribution of the response given the random effects. The link and the variance functions must agree so, for example, using a logit or probit link which restricts the value of mu to the interval [0,1] would imply a variance function (up to prior weights) of mu(1-mu). At least I think so - others may feel that it is possible to specify an arbitrary variance function but I don't see how that can make sense. To me the whole point of generalized linear models is to transform the linear predictor to the desired range for the mean and to take into account what this implies about the variance.
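The pairing of link and variance function can be inspected directly in the family objects in base R's stats package. The sketch below is only an illustration: the values of mu and eta are arbitrary, and it shows that the binomial family carries the variance function mu(1-mu) whether the logit or the probit link is chosen.

```r
# Family objects bundle a link with a variance function.
fam_logit  <- binomial(link = "logit")
fam_probit <- binomial(link = "probit")

# Arbitrary linear-predictor values, for illustration only.
eta <- c(-2, 0, 2)

# Both inverse links map the linear predictor into (0, 1).
p_logit  <- fam_logit$linkinv(eta)
p_probit <- fam_probit$linkinv(eta)

# Arbitrary means on the probability scale, for illustration only.
mu <- c(0.1, 0.5, 0.9)

# Both families carry the same variance function, mu * (1 - mu).
v_logit  <- fam_logit$variance(mu)
v_probit <- fam_probit$variance(mu)

# For comparison, the Poisson family has variance function mu.
v_pois <- poisson()$variance(mu)

print(cbind(mu, v_logit, v_probit, v_pois))
```

Changing the link changes how the linear predictor is mapped to the mean, but not the variance function that goes with the binomial family.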

Even if you feel that it is possible to relax the ties between the link function and the variance function I don't see how it would be possible to specify an arbitrary structure for the marginal variance-covariance of the response. If you say that the marginal variance-covariance must have a block-wise compound symmetry structure but you are going to restrict the mean to the range [0,1] I think you have painted yourself into a corner. I don't think it is possible to specify a mean on a restricted range and separately specify an arbitrary variance-covariance structure. In particular, when the mean is on the range [0,1] then you had better have the variance going to zero as the mean goes to 0 or to 1. You can't arbitrarily say that the variance within a block must be constant, regardless of the values of the means in those blocks.
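One way to see the constraint numerically: any response Y confined to [0,1] satisfies Y^2 <= Y, so Var(Y) = E[Y^2] - mu^2 <= mu - mu^2 = mu(1-mu). The sketch below compares that upper bound with a constant within-block variance. The means and the value of sigma2 are hypothetical, chosen only to illustrate the point.

```r
# Hypothetical means on the [0, 1] scale (assumed for illustration).
mu <- c(0.01, 0.10, 0.50, 0.90, 0.99)

# Largest variance any response confined to [0, 1] can have at each mean.
max_var <- mu * (1 - mu)

# A constant variance, as a compound-symmetry block would impose (assumed value).
sigma2 <- 0.05

# Is the constant variance attainable at each mean?
feasible <- sigma2 <= max_var

print(data.frame(mu, max_var, sigma2, feasible))
```

In this sketch the constant variance is attainable for the middle means but not for means of 0.01 or 0.99, where the bound has already shrunk below it.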

> (2) I have used SAS proc GENMOD, which uses GEE to adjust the parameter estimates and their standard errors for the fact that repeated measurements of a parameter are obtained from a given subject.
>
> Is there any package in R that will allow me to perform a repeated measures ANOVA with a selection of link functions that will allow me to account for repeated measures by either specifying the correlation structure of the repeated measures from a subject a la SAS proc mixed or by adjusting the parameter estimates using GEE a la proc GENMOD? Perhaps R has a package that accounts for repeated measures in some other manner.
>
> Thank you,
> John Sorkin
>
> >>> "Douglas Bates" <bates_at_stat.wisc.edu> 3/4/2008 5:13 PM >>>
> On Tue, Mar 4, 2008 at 10:52 AM, John Sorkin
> <jsorkin_at_grecc.umaryland.edu> wrote:
> > R 2.6.0
> > Windows XP
>
> > At the risk of raising the ire of the R gods . . .
> > I am looking for a package that will allow me to perform a poisson, quasipoisson, or negative binomial regression with adjustment for repeated measures. I have looked at glm; it does not appear to allow repeated measures. Although I can't get any help for lme or lme4, I remember that those packages handle repeated measures using random effects, not repeated measures ANOVA, which is what I am looking for. (By the way, how can I get help for lme4? I have tried ?lme4, help.search("lme4") etc. to no avail.)
> > A suggestion for a package that will allow for repeated measures ANOVA in the context of various link functions would be appreciated.
>
> I think you would need to be more specific about the model than just
> saying "repeated measures ANOVA". To me, "repeated measures"
> describes a structure in the data. There are many ways that one could
> model the effects of the repeated measures; some might make sense in
> the context of your data and some might not. Without further details
> about how you want to model the effect of the repeated measurements it
> would be difficult to say if you could use the lmer function in the
> lme4 package to do so.
>
> The purpose of the S language and the R implementation of that
> language is to facilitate exploration of data, including the fitting
> of models that may be appropriate - always keeping in mind George
> Box's famous statement that, "All models are wrong, but some models
> are useful". The "one size fits all" approach to data analysis - also
> known as "give me a quart and a half of statistics and just make sure
> that there is a p-value less than 5% somewhere in there" - doesn't fit
> well into the R system.
>



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Wed 05 Mar 2008 - 16:51:00 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 05 Mar 2008 - 17:30:18 GMT.
