Re: [R] glmmPQL "error" message (was 'data order affects glmmPQL')

From: Spencer Graves <spencer.graves_at_pdf.com>
Date: Thu 12 Jan 2006 - 14:03:08 EST

  1. The function "glmmPQL" is in the MASS package, as can be seen by looking at the top line in the help file for "glmmPQL". To find the maintainer, type 'help(package="MASS")'. The results say, "Maintainer: Brian Ripley <ripley@stats.ox.ac.uk>".
  2. It is generally NOT "appropriate to simply cherry-pick a model based on logLik", as you suggested. However, your example does NOT involve this issue, because you are making multiple attempts to fit the same model to the same data set. With any iterative algorithm, it is considered legitimate to try fitting the same model with the same data with different starting values and select the one with the largest log(likelihood), considering that all others had not adequately converged. In this case, the algorithm runs and produces similar but different answers when the order is changed. Since the model does not seem to consider anything that would theoretically be affected by the sort order, it seems to me that this is crudely equivalent to changing the starting values, as I mentioned before. Therefore, I would consider it quite legitimate to pick the fit with the highest logLik.
  3. I agree it is disturbing when glmmPQL generates "Error in lme.formula(fixed = zz ~ test + coder, random = ~1 | id, data = list( : false convergence (8)". If it were my problem, I might make local copies of glmmPQL and lme.formula and trace through the code line by line using "debug" until I developed an idea about how I might change the code to get it past this error and on to something close to convergence.
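
  The refitting idea in point 2 and the error handling in point 3 might be sketched roughly as follows. This is only an illustration under stated assumptions: the original data are not shown in this thread, so the "bacteria" example from the glmmPQL help page stands in for them, and "fit_in_order" is a hypothetical helper name, not part of MASS. As far as I know, recent versions of MASS report NA from logLik() for glmmPQL fits, so the sketch records the value but also compares fixed effects and fitted values across row orders.

```r
library(MASS)  # glmmPQL() and the 'bacteria' data (stand-in for the poster's data)
library(nlme)  # fixef()

# Hypothetical helper: refit the same model with the rows in the order 'ord'.
# An error such as "false convergence (8)" is caught and recorded, not fatal.
fit_in_order <- function(dat, ord) {
  d <- dat[ord, , drop = FALSE]
  tryCatch({
    fit <- glmmPQL(y ~ trt + I(week > 2), random = ~ 1 | ID,
                   family = binomial, data = d, verbose = FALSE)
    list(ok = TRUE,
         fixef = fixef(fit),
         # put fitted values back into the original row order
         fitted = unname(fitted(fit))[order(ord)],
         # may be NA, depending on the MASS version
         logLik = tryCatch(as.numeric(logLik(fit)),
                           error = function(e) NA_real_))
  }, error = function(e) list(ok = FALSE, message = conditionMessage(e)))
}

set.seed(1)
n <- nrow(bacteria)
orders <- list(original = seq_len(n),
               reversed = rev(seq_len(n)),
               shuffled = sample(n))

results <- lapply(orders, fit_in_order, dat = bacteria)

# Compare the fits that ran to completion.
ok <- Filter(function(r) r$ok, results)
if (length(ok) > 0) {
  print(round(sapply(ok, function(r) r$fixef), 4))  # fixed effects by row order
  print(sapply(ok, function(r) r$logLik))           # log-likelihoods, possibly NA
}
if (length(ok) >= 2) {
  # agreement between the predictions of the first two successful fits
  print(cor(ok[[1]]$fitted, ok[[2]]$fitted))
}

# Report any row orders for which the fit stopped with an error.
failed <- Filter(function(r) !r$ok, results)
for (nm in names(failed)) cat(nm, ":", failed[[nm]]$message, "\n")

# For point 3 (interactive use only): step through a local copy line by line.
# my_glmmPQL <- MASS::glmmPQL
# debug(my_glmmPQL)
# my_glmmPQL(y ~ trt + I(week > 2), random = ~ 1 | ID,
#            family = binomial, data = bacteria)
```

  Catching the error with tryCatch lets the remaining row orders still be fitted and compared, which is the "several sets of starting values" strategy applied to the sort order.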
	  Hope this helps.
	  spencer graves

Jack Tanner wrote:

>> From: Spencer Graves
>>
>> The correlation between the predictions
>> from your two model fits is 0.95. This suggests to me that the
>> differences between the two sets of answers have little practical
>> importance, and anyone who disagrees may be trying to read more from
>> the results than can actually be supported by the data. It should be
>> fairly easy to select the apparent "best" from among several such
>> answers being the one that had a higher log(likelihood). This pushes
>> me to prefer "fit.bar" with a log(likelihood) of -32.31 to "fit.foo"
>> with -33.05.
>>
>> I agree that the differences are somewhat disturbing, but you
>> are dealing with the output from an iterative solution of a
>> notoriously difficult problem, and the standard wisdom is that it is
>> wise to try several sets of starting values. By modifying the order
>> of the observations in the data.frame, you have effectively done that.
>
>
> Spencer, thank you for setting my mind at ease. Still, I suspect there's
> a bug here, as the convergence procedure halts entirely when I sort the
> data yet another way. See
> http://article.gmane.org/gmane.comp.lang.r.general/53559 .
>
> Also, I wonder if it's appropriate to simply cherry-pick a model based
> on logLik, since there's no final test of goodness of fit that
> happens on independent data after one has picked a model in this way.
>
>



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Thu Jan 12 14:11:05 2006
