Re: [R] data order affects glmmPQL

From: Jack Tanner <ihok_at_hotmail.com>
Date: Thu 12 Jan 2006 - 12:41:34 EST


>From: Spencer Graves
> The correlation between the predictions from your
>two model fits is 0.95. This suggests to me that the differences between
>the two sets of answers have little practical importance, and anyone who
>disagrees may be trying to read more from the results than can actually be
>supported by the data. It should be fairly easy to select the apparent
>"best" from among several such answers being the one that had a higher
>log(likelihood). This pushes me to prefer "fit.bar" with a log(likelihood)
>of -32.31 to "fit.foo" with -33.05.
>
> I agree that the differences are somewhat disturbing, but you are
>dealing with the output from an iterative solution of a notoriously
>difficult problem, and the standard wisdom is that it is wise to try
>several sets of starting values. By modifying the order of the
>observations in the data.frame, you have effectively done that.

Spencer, thank you for setting my mind at ease. Still, I suspect there's a bug here, as the convergence procedure halts entirely when I sort the data yet another way. See
http://article.gmane.org/gmane.comp.lang.r.general/53559 .
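
For anyone who wants to check this kind of order sensitivity themselves, here is a rough sketch. It is not the model or data from this thread: it assumes the `bacteria` data set and the model formula from the MASS documentation as a stand-in, and the names `idx`, `shuffled`, `fit_a` and `fit_b` are made up for illustration. It fits the same glmmPQL model to the original and to a row-shuffled copy of the data, catches a failed fit instead of stopping, and then compares the fixed effects and the fitted values.

```r
library(MASS)   # glmmPQL() and the 'bacteria' example data
library(nlme)   # fixef() for the lme-class object that glmmPQL() returns

set.seed(1)
idx <- sample(nrow(bacteria))      # a random permutation of the row indices
shuffled <- bacteria[idx, ]        # same observations, different order

# Fit on the original row order (stand-in model from the MASS docs)
fit_a <- glmmPQL(y ~ trt + I(week > 2), random = ~ 1 | ID,
                 family = binomial, data = bacteria, verbose = FALSE)

# Fit on the shuffled rows; keep the error object if the fit fails
fit_b <- tryCatch(
  glmmPQL(y ~ trt + I(week > 2), random = ~ 1 | ID,
          family = binomial, data = shuffled, verbose = FALSE),
  error = function(e) e
)

if (inherits(fit_b, "error")) {
  message("Fit on shuffled data failed: ", conditionMessage(fit_b))
} else {
  # Fixed-effect estimates side by side
  print(cbind(original = fixef(fit_a), shuffled = fixef(fit_b)))
  # Put the shuffled fit's fitted values back into the original row order
  # before correlating them with the first fit
  print(cor(fitted(fit_a), fitted(fit_b)[order(idx)]))
}
```

Whether the two fits differ noticeably will depend on the data and model; with a well-behaved example they may agree almost exactly.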

Also, I wonder if it's appropriate to simply cherry-pick a model based on logLik, since there's no final test of goodness of fit on independent data after one has picked a model in this way.



Received on Thu Jan 12 12:49:57 2006
