# Re: [R] Conservative

From: Douglas Bates <bates_at_stat.wisc.edu>
Date: Thu 14 Sep 2006 - 14:32:46 GMT

On 14 Sep 2006 12:21:28 +0200, Peter Dalgaard <p.dalgaard@biostat.ku.dk> wrote:
> Gregor Gorjanc <gregor.gorjanc@bfro.uni-lj.si> writes:
>
> > Douglas Bates <bates <at> stat.wisc.edu> writes:
> >
> > > On 9/13/06, Dimitris Rizopoulos <dimitris.rizopoulos <at> med.kuleuven.be> wrote:
> > > > > I believe that the LRT is anti-conservative for fixed effects, as
> > > > > described in Pinheiro and Bates companion book to NLME.
> > > > >
> > > > You have this effect if you're using REML, for ML I don't think there
> > > > is any problem to use LRT between nested models with different
> > > > fixed-effects structure.
> > ...
> > > The other question is how does one evaluate the likelihood-ratio test
> > > statistic and that is the issue that Dimitris is addressing. The REML
> > > criterion is a modified likelihood and it is inappropriate to look at
> > > differences in the REML criterion when the models being compared have
> > > different fixed-effects specifications, or even a different
> > > parameterization of the fixed effects. However, the anova method for
> > > an lmer object does not use the REML criterion even when the model has
> > > been estimated by REML. It uses the profiled log-likelihood evaluated
> > > at the REML estimates of the relative variances of the random effects.
> > > That's a complicated statement so let me break it down.
> > ...
> >
> > Is this then the same answer as given by Robinson:1991 (ref at the end) to
> > question by Robin Thompson on which likelihood (ML or REML) should be used
> > in testing the "fixed" effects. Robinson answered (page 49 near bottom
> > right) that both likelihoods give the same conclusion about fixed effects.
> > Can anyone comment on this issue?
>
> At the risk of sticking my foot in it due to not reading the paper
> carefully enough: There appear to be two other likelihoods in play,
> one traditional one depending on fixed effects and variances and
> another depending on fixed effects and BLUPs ("most likely
> unobservables"). I think Robinson is talking about the equivalence of
> those two.
>
> (and BTW ss=Statistical Science in the ref.)
> >
> > @Article{Robinson:1991,
> > author = {Robinson, G. K.},
> > title = {That {BLUP} is a good thing: the estimation of random
> > effects},
> > journal = ss,
> > year = {1991},
> > volume = {6},
> > number = {1},
> > pages = {15--51},
> > keywords = {BLUP, example, derivations, links, applications},
> > vnos = {GG}
> > }

The issue that I was attempting to describe is somewhat different. I'm only considering using the log-likelihood values for the likelihood ratio test (which, BTW, I would not recommend for testing terms in the fixed-effects specification, but that's another discussion). To perform such a test one should evaluate the log-likelihood for the models being compared, which is to say you need the maximum of the log-likelihood for each model over that model's parameter space. If you estimated the parameters in a model by REML (the default criterion) then all you know is that the maximum log-likelihood for the model is bounded below by the log-likelihood evaluated at those parameter estimates. To perform the test properly you should obtain the ML estimates and take the log-likelihood at those estimates. For small data sets and simple models this wouldn't be a problem; for large data sets with complex model structures this could take some time.

However, the profiled log-likelihood evaluated at the REML estimates of the relative variances of the random effects (and the corresponding conditional ML estimates of $\sigma^2$ and the fixed-effects) is very close to the log-likelihood for the model so I take a short cut and use that as the log-likelihood for the model.
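For reference (my addition, not part of the original message), the profiled ML deviance has a closed form once the relative variance parameters, say $\theta$, are fixed. In the notation of later lme4 write-ups, with $L_\theta$ the Cholesky factor of the penalized random-effects system and $r^2(\theta)$ the minimized penalized residual sum of squares:

$$ -2\,\tilde\ell(\theta) \;=\; \log\bigl(\lvert L_\theta\rvert^{2}\bigr) \;+\; n\left[1 + \log\!\left(\frac{2\pi\, r^{2}(\theta)}{n}\right)\right] $$

The short cut amounts to evaluating this expression at the $\theta$ from the REML fit rather than re-optimizing $\theta$ under the ML criterion.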

So the quantities labelled "MLdeviance" and "REMLdeviance" in the show() or summary() output of an lmer model can be interpreted as -2*log-likelihood and -2*log-restricted-likelihood for the model, not for the particular parameter values being reported. If you estimated the parameters by REML then the REMLdeviance is accurate and the MLdeviance will be a slight over-estimate. If you estimated the parameters by ML you have the opposite situation (an accurate MLdeviance and a slight over-estimate of the REMLdeviance). The rest of the output shows the parameter estimates for the criterion that you used to fit the model; the parameter estimates for the other criterion could be quite different.

Here's a rather extreme example using the PBIB data from the SASmixed package.

> (fm1 <- lmer(response ~ Treatment + (1|Block), PBIB, method = "REML"))
Linear mixed-effects model fit by REML
Formula: response ~ Treatment + (1 | Block)

Data: PBIB

     AIC      BIC    logLik MLdeviance REMLdeviance
83.9849 117.4944 -25.99245   22.82831     51.98489
Random effects:
Groups   Name        Variance Std.Dev.
Block    (Intercept) 0.046522 0.21569
Residual             0.085559 0.29250


number of obs: 60, groups: Block, 15

Fixed effects:

             Estimate Std. Error t value
(Intercept)  2.817523   0.166413 16.9309
Treatment10 -0.326461   0.222061 -1.4701
Treatment11  0.081177   0.227200  0.3573
Treatment12  0.235299   0.222061  1.0596
Treatment13 -0.199753   0.222061 -0.8995
Treatment14 -0.326211   0.222061 -1.4690
Treatment15  0.041711   0.222061  0.1878
Treatment2  -0.412208   0.222061 -1.8563
Treatment3  -0.362579   0.222061 -1.6328
Treatment4  -0.033692   0.222061 -0.1517
Treatment5  -0.012625   0.222061 -0.0569
Treatment6   0.093171   0.227200  0.4101
Treatment7  -0.028537   0.222061 -0.1285
Treatment8  -0.035917   0.222061 -0.1617
Treatment9   0.073789   0.222061  0.3323



> (fm1ML <- lmer(response ~ Treatment + (1|Block), PBIB, method = "ML"))
Linear mixed-effects model fit by maximum likelihood
Formula: response ~ Treatment + (1 | Block)

Data: PBIB

      AIC      BIC    logLik MLdeviance REMLdeviance
54.57058 88.08009 -11.28529   22.57058     52.20403
Random effects:
Groups   Name        Variance Std.Dev.
Block    (Intercept) 0.045030 0.21220
Residual             0.060371 0.24571


number of obs: 60, groups: Block, 15

Fixed effects:

             Estimate Std. Error t value
(Intercept)  2.822676   0.143563 19.6615
Treatment10 -0.327070   0.187925 -1.7404
Treatment11  0.067533   0.192717  0.3504
Treatment12  0.222437   0.187925  1.1836
Treatment13 -0.195125   0.187925 -1.0383
Treatment14 -0.323562   0.187925 -1.7218
Treatment15  0.034576   0.187925  0.1840
Treatment2  -0.416171   0.187925 -2.2146
Treatment3  -0.368036   0.187925 -1.9584
Treatment4  -0.057737   0.187925 -0.3072
Treatment5  -0.017386   0.187925 -0.0925
Treatment6   0.086646   0.192717  0.4496
Treatment7  -0.037247   0.187925 -0.1982
Treatment8  -0.035441   0.187925 -0.1886
Treatment9   0.076438   0.187925  0.4067



You can see that the MLdeviance at the ML estimates is lower than at the REML estimates, but not a lot lower. Other quantities, such as the estimate of $\sigma^2$ and the standard errors of the fixed effects, do change considerably between the two sets of estimates.

This example also shows why using likelihood ratio tests for terms in the fixed-effects specification is not a good idea. A likelihood ratio test for the Treatment effect would be either
> (fm2ML <- lmer(response ~ 1 + (1|Block), PBIB, method = "ML"))
Linear mixed-effects model fit by maximum likelihood
Formula: response ~ 1 + (1 | Block)

Data: PBIB

      AIC      BIC    logLik MLdeviance REMLdeviance
50.15189 54.34058 -23.07595   46.15189     49.53125
Random effects:
Groups   Name        Variance Std.Dev.
Block    (Intercept) 0.057544 0.23988
Residual             0.092444 0.30405


number of obs: 60, groups: Block, 15

Fixed effects:

            Estimate Std. Error t value
(Intercept) 2.736667   0.073328  37.321
> anova(fm2ML, fm1ML)

Data: PBIB
Models:
fm2ML: response ~ 1 + (1 | Block)
fm1ML: response ~ Treatment + (1 | Block)

      Df    AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)
fm2ML  2 50.152 54.341 -23.076
fm1ML 16 54.571 88.080 -11.285 23.581     14    0.05144 .

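(As an aside, and not part of the original session: the p-value in the table above is just the upper tail of a chi-square distribution with 16 - 2 = 14 degrees of freedom, which can be checked directly.)

```r
## chi-square upper-tail probability for the LRT statistic above
pchisq(23.581, df = 14, lower.tail = FALSE)  # approximately 0.0514
```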
or

> (fm2 <- lmer(response ~ 1 + (1|Block), PBIB))
Linear mixed-effects model fit by REML
Formula: response ~ 1 + (1 | Block)

Data: PBIB

      AIC      BIC    logLik MLdeviance REMLdeviance
53.50553 57.69422 -24.75277   46.17836     49.50553
Random effects:
Groups   Name        Variance Std.Dev.
Block    (Intercept) 0.063306 0.25161
Residual             0.092444 0.30405


number of obs: 60, groups: Block, 15

Fixed effects:

            Estimate Std. Error t value
(Intercept) 2.736667   0.075902  36.055
> anova(fm2, fm1)

Data: PBIB
Models:
fm2: response ~ 1 + (1 | Block)
fm1: response ~ Treatment + (1 | Block)

    Df    AIC    BIC  logLik Chisq Chi Df Pr(>Chisq)
fm2  2 50.178 54.367 -23.089
fm1 16 54.828 88.338 -11.414 23.35     14    0.05481 .

(The first is more accurate than the second.) However, both p-values are anti-conservative relative to an empirical p-value obtained from simulation of data sets under the null hypothesis.
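For completeness, here is a sketch of how such an empirical p-value could be obtained by parametric bootstrap. This is my addition, not part of the original message, and it uses simulate() and refit() from later versions of lme4, which postdate the 2006 session shown above:

```r
## Hedged sketch: empirical p-value for the Treatment term, simulating data
## under the null model (simulate/refit are from later lme4 versions; the
## 2006 API shown above differs).
obsLRT <- 2 * as.numeric(logLik(fm1ML) - logLik(fm2ML))  # observed statistic
nsim   <- 1000
simLRT <- numeric(nsim)
for (i in seq_len(nsim)) {
    y <- simulate(fm2ML)[[1]]               # data generated under the null
    simLRT[i] <- 2 * as.numeric(logLik(refit(fm1ML, y)) -
                                logLik(refit(fm2ML, y)))
}
mean(simLRT >= obsLRT)                      # empirical p-value
```

Comparing the resulting empirical p-value with the chi-square p-values above shows the anti-conservatism directly.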

R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

Received on Fri Sep 15 00:41:54 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Thu 14 Sep 2006 - 15:30:05 GMT.
