[R] lme X lmer results

From: John Maindonald <john.maindonald_at_anu.edu.au>
Date: Fri 30 Dec 2005 - 15:47:01 EST

Surely there is a correct denominator degrees of freedom if the design is balanced, as Ronaldo's design seems to be. Assuming that he has specified the design correctly to lme() and that lme() is getting the df right, the difference is between 2 df and 878 df. If the t-statistic for the
second level of Xvar had been 3.0 rather than 1.1, the difference would be between a t-statistic equal to 0.095 and 1e-6. In a design where there are 10 observations on each experimental unit, and all comparisons are at the level of experimental units or above, df for
all comparisons will be inflated by a factor of at least 9.

Rather than giving df that for the comparison(s) of interest may be highly inflated, I'd prefer to give no degrees of freedom at all, & to encourage users to work out df for themselves if at all possible. If they are not able to do this, then mcmcsamp() is a good alternative, and may be the way to go in any case. This has the further advantage of allowing assessments in cases where the relevant distribution is hard to get at. I'd think a warning in order that the df are upper bounds,
and may be grossly inflated.

Incidentally, does mcmcsamp() do its calculations pretty well independently of the lmer results?

John Maindonald.

On 29 Dec 2005, at 10:00 PM, r-help-request@stat.math.ethz.ch wrote:

> From: Douglas Bates <dmbates@gmail.com>
> Date: 29 December 2005 5:59:07 AM
> To: "Ronaldo Reis-Jr." <chrysopa@gmail.com>
> Cc: R-Help <r-help@stat.math.ethz.ch>
> Subject: Re: [R] lme X lmer results
>
>
> On 12/26/05, Ronaldo Reis-Jr. <chrysopa@gmail.com> wrote:
>> Hi,
>>
>> this is not a new doubt, but is a doubt that I cant find a good
>> response.
>>
>> Look this output:
>>
>>> m.lme <- lme(Yvar~Xvar,random=~1|Plot1/Plot2/Plot3)
>>
>>> anova(m.lme)
>> numDF denDF F-value p-value
>> (Intercept) 1 860 210.2457 <.0001
>> Xvar 1 2 1.2352 0.3821
>>> summary(m.lme)
>> Linear mixed-effects model fit by REML
>> Data: NULL
>> AIC BIC logLik
>> 5416.59 5445.256 -2702.295
>>
>> Random effects:
>> Formula: ~1 | Plot1
>> (Intercept)
>> StdDev: 0.000745924
>>
>> Formula: ~1 | Plot2 %in% Plot1
>> (Intercept)
>> StdDev: 0.000158718
>>
>> Formula: ~1 | Plot3 %in% Plot2 %in% Plot1
>> (Intercept) Residual
>> StdDev: 0.000196583 5.216954
>>
>> Fixed effects: Yvar ~ Xvar
>> Value Std.Error DF t-value p-value
>> (Intercept) 2.3545454 0.2487091 860 9.467066 0.0000
>> XvarFactor2 0.3909091 0.3517278 2 1.111397 0.3821
>>
>> Number of Observations: 880
>> Number of Groups:
>> Plot1 Plot2 %in% Plot1
>> 4 8
>> Plot3 %in% Plot2 %in% Plot1
>> 20
>>
>> This is the correct result, de correct denDF for Xvar.
>>
>> I make this using lmer.
>>
>>> m.lmer <- lmer(Yvar~Xvar+(1|Plot1)+(1|Plot1:Plot2)+(1|Plot3))
>>> anova(m.lmer)
>> Analysis of Variance Table
>> Df Sum Sq Mean Sq Denom F value Pr(>F)
>> Xvar 1 33.62 33.62 878.00 1.2352 0.2667
>>> summary(m.lmer)
>> Linear mixed-effects model fit by REML
>> Formula: Yvar ~ Xvar + (1 | Plot1) + (1 | Plot1:Plot2) + (1 | Plot3)
>> AIC BIC logLik MLdeviance REMLdeviance
>> 5416.59 5445.27 -2702.295 5402.698 5404.59
>> Random effects:
>> Groups Name Variance Std.Dev.
>> Plot3 (Intercept) 1.3608e-08 0.00011665
>> Plot1:Plot2 (Intercept) 1.3608e-08 0.00011665
>> Plot1 (Intercept) 1.3608e-08 0.00011665
>> Residual 2.7217e+01 5.21695390
>> # of obs: 880, groups: Plot3, 20; Plot1:Plot2, 8; Plot1, 4
>>
>> Fixed effects:
>> Estimate Std. Error DF t value Pr(>|t|)
>> (Intercept) 2.35455 0.24871 878 9.4671 <2e-16 ***
>> XvarFactor2 0.39091 0.35173 878 1.1114 0.2667
>> ---
>> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>>
>> Look the wrong P value, I know that it is wrong because the DF
>> used. But, In
>> this case, the result is not correct. Dont have any difference of
>> the result
>> using random effects with lmer and using a simple analyses with lm.
>
> You are assuming that there is a correct value of the denominator
> degrees of freedom. I don't believe there is. The statistic that is
> quoted there doesn't have exactly an F distribution so there is no
> correct degrees of freedom.
>
> One thing you can do with lmer is to form a Markov Chain Monte Carlo
> sample from the posterior distribution of the parameters so you can
> check to see whether the value of zero is in the middle of the
> distribution of XvarFactor2 or not.
>
> It would be possible for me to recreate in lmer the rules used in lme
> for calculating denominator degrees of freedom associated with terms
> of the random effects. However, the class of models fit by lmer is
> larger than the class of models fit by lme (at least as far as the
> structure of the random-effects terms goes). In particular lmer
> allows for random effects associated with crossed or partially crossed
> grouping factors and the rules for denominator degrees of freedom in
> lme only apply cleanly to nested grouping factors. I would prefer to
> have a set of rules that would apply to the general case.
>
> Right now I would prefer to devote my time to other aspects of lmer -
> in particular I am still working on code for generalized linear mixed
> models using a supernodal Cholesky factorization. I am willing to put
> this aside and code up the rules for denominator degrees of freedom
> with nested grouping factors BUT first I want someone to show me an
> example demonstrating that there really is a problem. The example
> must show that the p-value calculated in the anova table or the
> parameter estimates table for lmer is seriously wrong compared to an
> empirical p-value - obtained from simulation under the null
> distribution or through MCMC sampling or something like that. Saying
> that "Software XYZ says there are n denominator d.f. and lmer says
> there are m" does NOT count as an example. I will readily concede
> that the denominator degrees of freedom reported by lmer are wrong but
> so are the degrees of freedom reported by Software XYZ because there
> is no right answer (in general - in a few simple balanced designs
> there may be a right answer).
>
>>
>>> m.lm <- lm(Yvar~Xvar)
>>>
>>> anova(m.lm)
>> Analysis of Variance Table
>>
>> Response: Nadultos
>> Df Sum Sq Mean Sq F value Pr(>F)
>> Xvar 1 33.6 33.6 1.2352 0.2667
>> Residuals 878 23896.2 27.2
>>>
>>> summary(m.lm)
>>
>> Call:
>> lm(formula = Yvar ~ Xvar)
>>
>> Residuals:
>> Min 1Q Median 3Q Max
>> -2.7455 -2.3545 -1.7455 0.2545 69.6455
>>
>> Coefficients:
>> Estimate Std. Error t value Pr(>|t|)
>> (Intercept) 2.3545 0.2487 9.467 <2e-16 ***
>> XvarFactor2 0.3909 0.3517 1.111 0.267
>> ---
>> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>>
>> Residual standard error: 5.217 on 878 degrees of freedom
>> Multiple R-Squared: 0.001405, Adjusted R-squared: 0.0002675
>> F-statistic: 1.235 on 1 and 878 DF, p-value: 0.2667
>>
>> I read the rnews about this use of the full DF in lmer, but I dont
>> undestand
>> this use with a gaussian error, I undestand this with glm data.
>>
>> I need more explanations, please.
>>
>> Thanks
>> Ronaldo



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Fri Dec 30 15:55:10 2005

This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:41:44 EST