Re: [R] Conservative "ANOVA tables" in lmer

From: Martin Henry H. Stevens <stevenmh_at_muohio.edu>
Date: Sun 10 Sep 2006 - 10:54:30 GMT

Hi Spencer,
I would like to make sure I understand Spencer's question and doubt's, below.
On Sep 9, 2006, at 11:36 PM, Spencer Graves wrote:

> Peter's example and Doug's "different test" reply sent me
> Scheffé's discussion of the balanced and replicated mixed-effect 2-
> away layout. As I note below, the obvious F test for the fixed
> effect does not appear to be likelihood ratio for anything.
> Douglas Bates wrote:

>> On 9/7/06, Douglas Bates <bates@stat.wisc.edu> wrote:
>>
>>> On 07 Sep 2006 17:20:29 +0200, Peter Dalgaard  
>>> <p.dalgaard@biostat.ku.dk> wrote:
>>>
>>>> Martin Maechler <maechler@stat.math.ethz.ch> writes:
>>>>
>>>>
>>>>>>>>>> "DB" == Douglas Bates <bates@stat.wisc.edu>
>>>>>>>>>>     on Thu, 7 Sep 2006 07:59:58 -0500 writes:
>>>>>>>>>>
>>>>>     DB> Thanks for your summary, Hank.
>>>>>     DB> On 9/7/06, Martin Henry H. Stevens  
>>>>> <hstevens@muohio.edu> wrote:
>>>>>     >> Dear lmer-ers,
>>>>>     >> My thanks for all of you who are sharing your trials and  
>>>>> tribulations
>>>>>     >> publicly.
>>>>>
>>>>>     >> I was hoping to elicit some feedback on my thoughts on  
>>>>> denominator
>>>>>     >> degrees of freedom for F ratios in mixed models. These  
>>>>> thoughts and
>>>>>     >> practices result from my reading of previous postings by  
>>>>> Doug Bates
>>>>>     >> and others.
>>>>>
>>>>>     >> - I start by assuming that the appropriate denominator  
>>>>> degrees lies
>>>>>     >> between n - p and and n - q, where n=number of  
>>>>> observations, p=number
>>>>>     >> of fixed effects (rank of model matrix X), and q=rank of  
>>>>> Z:X.
>>>>>
>>>>>     DB> I agree with this but the opinion is by no means  
>>>>> universal.  Initially
>>>>>     DB> I misread the statement because I usually write the  
>>>>> number of columns
>>>>>     DB> of Z as q.
>>>>>
>>>>>     DB> It is not easy to assess rank of Z:X numerically.  In  
>>>>> many cases one
>>>>>     DB> can reason what it should be from the form of the model  
>>>>> but a general
>>>>>     DB> procedure to assess the rank of a matrix, especially a  
>>>>> sparse matrix,
>>>>>     DB> is difficult.
>>>>>
>>>>>     DB> An alternative which can be easily calculated is n - t  
>>>>> where t is the
>>>>>     DB> trace of the 'hat matrix'.  The function 'hatTrace'  
>>>>> applied to a
>>>>>     DB> fitted lmer model evaluates this trace (conditional on  
>>>>> the estimates
>>>>>     DB> of the relative variances of the random effects).
>>>>>
>>>>>     >> - I then conclude that good estimates of P values on the  
>>>>> F ratios lie
>>>>>     >>   between 1 - pf(F.ratio, numDF, n-p) and 1 - pf 
>>>>> (F.ratio, numDF, n-q).
>>>>>     >>   -- I further surmise that the latter of these (1 - pf 
>>>>> (F.ratio, numDF,
>>>>>     >>   n-q)) is the more conservative estimate.
>>>>>
>>>>> This assumes that the true distribution (under H0) of that "F  
>>>>> ratio"
>>>>> *is*  F_{n1,n2}  for some (possibly non-integer)  n1 and n2.
>>>>> But AFAIU, this is only approximately true at best, and AFAIU,
>>>>> the quality of this approximation has only been investigated
>>>>> empirically for some situations.
>>>>> Hence, even your conservative estimate of the P value could be
>>>>> wrong (I mean "wrong on the wrong side" instead of just
>>>>> "conservatively wrong").  Consequently, such a P-value is only
>>>>> ``approximately conservative'' ...
>>>>> I agree howevert that in some situations, it might be a very
>>>>> useful "descriptive statistic" about the fitted model.
>>>>>
>>>> I'm very wary of ANY attempt at guesswork in these matters.
>>>>
>>>> I may be understanding the post wrongly, but consider this case:  
>>>> Y_ij
>>>> = mu + z_i + eps_ij, i = 1..3, j=1..100
>>>>
>>>> I get rank(X)=1, rank(X:Z)=3,  n=300
>>>>
>>>> It is well known that the test for mu=0 in this case is obtained by
>>>> reducing data to group means, xbar_i, and then do a one-sample t  
>>>> test,
>>>> the square of which is F(1, 2), but it seems to be suggested that
>>>> F(1, 297) is a conservative test???!
>>>>
>>> It's a different test, isn't it?  Your test is based upon the  
>>> between
>>> group sum of squares with 2 df.  I am proposing to use the within
>>> group sum of squares or its generalization.
>>>
>>
>> On closer examination I see that you are indeed correct.  I have  
>> heard
>> that "well-known" result many times and finally sat down to prove it
>> to myself.  For a balanced design the standard error of the intercept
>> using the REML estimates is the same as the standard error of the  
>> mean
>> calculated from the group means.
>>
>>
>>> data(Rail, package = 'nlme')
>>> library(lme4)
>>> summary(fm1 <- lmer(travel ~ 1 + (1|Rail), Rail))
>>>
>> Linear mixed-effects model fit by REML
>> Formula: travel ~ 1 + (1 | Rail)
>>    Data: Rail
>>    AIC   BIC logLik MLdeviance REMLdeviance
>>  126.2 128.0 -61.09      128.6        122.2
>> Random effects:
>>  Groups   Name        Variance Std.Dev.
>>  Rail     (Intercept) 615.286  24.8050
>>  Residual              16.167   4.0208
>> number of obs: 18, groups: Rail, 6
>>
>> Fixed effects:
>>             Estimate Std. Error t value
>> (Intercept)    66.50      10.17   6.538
>>
>>> mns <- with(Rail, tapply(travel, Rail, mean)) # group means
>>> sd(mns)/sqrt(length(mns))  # standard error matches that from lmer
>>>
>> [1] 10.17104
>>
>>> t.test(mns)
>>>
>>
>> 	One Sample t-test
>>
>> data:  mns
>> t = 6.5382, df = 5, p-value = 0.001253
>> alternative hypothesis: true mean is not equal to 0
>> 95 percent confidence interval:
>>  40.35452 92.64548
>> sample estimates:
>> mean of x
>>      66.5
>>
>>
>>> ctab <- summary(fm1)@coefs  # coefficient table
>>> ctab[,1] + c(-1,1) * qt(0.975, 15) * ctab[,2] # 95% conf. int.
>>>
>> [1] 44.82139 88.17861
>>
>>> ## interval using df = # of obs - rank of [Z:X] is too narrow
>>>
>>
>> So my proposal of using either the trace of the hat matrix or the  
>> rank
>> of the combined model matrices as the degrees of freedom for the  
>> model
>> is not conservative.
>>
>> However, look at the following
>>
>>
>>> set.seed(123454321)  # for reproducibility
>>> sm1 <- mcmcsamp(fm1, 50000)
>>> library(coda)
>>> HPDinterval(sm1)
>>>
>>                     lower      upper
>> (Intercept)     40.470663  92.608514
>> log(sigma^2)     2.060179   3.716326
>> log(Rail.(In))   5.371858   8.056897
>> deviance       128.567329 137.487455
>> attr(,"Probability")
>> [1] 0.95
>>
>> The HPD interval calculated from a MCMC sample reproduce the interval
>> from the group means almost exactly.  This makes sense in that the
>> MCMC sample takes into account the variation in the estimates of the
>> variance components, just as defining intervals based on the  
>> Student's
>> t does.
>>
>> So for this case where the distribution of the estimate of the mean
>> has a known distribution the correct degrees of freedom and the MCMC
>> sample produce similar answers.
>>
>> This gives me more confidence in the results from the MCMC sample in
>> general cases.
>>
>> The problem I have with trying to work out what the degrees of  
>> freedom
>> "should be" is that the rules seem rather arbitrary.  For example,  
>> the
>> "between-within" rule used in SAS PROC Mixed is popular (many accept
>> it as the "correct" answer) but it assumes that the degrees of  
>> freedom
>> associated with a random effect grouped by a factor with k levels is
>> always k - 1.  This value is used even when there is a random
>> intercept and a random slope for each group.  In fact you could have
>> an arbitrary number of random effects for each level of the grouping
>> factor and it would still apparently only cost you k - 1 degrees of
>> freedom.  That doesn't make sense to me.
>>
>> Anyway, I thank you for pointing out the errors of my ways Peter.
>>

> For the traditional, balanced, replicated, 2-way mixed-effects
> analysis, Scheffé (1959, Table 8.1.1, p. 269) gives the expected
> mean squares for a two-way layout with "I" levels of a fixed effect
> A, "J" levels of a random effect B, and "K" replicates, as follows:
> EMS(A: fixed) = var(e) + K*var(A:B) + J*K*MeanSquareA
> EMS(B: random) = var(e) + I*K*var(B)
> EMS(A:B; random)=var(e)+K*var(A:B)
> EMSE = var(e).
> In this case, the "obvious" test for A is MS(A: fixed) / MS
> (A:B, random), because this gives us a standard F statistic to test
> MeanSquareA = 0. However, it doesn't make sense to me to test A
> without simultaneously assuming var(A:B) = 0.

Does one want to test whether var(A:B)=0 because the F-test assumes it? That is, that as var(A:B) increases, the var ratio MS{A:fixed)/MS (A:B, random) decreases, artifactually reducing the "significance" of J:K:MeanSquareA?

> The same argument applies to Peter's "simpler" case discussed
> above: With "Y_ij = mu + z_i + eps_ij", it only rarely makes sense
> to test mu=0 while assuming var(z) != 0. In the balanced 2-way,
> mixed-effects analysis, the Neyman-Pearson thing to do, I would
> think, would be to test simultaneously MeanSquareA = 0 with var
> (A:B) = 0. In lmer, I might write this as follows:
> anova(lmer(y~A+(A|B)), lmer(y~1+(1|B)).
> However, this does NOT match the standard analysis associated
> with this design, does it? To check this, I considered problem 8.1
> in Scheffé (p. 289), which compares 3 different nozzles (fixed
> effect) tested by 5 different operators (random effect). The data
> are as follows:
> y <- c(6,6,-15, 26,12,5, 11,4,4, 21,14,7, 25,18,25,
> 13,6,13, 4,4,11, 17,10,17, -5,2,-5, 15,8,1,
> 10,10,-11, -35,0,-14, 11,-10,-17, 12,-2,-16, -4,10,24)

>

> Nozzle <- data.frame(Nozzle=rep(LETTERS[1:3], e=15),
> Operator=rep(letters[1:5], e=3), flowRate=y)
>
> The traditional analysis can be obtained from anova(lm
> (flowRate~Nozzle*Operator, ...)), but comparing MeanSq.Nozzle to
> MeanSq.Nozzle:Operator rather than MeanSquareResidual, as follows:
> > fitAB0 <- lm(flowRate~Nozzle*Operator, data=Nozzle)
> > (aov.AB0 <- anova(fitAB0))
> Analysis of Variance Table
>

> Response: flowRate
> Df Sum Sq Mean Sq F value Pr(>F)
> Nozzle 2 1426.98 713.49 7.0456 0.003101 **
> Operator 4 798.80 199.70 1.9720 0.124304
> Nozzle:Operator 8 1821.47 227.68 2.2484 0.051640 .
> Residuals 30 3038.00 101.27 ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>

> Scheffé must have computed the following:
> > (F.A.Scheffe <- aov.AB0[1, "Mean Sq"]/aov.AB0[3, "Mean Sq"])
> [1] 3.133690
> > pf(F.A.Scheffe, 1, 2, lower.tail=FALSE)
> [1] 0.2187083
>

> However, I think I prefer the likelihood ratio answer to this,
> because I think it is better to have an approximate solution to the
> exact problem than an exact solution to the approximate problem.
> [I got this from someone else like Tukey, but I don't have a
> citation.]) I can get this likelihood ratio answer from either lme
> or lmer.
> When I tried to fit this model with 'mle'; it didn't want to
> converge:
> library(nlme)
> fitAB. <- lme(flowRate~Nozzle, random=~Nozzle|Operator,
> data=Nozzle, method="ML")
> Error in lme.formula(flowRate ~ Nozzle, random = ~Nozzle |
> Operator, data = Nozzle, :
> nlminb problem, convergence error code = 1; message = iteration
> limit reached without convergence (9)
>

> After several false starts, I got the following to work:
> fitAB. <- lme(flowRate~Nozzle, random=~Nozzle|Operator,
> data=Nozzle, method="ML",
> control=lmeControl(opt="optim"))
>
> > anova(fitAB., fitB.)
> Model df AIC BIC logLik Test L.Ratio p-value
> fitAB. 1 10 361.9022 379.9688 -170.9511
> fitB. 2 3 361.3637 366.7837 -177.6819 1 vs 2 13.46153 0.0616
> >
> I got essentially the same answer from lmer (without the
> convergence problem, but quitting R in between:
> > fitAB <- lmer(flowRate~Nozzle+(Nozzle|Operator),
> + data=Nozzle, method="ML")
> > fitB <- lmer(flowRate~1+(1|Operator), data=Nozzle,
> + method="ML")
> > anova(fitAB, fitB)
> Data: Nozzle
> Models:
> fitB: flowRate ~ 1 + (1 | Operator)
> fitAB: flowRate ~ Nozzle + (Nozzle | Operator)
> Df AIC BIC logLik Chisq Chi Df Pr(>Chisq) fitB 2
> 359.36 362.98 -177.68 fitAB 9 359.88
> 376.14 -170.94 13.479 7 0.06126 .
> Comments? Spencer Graves
> p.s. For the lme fit: > sessionInfo()
> Version 2.3.1 Patched (2006-08-13 r38872)
> i386-pc-mingw32
>

> attached base packages:
> [1] "methods" "stats" "graphics" "grDevices" "utils"
> "datasets"
> [7] "base"
> other attached packages:
> nlme
> "3.1-75"
>> ______________________________________________
>> R-help@stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting- 
>> guide.html

>> and provide commented, minimal, self-contained, reproducible code. >>

R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Mon Sep 11 17:57:24 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Mon 11 Sep 2006 - 11:30:04 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.