# Re: [R] Conservative "ANOVA tables" in lmer

From: Peter Dalgaard <p.dalgaard_at_biostat.ku.dk>
Date: Thu 07 Sep 2006 - 15:20:29 GMT

> >>>>> "DB" == Douglas Bates <bates@stat.wisc.edu>
> >>>>> on Thu, 7 Sep 2006 07:59:58 -0500 writes:
>
> DB> Thanks for your summary, Hank.
> DB> On 9/7/06, Martin Henry H. Stevens <hstevens@muohio.edu> wrote:
> >> Dear lmer-ers,
> >> My thanks to all of you who are sharing your trials and tribulations
> >> publicly.
>
> >> I was hoping to elicit some feedback on my thoughts on denominator
> >> degrees of freedom for F ratios in mixed models. These thoughts and
> >> practices result from my reading of previous postings by Doug Bates
> >> and others.
>
> >> - I start by assuming that the appropriate denominator degrees of
> >> freedom lie between n - p and n - q, where n=number of observations,
> >> p=number of fixed effects (rank of model matrix X), and q=rank of Z:X.
>
> DB> I agree with this but the opinion is by no means universal. Initially
> DB> I misread the statement because I usually write the number of columns
> DB> of Z as q.
>
> DB> It is not easy to assess rank of Z:X numerically. In many cases one
> DB> can reason what it should be from the form of the model but a general
> DB> procedure to assess the rank of a matrix, especially a sparse matrix,
> DB> is difficult.
>
> DB> An alternative which can be easily calculated is n - t where t is the
> DB> trace of the 'hat matrix'. The function 'hatTrace' applied to a
> DB> fitted lmer model evaluates this trace (conditional on the estimates
> DB> of the relative variances of the random effects).
>
> >> - I then conclude that good estimates of P values on the F ratios lie
> >> between 1 - pf(F.ratio, numDF, n-p) and 1 - pf(F.ratio, numDF, n-q).
> >> -- I further surmise that the latter of these (1 - pf(F.ratio, numDF,
> >> n-q)) is the more conservative estimate.
>
> This assumes that the true distribution (under H0) of that "F ratio"
> *is* F_{n1,n2} for some (possibly non-integer) n1 and n2.
> But AFAIU, this is only approximately true at best, and AFAIU,
> the quality of this approximation has only been investigated
> empirically for some situations.
> Hence, even your conservative estimate of the P value could be
> wrong (I mean "wrong on the wrong side" instead of just
> "conservatively wrong"). Consequently, such a P-value is only
> ``approximately conservative'' ...
> I agree, however, that in some situations, it might be a very
> useful "descriptive statistic" about the fitted model.

I'm very wary of ANY attempt at guesswork in these matters.

I may be misunderstanding the post, but consider this case: Y_ij = mu + z_i + eps_ij, i = 1..3, j = 1..100

I get rank(X)=1, rank(X:Z)=3, n=300
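A minimal sketch of how these ranks could be checked in base R, assuming X is just the intercept column and Z is the 300 x 3 matrix of group indicators (the object names here are illustrative, not taken from lmer):

```r
# Design for Y_ij = mu + z_i + eps_ij, i = 1..3, j = 1..100
g <- gl(3, 100)                       # grouping factor: 3 levels, 100 obs each
n <- length(g)                        # number of observations
X <- matrix(1, nrow = n, ncol = 1)    # fixed effects: intercept only
Z <- model.matrix(~ 0 + g)            # group indicator columns, n x 3

rank_X  <- qr(X)$rank                 # rank of the fixed-effects model matrix
rank_XZ <- qr(cbind(X, Z))$rank       # rank of X and Z combined; the intercept
                                      # is the sum of the columns of Z

c(n = n, rank_X = rank_X, rank_XZ = rank_XZ)
```

A dense QR decomposition is fine for a toy design like this one; as noted earlier in the thread, assessing the rank of a large sparse matrix is a harder problem.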

It is well known that the test for mu=0 in this case is obtained by reducing the data to the group means, xbar_i, and then doing a one-sample t test, the square of which is F(1, 2). Yet, with n - q = 300 - 3 = 297, it seems to be suggested that F(1, 297) is a conservative test???!
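The gap between the two reference distributions can be sketched in base R. This is only an illustration under assumed values: the seed and the standard deviations of z_i and eps_ij (both set to 1) are arbitrary choices, not part of the original example.

```r
set.seed(1)                                   # arbitrary seed, for reproducibility only

g  <- gl(3, 100)                              # 3 groups, 100 observations each
mu <- 0                                       # H0: mu = 0 holds in this simulation
z  <- rnorm(3, sd = 1)                        # group effects (sd = 1 is an assumption)
y  <- mu + z[as.integer(g)] + rnorm(length(g), sd = 1)  # residual sd = 1, also assumed

xbar <- as.numeric(tapply(y, g, mean))        # reduce the data to the 3 group means
tt   <- t.test(xbar, mu = 0)                  # one-sample t test on 3 - 1 = 2 df
F_stat <- unname(tt$statistic)^2              # t^2 is the F ratio for mu = 0

p_f12   <- pf(F_stat, 1, 2,   lower.tail = FALSE)  # referred to F(1, 2)
p_f1297 <- pf(F_stat, 1, 297, lower.tail = FALSE)  # referred to F(1, 297)

# Actual size of a nominal 5% test when a statistic that is F(1, 2) under H0
# is compared with the 5% critical value of F(1, 297)
size_f1297 <- pf(qf(0.95, 1, 297), 1, 2, lower.tail = FALSE)

c(F = F_stat, p_F_1_2 = p_f12, p_F_1_297 = p_f1297, size = size_f1297)
```

For any positive F ratio the F(1, 297) P-value is smaller than the F(1, 2) one, and the last quantity shows that the nominal 5% test based on F(1, 297) rejects far more often than 5% under H0, i.e. it errs on the anti-conservative side.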

```--
O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)                  FAX: (+45) 35327907

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.
```
Received on Fri Sep 08 01:26:32 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Thu 07 Sep 2006 - 17:46:09 GMT.
