From: Douglas Bates <bates_at_stat.wisc.edu>

Date: Mon 18 Sep 2006 - 18:09:29 GMT

sqdist <- colSums(std * std)

sum(sqdist[-1] > sqdist[1])/nrow(samp) }

*> AdgS1 <- mcmcsamp(fm1Adg, 50000)
*

> library(coda)

*> HPDinterval(AdgS1)
*

[1] 0.95

*> mcmcpvalue(as.matrix(AdgS1)[, 6:8])
*

[1] 0.46876

Treatment20 -0.102 -0.053 0.502

Treatment30 -0.338 0.224 0.454 0.475

*> AdgS2 <- mcmcsamp(fm2Adg, 50000)
*

*> HPDinterval(AdgS2)
*

[1] 0.95

*> mcmcpvalue(as.matrix(AdgS2[,3:5]))
*

[1] 0.00026

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Tue Sep 19 05:14:55 2006

Date: Mon 18 Sep 2006 - 18:09:29 GMT

On 9/12/06, Douglas Bates <bates@stat.wisc.edu> wrote:

> On 9/11/06, Manuel Morales <Manuel.A.Morales@williams.edu> wrote:

[snip]

> > Am I right that the MCMC sample can not be used, however, to evaluate

*> > the significance of parameter groups. For example, to assess the
**> > significance of a three-level factor? Are there better alternatives than
**> > simply adjusting the CI for the number of factor levels
**> > (1-alpha/levels).
**>
**> Hmm - I'm not sure what confidence interval and what number of levels
**> you mean there so I can't comment on that method.
**>
**> Suppose we go back to Spencer's example and consider if there is a
**> signficant effect for the Nozzle factor. That is equivalent to the
**> hypothesis H_0: beta_2 = beta_3 = 0 versus the general alternative. A
**> "p-value" could be formulated from an MCMC sample if we assume that
**> the marginal distribution of the parameter estimates for beta_2 and
**> beta_3 has roughly elliptical contours and you can evaluate that by,
**> say, examining a hexbin plot of the values in the MCMC sample. One
**> could take the ellipses as defined by the standard errors and
**> estimated correlation or, probably better, by the observed standard
**> deviations and correlations in the MCMC sample. Then determine the
**> proportion of (beta_2, beta_3) pairs in the sample that fall outside
**> the ellipse centered at the estimates and with that eccentricity and
**> scaling factors that passes through (0,0). That would be an empirical
**> p-value for the test.
**>
**> I would recommend calculating this for a couple of samples to check on
**> the reproducibility.
*

There was some follow-up on this thread, including some code and results that I find encouraging. I didn't notice that the R-help list had been dropped off the cc: in later exchanges. I enclose my contribution to the conversation so that others on the list will get to see it.

- exerpt from previous private message ----

As soon as I described the idea I knew that someone would ask for a function to perform it. Here's one

mcmcpvalue <- function(samp)

{

## elementary version that creates an empirical p-value for the ## hypothesis that the columns of samp have mean zero versus a ## general multivariate distribution with elliptical contours.

## differences from the mean standardized by the observed
## variance-covariance factor

std <- backsolve(chol(var(samp)),

cbind(0, t(samp)) - colMeans(samp), transpose = TRUE)

sqdist <- colSums(std * std)

sum(sqdist[-1] > sqdist[1])/nrow(samp) }

At least I think I have the standardization by the Cholesky factor of the observed variance-covariance matrix correct. However I always manage to confuse myself on that calculation so please let me know if I have it wrong.

As an example, consider a model fit to the AvgDailyGain data from the
SASmixed package.

*> library(nlme)
**> data(AvgDailyGain, package = "SASmixed")
**> summary(fm1Adg <- lme(adg ~ InitWt*Treatment - 1, AvgDailyGain, random = ~1|Block))
*

Linear mixed-effects model fit by REML

Data: AvgDailyGain

AIC BIC logLik

85.32685 97.10739 -32.66342

Random effects:

Formula: ~1 | Block

(Intercept) Residual

StdDev: 0.5092266 0.2223268

Fixed effects: adg ~ InitWt * Treatment - 1

Value Std.Error DF t-value p-value InitWt 0.0022937 0.0017473 17 1.3126947 0.2067 Treatment0 0.4391370 0.7110881 17 0.6175564 0.5451 Treatment10 1.4261187 0.6375458 17 2.2368880 0.0390 Treatment20 0.4796285 0.5488867 17 0.8738206 0.3944 Treatment30 0.2001071 0.7751989 17 0.2581365 0.7994 InitWt:Treatment10 -0.0012108 0.0023326 17 -0.5190774 0.6104InitWt:Treatment20 0.0010720 0.0021737 17 0.4931507 0.6282 InitWt:Treatment30 0.0021543 0.0027863 17 0.7731996 0.4500 Correlation:

InitWt Trtmn0 Trtm10 Trtm20 Trtm30 IW:T10 IW:T20 Treatment0 -0.961 Treatment10 0.034 0.039 Treatment20 0.003 0.080 0.334 Treatment30 0.050 0.011 0.097 0.043 InitWt:Treatment10 -0.772 0.742 -0.631 -0.164 -0.059InitWt:Treatment20 -0.806 0.775 -0.180 -0.555 -0.019 0.724 InitWt:Treatment30 -0.666 0.640 -0.046 0.024 -0.754 0.529 0.520

Standardized Within-Group Residuals:

Min Q1 Med Q3 Max -1.82903364 -0.44913967 -0.03023488 0.44738506 1.59877700

Number of Observations: 32

Number of Groups: 8

*> anova(fm1Adg)
*

numDF denDF F-value p-value InitWt 1 17 91.68230 <.0001 Treatment 4 17 8.81312 0.0005 InitWt:Treatment 3 17 0.93118 0.4471

Fitting the same model in lmer then generating an MCMC sample and testing for the three interaction coefficients being zero would look like

*> data(AvgDailyGain, package = "SASmixed")
**> (fm1Adg <- lmer(adg ~ InitWt*Treatment - 1 + (1|Block), AvgDailyGain))
*

Linear mixed-effects model fit by REML

Formula: adg ~ InitWt * Treatment - 1 + (1 | Block)
Data: AvgDailyGain

AIC BIC logLik MLdeviance REMLdeviance
83.33 96.52 -32.66 10.10 65.33
Random effects:

Groups Name Variance Std.Dev.

Block (Intercept) 0.25930 0.50922

Residual 0.04943 0.22233

number of obs: 32, groups: Block, 8

Fixed effects:

Estimate Std. Error t value InitWt 0.002294 0.001747 1.3127 Treatment0 0.439128 0.711092 0.6175 Treatment10 1.426113 0.637549 2.2369 Treatment20 0.479621 0.548889 0.8738 Treatment30 0.200115 0.775204 0.2581 InitWt:Treatment10 -0.001211 0.002333 -0.5191InitWt:Treatment20 0.001072 0.002174 0.4931 InitWt:Treatment30 0.002154 0.002786 0.7732

Correlation of Fixed Effects:

InitWt Trtmn0 Trtm10 Trtm20 Trtm30 IW:T10 IW:T20

Treatment0 -0.961 Treatment10 0.034 0.039 Treatment20 0.003 0.080 0.334 Treatment30 0.050 0.011 0.097 0.043 IntWt:Trt10 -0.772 0.742 -0.631 -0.164 -0.059 IntWt:Trt20 -0.806 0.775 -0.180 -0.555 -0.019 0.724 IntWt:Trt30 -0.666 0.640 -0.046 0.024 -0.754 0.529 0.520

> library(coda)

lower upper InitWt -0.001402986 0.006164517 Treatment0 -1.144313821 1.925351624 Treatment10 0.047128188 2.776309715 Treatment20 -0.761106539 1.602588258 Treatment30 -1.440932884 1.887536852 InitWt:Treatment10 -0.006309270 0.003569772 InitWt:Treatment20 -0.003578127 0.005614077 InitWt:Treatment30 -0.004003873 0.008055591 log(sigma^2) -3.617259562 -2.198161379 log(Blck.(In)) -2.438121552 0.002976392 deviance 12.769059400 32.841699073attr(,"Probability")

[1] 0.95

[1] 0.46876

If we drop the interaction term and consider the treatment term the test becomes

*> (fm2Adg <- lmer(adg ~ InitWt + Treatment + (1|Block), AvgDailyGain))
*

Linear mixed-effects model fit by REML

Formula: adg ~ InitWt + Treatment + (1 | Block)
Data: AvgDailyGain

AIC BIC logLik MLdeviance REMLdeviance
48.34 57.13 -18.17 13.62 36.34
Random effects:

Groups Name Variance Std.Dev.

Block (Intercept) 0.24084 0.49076

Residual 0.05008 0.22379

number of obs: 32, groups: Block, 8

Fixed effects:

Estimate Std. Error t value (Intercept) 0.2490338 0.3776318 0.659 InitWt 0.0027797 0.0008334 3.336 Treatment10 0.4835075 0.1128530 4.284 Treatment20 0.4639445 0.1120505 4.140 Treatment30 0.5520737 0.1148132 4.808 Correlation of Fixed Effects: (Intr) InitWt Trtm10 Trtm20 InitWt -0.863 Treatment10 -0.035 -0.130

Treatment20 -0.102 -0.053 0.502

Treatment30 -0.338 0.224 0.454 0.475

lower upper (Intercept) -0.579312545 1.044946476 InitWt 0.001114375 0.004616652 Treatment10 0.246254015 0.714185069 Treatment20 0.220236717 0.686356392 Treatment30 0.316078702 0.797956531 log(sigma^2) -3.553700311 -2.280238850 log(Blck.(In)) -2.448486301 -0.072657242 deviance 14.687593543 29.699825900attr(,"Probability")

[1] 0.95

[1] 0.00026

so these p-values seem to be in line with the results from the analysis of variance p-values using one of the formula for calculation of a residual degrees of freedom.

Related to the other question of the use of a likelihood ratio test for the fixed effects, the p-values for the likelihood ratio tests seem quite different

*> anova(fm2Adg, fm1Adg)
*

Data: AvgDailyGain

Models: fm2Adg: adg ~ InitWt + Treatment + (1 | Block) fm1Adg: adg ~ InitWt * Treatment - 1 + (1 | Block) Df AIC BIC logLik Chisq Chi Df Pr(>Chisq) fm2Adg 6 25.623 34.417 -6.812 fm1Adg 9 28.098 41.290 -5.049 3.5249 3 0.3176

*> fm3Adg <- lmer(adg ~ InitWt + (1|Block), AvgDailyGain)
**> anova(fm2Adg, fm3Adg)
*

Data: AvgDailyGain

Models: fm3Adg: adg ~ InitWt + (1 | Block) fm2Adg: adg ~ InitWt + Treatment + (1 | Block) Df AIC BIC logLik Chisq Chi Df Pr(>Chisq) fm3Adg 3 41.913 46.310 -17.957 fm2Adg 6 25.623 34.417 -6.812 22.29 3 5.677e-05 ***

and it is not just a matter of the evaluation of the log-likelihood

*> fm2AdgML <- update(fm2Adg, method = "ML")
**> fm3AdgML <- update(fm3Adg, method = "ML")
**> anova(fm3AdgML, fm2AdgML)
*

Data: AvgDailyGain

Models:

fm3AdgML: adg ~ InitWt + (1 | Block)

fm2AdgML: adg ~ InitWt + Treatment + (1 | Block)

Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)
fm3AdgML 3 41.882 46.279 -17.941

fm2AdgML 6 25.617 34.412 -6.809 22.265 3 5.746e-05 ***

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Tue Sep 19 05:14:55 2006

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.1.8, at Mon 18 Sep 2006 - 19:30:05 GMT.

*
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help.
Please read the posting
guide before posting to the list.
*