Re: [R] two-way categorical anova post-hoc data extraction

From: hadley wickham <h.wickham_at_gmail.com>
Date: Wed, 12 Dec 2007 13:59:47 -0600

> Hi list,
>
> I have a question regarding post-hoc extraction of data from a two-way categorical anova.
>
> I have a categorical anova of this form:
>
> width ~ steepness + patchiness (4 steepness levels, 4 patchiness levels)
>
> This simple setup answers if for the widths I collected across different levels of steepness and patchiness significant differences can be found. Is there a way to look at these differences in detail. Lets say that the steepness parameter is significant, then I would like to know between which levels they are significant or if there are levels where this isn't the case.
>
> It's a basic question but my R knowledge has faded somewhat...

I've never found this particularly easy to do. For a recent client I wrote:

library(effects)
library(multcomp)
library(multcompView)


effectsum <- function(model, effect) {
  effects <- as.data.frame(all.effects(model)[[effect]])

  mcp <- list("Tukey")
  names(mcp) <- effect
  class(mcp) <- "mcp"

  glht_sum <- summary(glht(model, linfct = mcp))   p <- as.vector(glht_sum$test$pvalues)
  names(p) <- gsub(" ", "", names(glht_sum$test$tstat))

  groups <- multcompLetters(p)
  effects[, 1] <- factor(effects[, 1])
  effects$group <- groups$Letters[as.character(effects[, 1])]

  effects
}

which allows you to do:

mtcars$cyl <- factor(mtcars$cyl)
simple <- lm(mpg ~ wt + cyl, data=mtcars) effects <- effectsum(simple, "cyl")

library(ggplot2)
qplot(cyl, fit, data=effects, min = lower, max = upper, geom="pointrange", ylab="Mean effect") + geom_text(aes(label = group, y = min(lower) - diff(range(lower)) * 0.07))

ggplot(mtcars, aes(x = cyl, y = mpg)) + geom_crossbar(aes(min=lower, max=upper, y=fit), data=effects, width=0.2) + geom_point(aes(colour=wt))

This gives lsmeans (aka population marginal means) with their standard errors, and groups generated from all pairwise comparisons adjusted for multiple comparisons.

I would love to improve this code: to deal with all factors in a model automatically. Additionally, all.effects and multcompLetters are rather fragile with respect to level names - if you get an error try removing any non-alphanumeric characters,

Hadley

-- 
http://had.co.nz/

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Wed 12 Dec 2007 - 20:04:41 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 12 Dec 2007 - 20:30:19 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.