# [R] Effect size of comparison of two levels of a factor in multiple linear regression

From: Christoph Mathys <cmathys_at_bidmc.harvard.edu>
Date: Sun, 3 Feb 2008 16:09:50 +0100

I have a linear model of the kind

outcome ~ treatment + covariate

where 'treatment' is a factor with three levels ("0", "1", and "2"), and the covariate is continuous. Treatments "1" and "2" both have regression coefficients significantly different from 0 when using treatment contrasts with treatment "0" as the baseline. I would now like to determine effect sizes (akin to Cohen's d in a two-sample comparison) for the comparison to baseline of treatments "1" and "2". I have illustrated a way to do this in the reproducible example below and am grateful for any comments on the soundness of what I'm doing. I have not yet found a way to determine confidence intervals for the effect sizes derived below and would appreciate tips on that.

set.seed(123456) # Make session reproducible

# Set up the treatment factor with three levels and 100 observations
# each

treatment <- factor(c(rep(0, 100), rep(1, 100), rep(2, 100)))

# Simulate outcomes

outcome <- rep(NA, 300)

```outcome[treatment==0] <- rnorm(100, 10, 5) # baseline: mean=10, sd=5
outcome[treatment==1] <- rnorm(100, 30, 5) # effect size 4
outcome[treatment==2] <- rnorm(100, 40, 5) # effect size 6

```

# Check effect sizes (Cohen's d)

cohens.d <- function (x, y) {(mean(x)-mean(y))/sqrt((var(x)+var(y))/2) } cohens.d(outcome[treatment==1], outcome[treatment==0]) [1] 3.984774
cohens.d(outcome[treatment==2], outcome[treatment==0]) [1] 6.167798

# Sometimes standardized regression coefficients are recommended
# for determining effect size but that clearly doesn't work here:
coef(lm(scale(outcome) ~ treatment))
(Intercept) treatment1 treatment2
-1.233366 1.453152 2.246946
# The reason it doesn't work is that the difference of outcome
# means is divided by the sd of *all* outcomes:
(mean(outcome[treatment==1])-mean(outcome[treatment==0]))/sd(outcome) [1] 1.453152
(mean(outcome[treatment==2])-mean(outcome[treatment==0]))/sd(outcome) [1] 2.246946

```     (Intercept)       treatment1       treatment2 scale(covariate)
-0.1720456        0.1994251        0.3167116        0.8753761

```

# Proposed way to determine effect size: simulate outcomes for each
# treatment level assuming the covariate to have a fixed value (here
# its mean value after standardization: zero)
library(MCMCpack)
no.of.sims <- 10000
sims.model <- MCMCregress(model, mcmc=no.of.sims) sims.model[1:2,]

```     (Intercept) treatment1 treatment2 scale(covariate)      sigma2
[1,]  -0.1780735  0.2024111  0.3395233        0.8682119 0.002617449
[2,]  -0.1521623  0.1773623  0.2956053        0.8764573 0.003529013
sims.treat0 <- rnorm(no.of.sims, sims.model[,"(Intercept)"], sqrt(sims.model[,"sigma2"]))
```
sims.treat1 <- rnorm(no.of.sims, sims.model[,"(Intercept)"] + sims.model[,"treatment1"], sqrt(sims.model[,"sigma2"])) sims.treat2 <- rnorm(no.of.sims, sims.model[,"(Intercept)"] + sims.model[,"treatment2"], sqrt(sims.model[,"sigma2"]))

# Calculate Cohen's d for simulated values
cohens.d(sims.treat1, sims.treat0)
[1] 3.683093
cohens.d(sims.treat2, sims.treat0)
[1] 5.782622

These values are reasonably close to the ones (4 and 6) I plugged in at the beginning. It would be even nicer to have a confidence interval for them, but if I bootstrap one out of the simulated outcomes its width depends on the number of simulations and is therefore arbitrary. If anyone knew a better way to get at the effect sizes I'm looking for or how I could also get confidence intervals for them, that would be greatly appreciated.

Thanks,

Christoph

```--
Christoph Mathys, M.S.
Music and Neuroimaging Laboratory
Beth Israel Deaconess Medical Center
and Harvard Medical School

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help