# [R] Sampling from multivariate multiple regression prediction regions

From: Iain Pardoe <ipardoe_at_lcbmail.uoregon.edu>
Date: Tue 10 May 2005 - 02:43:36 EST

I'd like to sample multiple response values from a multivariate regression fit. For example, suppose I have m=2 responses (y1,y2) and a single set of predictor variables (z1,z2). Each response is assumed to follow its own regression model, and the error terms in each model can be correlated (as in example 7.10 of section 7.7 of Johnson/Wichern):

> ex7.10 <-

```+   data.frame(y1 = c(141.5, 168.9, 154.8, 146.5, 172.8, 160.1, 108.5),
+              y2 = c(301.8, 396.1, 328.2, 307.4, 362.4, 369.5, 229.1),
+              z1 = c(123.5, 146.1, 133.9, 128.5, 151.5, 136.2, 92),
+              z2 = c(2.108, 9.213, 1.905, .815, 1.061, 8.603, 1.125))
```

> attach(ex7.10)
> f.mlm <- lm(cbind(y1,y2)~z1+z2)
> y.hat <- c(1, 130, 7.5) %*% coef(f.mlm)
> round(y.hat, 2)

y1 y2
[1,] 151.84 349.63
> qf.z <- t(c(1, 130, 7.5)) %*%
+ solve(t(cbind(1,z1,z2)) %*% cbind(1,z1,z2)) %*% + c(1, 130, 7.5)
> round(qf.z, 5)

[,1]
[1,] 0.36995
> n.sigma.hat <- SSD(f.mlm)\$SSD # same as t(resid(f.mlm)) %*%
resid(f.mlm)
> round(n.sigma.hat, 2)

y1 y2
y1 5.80 5.22
y2 5.22 12.57
> F.quant <- qf(.95,2,3)
> round(F.quant, 2)
 9.55

This gives me all the information I need to calculate a 95% confidence ellipse for y=(y1,y2) at (z1,z2)=(130,7.5) using JW's equation (7-48) (written using R syntax, although R cannot "literally" calculate this as it is written):

(y-y.hat) %*% ((n-r-1) * solve(n.sigma.hat)) %*% t(y-y.hat) <= (1+qf.z) * (m*(n-r-1)/(n-r-m)) * F.quant

But, what if instead I'd like to sample (y1,y2) values from this distribution? I can sample from an F(m,n-r-m) distribution easily enough, but then how can I transform this to a single point in (y1,y2) space?

Any ideas would be gratefully appreciated. Thanks.

Iain Pardoe <ipardoe@lcbmail.uoregon.edu> University of Oregon

R-help@stat.math.ethz.ch mailing list