# Re: [R] Pooled Covariance Matrix

From: Murray Jorgensen <maj_at_waikato.ac.nz>
Date: Wed 20 Sep 2006 - 21:13:38 GMT

Thank you, Professor Ripley.

Murray Jorgensen

Prof Brian Ripley wrote:
> On Wed, 20 Sep 2006, Murray Jorgensen wrote:
>

```
>> I am in a discriminant analysis situation with a frame containing
>> several variables and a grouping factor, if you like:
>>
>> set.seed(200906)
>> exampledf <- as.data.frame(matrix(rnorm(50,5,2),nrow=10,ncol=5))
>> exampledf$Group <- factor(rep(c(1,2,3),c(3,3,4)))
>> exampledf
>>
>> I'm sure there must be a simple way to get the within group pooled
>> covariance matrix but I haven't found it yet.
```

>
> There are two versions of this, weighted and unweighted, and the
> difference caused confusion in the early discriminant analysis
> literature. (See MASS4 p.333.) The weighted version is conventional.
>
> Suppose you have a matrix X and a grouping factor g. Then either of
>
> group.means <- rowsum(X, g)/as.vector(table(g))
> group.means <- tapply(X, list(rep(g, ncol(X)), col(X)), mean)
>
> gives the group means, and var(X - group.means[g,]) seems to be what you
> want.
>
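For readers following along, the suggestion above can be tried on the example data frame from the original question. This is only a sketch: the names `R`, `S.var`, `S.pooled` and `S.check` are made up here for illustration and do not appear in the thread. One point worth checking for yourself is that `var()` divides by n - 1, whereas the pooled within-group estimate is usually defined with divisor n - k (k being the number of groups), so the two differ by the constant factor (n - 1)/(n - k).

```r
# Rebuild the example data from the original question
set.seed(200906)
exampledf <- as.data.frame(matrix(rnorm(50, 5, 2), nrow = 10, ncol = 5))
exampledf$Group <- factor(rep(c(1, 2, 3), c(3, 3, 4)))

X <- as.matrix(exampledf[, 1:5])   # numeric variables only
g <- exampledf$Group               # grouping factor

# Group means, one row per level of g (rows come out in level order)
group.means <- rowsum(X, g) / as.vector(table(g))

# The tapply() form from the thread should give the same numbers
group.means2 <- tapply(X, list(rep(g, ncol(X)), col(X)), mean)
all.equal(group.means, group.means2, check.attributes = FALSE)

# Centre each observation on its own group mean
R <- X - group.means[g, ]

n <- nrow(X)
k <- nlevels(g)

S.var    <- var(R)                   # divisor n - 1
S.pooled <- crossprod(R) / (n - k)   # divisor n - k

# Cross-check: sum of (n_i - 1) * S_i over groups, divided by n - k
S.check <- Reduce(`+`, lapply(split(as.data.frame(X), g),
                              function(d) (nrow(d) - 1) * var(d))) / (n - k)

all.equal(S.pooled, S.check, check.attributes = FALSE)
all.equal(S.var * (n - 1) / (n - k), S.pooled)
```

If the `all.equal()` calls return `TRUE`, the residual-based calculation and the group-by-group calculation agree, and `var(R)` matches them after rescaling by (n - 1)/(n - k).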
```
>> I started thinking that one might begin by forming a frame with the same
>> dimensions but containing the group means. But then I found a thread
>> from two years back called "Getting the groupmean for each person" which
>> seemed to imply that doing this was a bit subtle even for ncol=1. Hence
>> I will risk a question to the list.
```

>
> That thread seems to be about efficiency for very large matrices in the R
> of two years ago.
>
```
--
Dr Murray Jorgensen      http://www.stats.waikato.ac.nz/Staff/maj.html
Department of Statistics, University of Waikato, Hamilton, New Zealand
Email: maj@waikato.ac.nz                                Fax 7 838 4155
Phone  +64 7 838 4773 wk    Home +64 7 825 0441    Mobile 021 1395 862
```

