Re: [R] Help with Mahalanobis

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Mon 11 Jul 2005 - 03:11:15 EST

f <- function(a,b) mapply(function(a,b)     (mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,]), a,b)

D2 <- outer(seq(nObjects), seq(nObjects), f)

This also eliminates dependence on gtools and the complexity of dealing with triangular matrices.

Regards.

Here it is in full:

D2Mah2 = function(y, x) {

stopifnot(is.data.frame(y), !missing(x))   stopifnot(dim(y)[1] != dim(x)[1])

```  y    = as.matrix(y)
x    = as.factor(x)
man  = manova(y ~ x)
E    = summary(man)\$SS[2] #Matrix E
S    = as.matrix(E\$Residuals)/man\$df.residual
InvS = solve(S)
mds  = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y))

```

library(gtools)
nObjects = nrow(mds)

### changed part is next two statements   f <- function(a,b) mapply(function(a,b)     (mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,]), a,b)

D2 <- outer(seq(nObjects), seq(nObjects), f) }

```#
# test
#
```

D2M2 = D2Mah2(iris[,1:4], iris[,5])
print(D2M2)

On 7/10/05, Jose Claudio Faria <joseclaudio.faria@terra.com.br> wrote:
> Well, as I did not get a satisfactory reply to the original question I tried to
> make a basic function that, I find, solve the question.
>
> I think it is not the better function, but it is working.
>
> So, perhaps it can be useful to other people.
>
> #
> # Calculate the matrix of Mahalanobis Distances between groups
> # from data.frames
> #
> # by: José Cláudio Faria
> # date: 10/7/05 13:23:48
> #
>
> D2Mah = function(y, x) {
>
> stopifnot(is.data.frame(y), !missing(x))
> stopifnot(dim(y)[1] != dim(x)[1])
> y = as.matrix(y)
> x = as.factor(x)
> man = manova(y ~ x)
> E = summary(man)\$SS[2] #Matrix E
> S = as.matrix(E\$Residuals)/man\$df.residual
> InvS = solve(S)
> mds = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y))
>
> colnames(mds) = names(y)
> Objects = levels(x)
> rownames(mds) = Objects
>
> library(gtools)
> nObjects = nrow(mds)
> comb = combinations(nObjects, 2)
>
> tmpD2 = numeric()
> for (i in 1:dim(comb)[1]){
> a = comb[i,1]
> b = comb[i,2]
> tmpD2[i] = (mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,])
> }
>
> # Thanks Gabor for the below
> tmpMah = matrix(0, nObjects, nObjects, dimnames=list(Objects, Objects))
> tmpMah[lower.tri(tmpMah)] = tmpD2
> D2 = tmpMah + t(tmpMah)
> return(D2)
> }
>
> #
> # To try
> #
> D2M = D2Mah(iris[,1:4], iris[,5])
> print(D2M)
>
> Thanks all for the complementary aid (specially to Gabor).
>
> Regards,
> --
> Jose Claudio Faria
> Brasil/Bahia/UESC/DCET
> mails:
> joseclaudio.faria@terra.com.br
> jc_faria@uesc.br
> jc_faria@uol.com.br
> tel: 73-3634.2779
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help