Re: [R] Help with Mahalanobis

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Mon 11 Jul 2005 - 09:08:50 EST

And here is one more simplification using the buildin mahalanobis function:

D2Mah3 = function(y, x) {

 stopifnot(is.data.frame(y), !missing(x))  stopifnot(dim(y)[1] != dim(x)[1])

 y    = as.matrix(y)
 x    = as.factor(x)
 man  = manova(y ~ x)
 E    = summary(man)$SS[2] #Matrix E
 S    = as.matrix(E$Residuals)/man$df.residual
 InvS = solve(S)
 mds  = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y))

 nObjects = nrow(mds)

 ### changed part is next two statements  f <- function(a,b) mapply(function(a,b)    mahalanobis(mds[a,],mds[b,],InvS,TRUE), a, b)

 D2 <- outer(seq(nObjects), seq(nObjects), f) }

#
# test
#

D2M3 = D2Mah3(iris[,1:4], iris[,5])

On 7/10/05, Gabor Grothendieck <ggrothendieck@gmail.com> wrote:
> I think you could simplify this by replacing everything after the
> nObjects = nrow(mds) line with just these two statements.
>
> f <- function(a,b) mapply(function(a,b)
> (mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,]), a,b)
>
> D2 <- outer(seq(nObjects), seq(nObjects), f)
>
> This also eliminates dependence on gtools and the complexity
> of dealing with triangular matrices.
>
> Regards.
>
> Here it is in full:
>
> D2Mah2 = function(y, x) {
>
> stopifnot(is.data.frame(y), !missing(x))
> stopifnot(dim(y)[1] != dim(x)[1])
> y = as.matrix(y)
> x = as.factor(x)
> man = manova(y ~ x)
> E = summary(man)$SS[2] #Matrix E
> S = as.matrix(E$Residuals)/man$df.residual
> InvS = solve(S)
> mds = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y))
>
> library(gtools)
> nObjects = nrow(mds)
>
> ### changed part is next two statements
> f <- function(a,b) mapply(function(a,b)
> (mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,]), a,b)
>
> D2 <- outer(seq(nObjects), seq(nObjects), f)
> }
>
> #
> # test
> #
> D2M2 = D2Mah2(iris[,1:4], iris[,5])
> print(D2M2)
>
>
>
>
> On 7/10/05, Jose Claudio Faria <joseclaudio.faria@terra.com.br> wrote:
> > Well, as I did not get a satisfactory reply to the original question I tried to
> > make a basic function that, I find, solve the question.
> >
> > I think it is not the better function, but it is working.
> >
> > So, perhaps it can be useful to other people.
> >
> > #
> > # Calculate the matrix of Mahalanobis Distances between groups
> > # from data.frames
> > #
> > # by: José Cláudio Faria
> > # date: 10/7/05 13:23:48
> > #
> >
> > D2Mah = function(y, x) {
> >
> > stopifnot(is.data.frame(y), !missing(x))
> > stopifnot(dim(y)[1] != dim(x)[1])
> > y = as.matrix(y)
> > x = as.factor(x)
> > man = manova(y ~ x)
> > E = summary(man)$SS[2] #Matrix E
> > S = as.matrix(E$Residuals)/man$df.residual
> > InvS = solve(S)
> > mds = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y))
> >
> > colnames(mds) = names(y)
> > Objects = levels(x)
> > rownames(mds) = Objects
> >
> > library(gtools)
> > nObjects = nrow(mds)
> > comb = combinations(nObjects, 2)
> >
> > tmpD2 = numeric()
> > for (i in 1:dim(comb)[1]){
> > a = comb[i,1]
> > b = comb[i,2]
> > tmpD2[i] = (mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,])
> > }
> >
> > # Thanks Gabor for the below
> > tmpMah = matrix(0, nObjects, nObjects, dimnames=list(Objects, Objects))
> > tmpMah[lower.tri(tmpMah)] = tmpD2
> > D2 = tmpMah + t(tmpMah)
> > return(D2)
> > }
> >
> > #
> > # To try
> > #
> > D2M = D2Mah(iris[,1:4], iris[,5])
> > print(D2M)
> >
> > Thanks all for the complementary aid (specially to Gabor).
> >
> > Regards,
> > --
> > Jose Claudio Faria
> > Brasil/Bahia/UESC/DCET
> > Estatistica Experimental/Prof. Adjunto
> > mails:
> > joseclaudio.faria@terra.com.br
> > jc_faria@uesc.br
> > jc_faria@uol.com.br
> > tel: 73-3634.2779
> >
> > ______________________________________________
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
> >
>



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Mon Jul 11 10:15:36 2005

This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:33:27 EST