Re: [R] Help with Mahalanobis

From: Jose Claudio Faria <joseclaudio.faria_at_terra.com.br>
Date: Mon 11 Jul 2005 - 09:13:05 EST

Indeed, it is very nice Gabor (as always)!

So, a doubt: how to preserve the 'rowname' and 'colname' of D2, like in the first function? I think it is useful to posterior analyzes (as cluster, for example).

Regards,

# A small correction (reference to gtools was eliminated) D2Mah2 = function(y, x) {

   stopifnot(is.data.frame(y), !missing(x))    stopifnot(dim(y)[1] != dim(x)[1])

   y    = as.matrix(y)
   x    = as.factor(x)
   man  = manova(y ~ x)
   E    = summary(man)$SS[2] #Matrix E
   S    = as.matrix(E$Residuals)/man$df.residual
   InvS = solve(S)
   mds  = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y))
   nObjects = nrow(mds)
   f = function(a,b) mapply(function(a,b)      (mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,]), a, b)    D2 = outer(seq(nObjects), seq(nObjects), f) }
#
# test
#

D2M2 = D2Mah2(iris[,1:4], iris[,5])
print(D2M2)

Gabor Grothendieck wrote:
> I think you could simplify this by replacing everything after the
> nObjects = nrow(mds) line with just these two statements.
>
> f <- function(a,b) mapply(function(a,b)
> (mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,]), a,b)
>
> D2 <- outer(seq(nObjects), seq(nObjects), f)
>
> This also eliminates dependence on gtools and the complexity
> of dealing with triangular matrices.
>
> Regards.
>
> Here it is in full:
>
> D2Mah2 = function(y, x) {
>
> stopifnot(is.data.frame(y), !missing(x))
> stopifnot(dim(y)[1] != dim(x)[1])
> y = as.matrix(y)
> x = as.factor(x)
> man = manova(y ~ x)
> E = summary(man)$SS[2] #Matrix E
> S = as.matrix(E$Residuals)/man$df.residual
> InvS = solve(S)
> mds = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y))
>
> library(gtools)
> nObjects = nrow(mds)
>
> ### changed part is next two statements
> f <- function(a,b) mapply(function(a,b)
> (mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,]), a,b)
>
> D2 <- outer(seq(nObjects), seq(nObjects), f)
> }
>
> #
> # test
> #
> D2M2 = D2Mah2(iris[,1:4], iris[,5])
> print(D2M2)
>
>
>
>
> On 7/10/05, Jose Claudio Faria <joseclaudio.faria@terra.com.br> wrote:
>

>>Well, as I did not get a satisfactory reply to the original question I tried to
>>make a basic function that, I find, solve the question.
>>
>>I think it is not the better function, but it is working.
>>
>>So, perhaps it can be useful to other people.
>>
>>#
>># Calculate the matrix of Mahalanobis Distances between groups
>># from data.frames
>>#
>># by: José Cláudio Faria
>># date: 10/7/05 13:23:48
>>#
>>
>>D2Mah = function(y, x) {
>>
>>  stopifnot(is.data.frame(y), !missing(x))
>>  stopifnot(dim(y)[1] != dim(x)[1])
>>  y    = as.matrix(y)
>>  x    = as.factor(x)
>>  man  = manova(y ~ x)
>>  E    = summary(man)$SS[2] #Matrix E
>>  S    = as.matrix(E$Residuals)/man$df.residual
>>  InvS = solve(S)
>>  mds  = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y))
>>
>>  colnames(mds) = names(y)
>>  Objects       = levels(x)
>>  rownames(mds) = Objects
>>
>>  library(gtools)
>>  nObjects = nrow(mds)
>>  comb     = combinations(nObjects, 2)
>>
>>  tmpD2 = numeric()
>>  for (i in 1:dim(comb)[1]){
>>    a = comb[i,1]
>>    b = comb[i,2]
>>    tmpD2[i] = (mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,])
>>  }
>>
>>  # Thanks Gabor for the below
>>  tmpMah = matrix(0, nObjects, nObjects, dimnames=list(Objects, Objects))
>>  tmpMah[lower.tri(tmpMah)] = tmpD2
>>  D2 = tmpMah + t(tmpMah)
>>  return(D2)
>>}
>>
>>#
>># To try
>>#
>>D2M = D2Mah(iris[,1:4], iris[,5])
>>print(D2M)
>>
>>Thanks all for the complementary aid (specially to Gabor).
>>
>>Regards,
>>--
>>Jose Claudio Faria
>>Brasil/Bahia/UESC/DCET
>>Estatistica Experimental/Prof. Adjunto
>>mails:
>> joseclaudio.faria@terra.com.br
>> jc_faria@uesc.br
>> jc_faria@uol.com.br
>>tel: 73-3634.2779
>>
>>______________________________________________
>>R-help@stat.math.ethz.ch mailing list
>>https://stat.ethz.ch/mailman/listinfo/r-help
>>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>>

>
>
> Esta mensagem foi verificada pelo E-mail Protegido Terra.
> Scan engine: McAfee VirusScan / Atualizado em 08/07/2005 / Versão: 4.4.00 - Dat 4531
> Proteja o seu e-mail Terra: http://mail.terra.com.br/
>
>
>
-- 
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
  joseclaudio.faria@terra.com.br
  jc_faria@uesc.br
  jc_faria@uol.com.br
tel: 73-3634.2779

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Mon Jul 11 10:25:00 2005

This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:33:27 EST