Re: [R] Compare two distance matrices

From: <bady_at_univ-lyon1.fr>
Date: Thu 06 Oct 2005 - 22:39:27 EST

Hi, hi all,

> I am trying to compare two distance matrices with R. I would like to
> create a XY plot of these matrices and do some linear regression on
> it. But, I am a bit new to R, so i have a few questions (I searched in
> the documentation with no success).
> The first problem is loading a distance matrix into R. This matrix is
> the output of a the Phylip program Protdist and lookes like this:
> I tried with the scan() function to load the files, but with no
> success. How should i load in these files? ....
>


you can separately load each matrix with two text files.

require(ade4)

mat1 <- read.table("mat1.txt")
nam1 <- mat1[,1]
mat1 <- mat1[,-1]

row.names(mat1) <- names(mat1) <- nam1
mat2 <- read.table("mat2.txt")
nam2 <- mat2[,1]
mat2 <- mat2[,-1]

row.names(mat2) <- names(mat2) <- nam2

dist1 <- mat2dist(mat1)
dist2 <- mat2dist(mat2)

# with mat1:
# n_crassa 0.000000 0.690737 0.895257 0.882576 2.365386
# c_neufor 0.690737 0.000000 0.956910 0.979988 2.103041
# a_thaliana 0.895257 0.956910 0.000000 1.003668 2.724847
# pompep 0.882576 0.979988 1.003668 0.000000 2.065202
# s_cerevis 2.365386 2.103041 2.724847 2.065202 0.000000

# and mat2:
# n_crassa 0.000000 0.739560 0.933986 0.861644 2.207467
# c_neufor 0.739560 0.000000 0.988779 0.925168 1.941141
# a_thaliana 0.933986 0.988779 0.000000 1.007803 2.415320
# pompep 0.861644 0.925168 1.007803 0.000000 2.394490
# s_cerevis 2.207467 1.941141 2.415320 2.394490 0.000000

I think that it’s the more simple solution, NOT the optimal solution. Maybe, there is an interface to read Phylip file in Bioconductor project(?).

To transform matrix to dist, you can use the function mat2dist of the library ade4 (see example)

To compare 2 distances matrices, there are many possibility ! For example, if the distances matrices are Euclidian, you can used directly principal coordinate analyses (dudi.pco), and co-inertia analysis or procuste analysis,etc ...

# examples:

# Data preparation

require(ade4)

mat1 <- read.table("mat1.txt")
nam1 <- mat1[,1]
mat1 <- mat1[,-1]

row.names(mat1) <- names(mat1) <- nam1
mat2 <- read.table("mat2.txt")
nam2 <- mat2[,1]
mat2 <- mat2[,-1]

row.names(mat2) <- names(mat2) <- nam2

? mat2dist
dist1 <- mat2dist(mat1)
dist2 <- mat2dist(mat2)

dist1
dist2

is.euclid(dist1)
is.euclid(dist2)
# cool the distances matrices are euclidian :)

# to compare the 2 matrices
# example 1 : mantel test

? mantel.randtest
mt1 <- mantel.randtest(dist1,dist2,nrepet=10) plot(mt1)

# Example 2: coefficient of vectorial correlation (Escoufier, 1973)
? RVdist.randtest
RV1 <- RVdist.randtest(dist1,dist2,nrepet=10) plot(RV1)

# Example 3: coinertia analysis

?dudi.pco
# ?cmdscale

?coinertia
?randtest.coinertia

pco1 <- dudi.pco(dist1,nf=3,scannf=F)
pco2 <- dudi.pco(dist2,nf=3,scannf=F)
co1 <- coinertia(pco1,pco2,nf=3,scannf=F)
plot(co1)
testco1 <-randtest.coinertia(co1,nrepet=10) plot(testco1)

# Example 4: procuste analysis

? procuste
? procuste.randtest
pco1 <- dudi.pco(dist1,nf=3,scannf=F)
pco2 <- dudi.pco(dist2,nf=3,scannf=F)
proc1 <- procuste(pco1$tab,pco2$tab)
plot(proc1)
testproc1 <-procuste.randtest(pco1$tab,pco2$tab,nrepet=10) plot(testproc1)

....

the choice depend to your outcome.

hopes this help

P.BADY



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Thu Oct 06 22:45:15 2005

This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:40:37 EST