Re: [R] applying cor.test to a (m, n) matrix - SUMMARY

From: Monica Pisica <pisicandru_at_hotmail.com>
Date: Fri, 09 May 2008 14:56:31 +0000

Hi again,

I've got few very good options from the list and since they were not posted to the list, I will provide a summary. Thank you very much to all who answered and I hope this summary will benefit others interested in solving similar problems like that.

Yasir Kaheil re-wrote my original code in a more streamlined way: pr<-array(0,c(dim(x)[2],dim(x)[2]));
for (i in 1:dim(x)[2]) for (j in 1:dim(x)[2]) pr[i,j]<-cor.test(x[,i],x[,j])$p.val;

If column names are of importance they can be added like: rownames(pr) <- colnames(x)
colnames(pr) <- colnames(x)

Jorge Ivan Velez sent 2 different solutions, and I am not quite sure which I like better. I think the first one is very nice if such a table needs to be published since the upper diagonal correspond to p values and lower diagonal to correlation values (this will satisfy any finicky editor who really, REALLY wants the p-values;-)). The second function is very good for looking in one sweep at the results of cor.test() applied to a matrix.

cor.pvalue <- function(X,method="pearson", use="complete" ) { dfr = nrow(X) - 2
R <- cor(X,method=method, use= use)
above <- row(R) < col(R)
r2 <- R[above]^2
Fstat <- r2 * dfr / (1 - r2)
R[above] <- 1 - pf(Fstat, 1, dfr)
R
}

# Example
m <- matrix(rnorm(10*5), ncol=5); colnames(m)=paste("m",1:ncol(m),sep="") cor.pvalue(m)

            m1 m2 m3 m4 m5

m1  1.00000000  0.24244042  0.93991694 0.5563147 0.9154130
m2  0.40750935  1.00000000  0.77016121 0.9745312 0.3762165
m3  0.02748730  0.10626089  1.00000000 0.3336439 0.8325926
m4 -0.21211652 -0.01164448 -0.34184398 1.0000000 0.2059275
m5 -0.03872636  0.31444921  0.07698371 0.4376249 1.0000000


correl.stats=function(X, method = "pearson", use = "complete" , conf.level = 0.95){ require(forward)
combs=t(fwd.combn(colnames(X), 2))
temp=t(apply(combs,1, function(x){
Y=X[,as.character(x)]
res=cor.test(Y[,1],Y[,2], use = use, method = method, conf.level = conf.level) temp2=c(res$estimate, res$statistic, res$p.value, res$conf.int[1:2]) names(temp2)=c('rho','statistic','pvalue','lower','upper') rm(res)
temp2
}
)
)
rownames(temp)=paste(combs[,1],combs[,2],sep="") temp
}

# Example
correl.stats(m)

             rho statistic pvalue lower upper

m1m2  0.40750935  1.26216512 0.2424404 -0.2987766 0.8253647
m1m3  0.02748730  0.07777522 0.9399169 -0.6127436 0.6459346
m1m4 -0.21211652 -0.61392640 0.5563147 -0.7425696 0.4818648
m1m5 -0.03872636 -0.10961692 0.9154130 -0.6524440 0.6056680
m2m3  0.10626089  0.30226250 0.7701612 -0.5608917 0.6897404
m2m4 -0.01164448 -0.03293778 0.9745312 -0.6366034 0.6225461
m2m5  0.31444921  0.93692273 0.3762165 -0.3929818 0.7880525
m3m4 -0.34184398 -1.02886285 0.3336439 -0.7994101 0.3667110
m3m5  0.07698371  0.21839093 0.8325926 -0.5807942 0.6739433
m4m5  0.43762492  1.37661090 0.2059275 -0.2650270 0.8367053

Phil Spector did a wonderful job in providing the list of matrices with the values out of cor.test(). Strangely enough the correlation values of a column with itself is 0 instead of 1, although all the other correlations have correct values. And of course this changes the lower and upper limits to 0 instead of 1 as well, although the p-value is correct in this case. Sincerely I don't see why that is. I will provide his example as well with the same m matrix I used above:

mydat = m
n = dim(mydat)[2]  

makemat = function(fun){
cc = combn(n,2)
results = apply(cc,2,function(x)cor.test(mydat[,x[1]],mydat[,x[2]])) answer = matrix(0,n,n)
now = sapply(results,fun)
answer[t(cc)] = now
answer[t(cc)[,c(2,1)]] = now
answer
}  

lapply(list(estimate=function(x)x[['estimate']],

pvalue = function(x)x[['p.value']],
lower = function(x)x[['conf.int']][1],
upper = function(x)x[['conf.int']][2]),makemat)
$estimate
            [,1]        [,2]        [,3]        [,4]        [,5]

[1,] 0.00000000 0.40750935 0.02748730 -0.21211652 -0.03872636
[2,] 0.40750935 0.00000000 0.10626089 -0.01164448 0.31444921
[3,] 0.02748730 0.10626089 0.00000000 -0.34184398 0.07698371
[4,] -0.21211652 -0.01164448 -0.34184398 0.00000000 0.43762492
[5,] -0.03872636 0.31444921 0.07698371 0.43762492 0.00000000

$pvalue

          [,1] [,2] [,3] [,4] [,5]
[1,] 0.0000000 0.2424404 0.9399169 0.5563147 0.9154130
[2,] 0.2424404 0.0000000 0.7701612 0.9745312 0.3762165
[3,] 0.9399169 0.7701612 0.0000000 0.3336439 0.8325926
[4,] 0.5563147 0.9745312 0.3336439 0.0000000 0.2059275
[5,] 0.9154130 0.3762165 0.8325926 0.2059275 0.0000000

$lower

           [,1] [,2] [,3] [,4] [,5]
[1,] 0.0000000 -0.2987766 -0.6127436 -0.7425696 -0.6524440
[2,] -0.2987766 0.0000000 -0.5608917 -0.6366034 -0.3929818
[3,] -0.6127436 -0.5608917 0.0000000 -0.7994101 -0.5807942
[4,] -0.7425696 -0.6366034 -0.7994101 0.0000000 -0.2650270
[5,] -0.6524440 -0.3929818 -0.5807942 -0.2650270 0.0000000

$upper

          [,1] [,2] [,3] [,4] [,5]
[1,] 0.0000000 0.8253647 0.6459346 0.4818648 0.6056680
[2,] 0.8253647 0.0000000 0.6897404 0.6225461 0.7880525
[3,] 0.6459346 0.6897404 0.0000000 0.3667110 0.6739433
[4,] 0.4818648 0.6225461 0.3667110 0.0000000 0.8367053
[5,] 0.6056680 0.7880525 0.6739433 0.8367053 0.0000000
 

Dimitris Rizopoulos suggested to look at rcor.test() from ltm, which I will do shortly.

Thank you again for all your thorough answers,

Monica


esh_messenger_052008



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Fri 09 May 2008 - 16:13:43 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 09 May 2008 - 16:30:38 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.

list of date sections of archive