Re: [R] Very slow: using double apply and cor.test to compute correlation p.values for 2 matrices

From: jim holtman <jholtman_at_gmail.com>
Date: Wed, 26 Nov 2008 09:14:40 -0500

Your time is being taken up in cor.test because you are calling it 100,000,000 times (each of the 1,000 rows of m1 against each of the 100,000 rows of m2). So grin and bear it, given the amount of work you are asking it to do.

Here I am calling it only 10,000 times (100 rows against 100 rows):

> m1 <- matrix(rnorm(10000), ncol=100)
> m2 <- matrix(rnorm(10000), ncol=100)
> Rprof('/tempxx.txt')
> system.time(cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1, function(y) { cor.test(x,y)$p.value }) }))

   user system elapsed
   8.86 0.00 8.89
>

So my guess is that calling it 100,000,000 times will take about 100,000,000 * 0.000886 seconds, or roughly 25 hours.
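
As a rough check on this kind of estimate, the arithmetic can be laid out from the matrix dimensions alone. This is only a sketch: it assumes the cost per cor.test call stays constant as the problem grows, and the variable names are made up for illustration.

```r
# Benchmark run: two matrices of 10000 values with 100 columns each
bench_rows_m1 <- 10000 / 100
bench_rows_m2 <- 10000 / 100
bench_calls   <- bench_rows_m1 * bench_rows_m2   # one cor.test call per row pair

# User time reported by system.time() for the benchmark run
bench_seconds <- 8.86
sec_per_call  <- bench_seconds / bench_calls

# Full problem: matrices of 100000 and 10000000 values with 100 columns each
full_rows_m1 <- 100000 / 100
full_rows_m2 <- 10000000 / 100
full_calls   <- full_rows_m1 * full_rows_m2

# Extrapolated run time in hours (assumes constant cost per call)
est_hours <- full_calls * sec_per_call / 3600
est_hours
```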

If you run Rprof, you will see that it is spending most of its time there:

  0 8.8 root
  1. 8.8 apply
  2. . 8.8 FUN
  3. . . 8.8 apply
  4. . . . 8.7 FUN
  5. . . . . 8.6 cor.test
  6. . . . . . 8.4 cor.test.default
  7. . . . . . . 2.4 match.arg
  8. . . . . . . . 1.7 eval
  9. . . . . . . . . 1.4 deparse
  10. . . . . . . . . . 0.6 .deparseOpts
  11. . . . . . . . . . . 0.2 pmatch
  11. . . . . . . . . . . 0.1 sum
  10. . . . . . . . . . 0.5 %in%
  11. . . . . . . . . . . 0.3 match
  12. . . . . . . . . . . . 0.3 is.factor
  13. . . . . . . . . . . . . 0.3 inherits
  8. . . . . . . . 0.2 formals
  9. . . . . . . . . 0.2 sys.function
  7. . . . . . . 2.1 cor
  8. . . . . . . . 1.1 match.arg
  9. . . . . . . . . 0.7 eval
  10. . . . . . . . . . 0.6 deparse
  11. . . . . . . . . . . 0.3 .deparseOpts
  12. . . . . . . . . . . . 0.1 pmatch
  11. . . . . . . . . . . 0.2 %in%
  12. . . . . . . . . . . . 0.2 match
  13. . . . . . . . . . . . . 0.1 is.factor
  14. . . . . . . . . . . . . . 0.1 inherits
  9. . . . . . . . . 0.1 formals
  8. . . . . . . . 0.5 stopifnot
  9. . . . . . . . . 0.2 match.call
  8. . . . . . . . 0.1 pmatch
  8. . . . . . . . 0.1 is.data.frame
  9. . . . . . . . . 0.1 inherits
  7. . . . . . . 1.5 paste
  8. . . . . . . . 1.4 deparse
  9. . . . . . . . . 0.6 .deparseOpts
  10. . . . . . . . . . 0.3 pmatch
  10. . . . . . . . . . 0.1 any
  9. . . . . . . . . 0.6 %in%
  10. . . . . . . . . . 0.6 match
  11. . . . . . . . . . . 0.5 is.factor
  12. . . . . . . . . . . . 0.4 inherits
  13. . . . . . . . . . . . . 0.2 mode
  7. . . . . . . 0.4 switch
  8. . . . . . . . 0.1 qnorm
  7. . . . . . . 0.2 pt
  5. . . . . 0.1 $
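
The profile above shows a large share of the time going to argument handling (match.arg, deparse, paste) rather than to the correlation arithmetic itself. One possible way around that overhead, which is not part of the original reply, is to compute the whole correlation matrix with a single cor() call and convert it to p-values with the t distribution. The sketch below assumes Pearson correlation, a two-sided test, and no missing values; cor_pvalues is a hypothetical helper name, not an existing R function.

```r
# Hypothetical helper: two-sided Pearson p-values for every row of `a`
# against every row of `b`. Assumes no NA values and equal column counts.
cor_pvalues <- function(a, b) {
  stopifnot(ncol(a) == ncol(b))
  n <- ncol(a)                        # observations per row
  r <- cor(t(a), t(b))                # nrow(a) x nrow(b) correlation matrix
  tstat <- r * sqrt((n - 2) / (1 - r^2))
  p <- 2 * pt(-abs(tstat), df = n - 2)
  t(p)                                # nrow(b) x nrow(a), same layout as the nested apply
}

# Small comparison against the nested apply / cor.test approach
set.seed(1)
a <- matrix(rnorm(5 * 20), ncol = 20)
b <- matrix(rnorm(7 * 20), ncol = 20)

fast <- cor_pvalues(a, b)
slow <- apply(a, 1, function(x) apply(b, 1, function(y) cor.test(x, y)$p.value))

# Largest absolute difference between the two results
max(abs(fast - slow))
```

For matrices of the size in the original question, the full result has 100,000 x 1,000 entries (about 800 MB of doubles), so it may be necessary to process m2 in blocks of rows rather than all at once.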

On Tue, Nov 25, 2008 at 11:55 PM, Daren Tan <daren76_at_hotmail.com> wrote:
>
> My two matrices are roughly the sizes of m1 and m2 below. I tried using two nested apply calls with cor.test to compute the correlation p-values. After more than an hour, the code is still running. Please help me make it more efficient.
>
> m1 <- matrix(rnorm(100000), ncol=100)
> m2 <- matrix(rnorm(10000000), ncol=100)
>
> cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1, function(y) { cor.test(x,y)$p.value }) })
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

Received on Wed 26 Nov 2008 - 14:17:30 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 26 Nov 2008 - 16:30:30 GMT.
