From: Martin Morgan <mtmorgan_at_fhcrc.org>

Date: Mon, 30 Jun 2008 06:02:40 -0700

"Juan Pablo Romero Méndez" <jpablo.romero_at_gmail.com> writes:

> Thanks!

>
> It turned out that Rmpi was a good option for this problem after all.
>
> Nevertheless, pnmath seems very promising, although it doesn't load on my system:
>
>> library(pnmath)
> Error in dyn.load(file, DLLpath = DLLpath, ...) :
>   unable to load shared library
>   '/home/jpablo/extra/R-271/lib/R/library/pnmath/libs/pnmath.so':
>   libgomp.so.1: shared object cannot be dlopen()ed
> Error: package/namespace load failed for 'pnmath'

Yes, the pnmath README says:

  On Redhat EL 5 I have run into a problem where attempting to dlopen
  libgomp.so fails. A workaround is to link R.bin with -lgomp. This is
  not an issue on Fedora 7, so probably will go away at some point.

This is the problem you encountered. I think (out of my depth here) that the issue is here to stay, rather than something unique to RHEL 5. The somewhat cryptic solution is 'to link R.bin with -lgomp'. I hesitate to give public advice on the black art of configuring R, but I translate that to mean building R with

% cd somedir
% LIBS=-lgomp ~/path/to/R-source/configure
% make -j4

I don't know what the deeper issues are with doing things this way.
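
One workaround I have not tried (so treat it as a guess rather than advice) would be to load libgomp into the running R process with global symbol visibility before loading pnmath, so its symbols are already resolved when pnmath.so is dlopen()ed:

  ## untested guess: dlopen libgomp globally before pnmath, assuming it
  ## is at /usr/lib/libgomp.so.1 as reported earlier in this thread
  dyn.load("/usr/lib/libgomp.so.1", local = FALSE)
  library(pnmath)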

Martin

> I find it odd, because libgomp.so.1 is in /usr/lib, so R should find it.

>
> Juan Pablo
>
> On Sun, Jun 29, 2008 at 1:36 AM, Martin Morgan <mtmorgan_at_fhcrc.org> wrote:
>> "Juan Pablo Romero Méndez" <jpablo.romero_at_gmail.com> writes:
>>
>>> Hello,
>>>
>>> The problem I'm working on now requires operating on big matrices.
>>>
>>> I've noticed that there are some packages that allow running some
>>> commands in parallel. I've tried snow and NetWorkSpaces, without much
>>> success (they are far slower than the normal functions).
>>
>> Do you mean like this?
>>
>>> library(Rmpi)
>>> mpi.spawn.Rslaves(nsl=2) # dual core on my laptop
>>> m <- matrix(0, 10000, 1000)
>>> system.time(x1 <- apply(m, 2, sum), gcFirst=TRUE)
>>    user  system elapsed
>>   0.644   0.148   1.017
>>> system.time(x2 <- mpi.parApply(m, 2, sum), gcFirst=TRUE)
>>    user  system elapsed
>>   5.188   2.844  10.693
>>
>> ? (This is with Rmpi, a third alternative you did not mention;
>> 'elapsed' time seems to be relevant here.)
>>
>> The basic problem is that the overhead of dividing the matrix up and
>> communicating between processes outweighs the already-efficient
>> computation being performed.
>>
>> One solution is to organize your code into 'coarse' grains, so the FUN
>> in apply does (considerably) more work, as in the sketch below.
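>>
>> (An untested sketch of what that could look like with the two slaves
>> spawned above; the functions are real Rmpi ones, but whether the
>> coarser split actually wins here is an assumption, not a measurement:)
>>
>>> mpi.bcast.Robj2slave(m)  # ship the matrix to the slaves once
>>> blocks <- split(seq_len(ncol(m)), rep(1:2, each = ncol(m) / 2))
>>> x <- unlist(mpi.parLapply(blocks, function(j) colSums(m[, j])))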
>>
>> A second approach is to develop a better algorithm / use an
>> appropriate R paradigm, e.g.,
>>
>>> system.time(x3 <- colSums(m), gcFirst=TRUE)
>>    user  system elapsed
>>   0.060   0.000   0.088
>>
>> (or even faster, x4 <- rep(0, ncol(m)) ;)
>>
>> A third approach, if your calculations make heavy use of linear
>> algebra, is to build R with a vectorized BLAS library; see the R
>> Installation and Administration guide.
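>>
>> (As an illustration of the kind of work that would benefit: a call like
>>
>>   system.time(crossprod(m), gcFirst=TRUE)  # t(m) %*% m, almost all BLAS
>>
>> spends nearly all its time in BLAS routines, so a tuned BLAS can speed
>> it up; no timings given, since they depend entirely on the BLAS used.)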
>>
>> A fourth possibility is to use Tierney's 'pnmath' library mentioned in
>> this thread
>>
>> https://stat.ethz.ch/pipermail/r-help/2007-December/148756.html
>>
>> The README file needs to be consulted for the not-exactly-trivial (on
>> my system) task of installing the package. Specific functions are
>> parallelized, provided the length of the calculation makes it seem
>> worthwhile.
>>
>>> system.time(exp(m), gcFirst=TRUE)
>>    user  system elapsed
>>   0.108   0.000   0.106
>>> library(pnmath)
>>> system.time(exp(m), gcFirst=TRUE)
>>    user  system elapsed
>>   0.096   0.004   0.052
>>
>> (elapsed time about 2x faster). Both BLAS and pnmath make much better
>> use of resources, since they do not require multiple R instances.
>>
>> None of these approaches would make a colSums faster -- the work is
>> just too small for the overhead.
>>
>> Martin
>>
>>> My problem is very simple: it doesn't require any communication
>>> between parallel tasks, only that the task be divided symmetrically
>>> among the available cores. Also, I don't want to run the code on a
>>> cluster, just my multicore machine (4 cores).
>>>
>>> What solution would you propose, given your experience?
>>>
>>> Regards,
>>>
>>> Juan Pablo
>>>
>>
>> --
>> Martin Morgan
>> Computational Biology / Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N.
>> PO Box 19024 Seattle, WA 98109
>>
>> Location: Arnold Building M2 B169
>> Phone: (206) 667-2793

--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793
