From: McGehee, Robert <Robert.McGehee_at_geodecapital.com>

Date: Fri 26 May 2006 - 03:57:19 EST

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide!

http://www.R-project.org/posting-guide.html


Moreno,

As much of my processor time is spent on basic linear algebra
operations (matrix inversion, quadratic programming, etc.), I recently
recompiled R against a BLAS implementation (ATLAS) tuned for parallel
processing. The speed improvement for linear algebra operations was
significant on multiprocessor machines.

For example, using:

N <- 1000  # matrix dimension; vary it to compare small and large cases
system.time(x <- replicate(10, matrix(rnorm(N^2), N, N) %*%
                               matrix(rnorm(N^2), N, N)))

I benchmarked speed improvements of 10-20% where N is small (10-100) and speed improvements of up to 6x (e.g. 8 seconds vs 48 seconds) when N is large (1000+).
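To repeat this comparison on your own machine, something like the following sketch may help. It wraps the same timing expression in a function and runs it over several matrix sizes. The helper name time_matmul and the sizes chosen are illustrative assumptions, not part of the original benchmark, and the timings will depend entirely on the BLAS your R is linked against.

```r
# Hypothetical helper: time `reps` products of two random N x N matrices
# and return the elapsed wall-clock time in seconds.
time_matmul <- function(N, reps = 10) {
  timing <- system.time(
    replicate(reps, matrix(rnorm(N^2), N, N) %*% matrix(rnorm(N^2), N, N))
  )
  unname(timing["elapsed"])
}

# Illustrative sizes only; increase the largest one to stress the BLAS more.
sizes <- c(10, 100, 500)
elapsed <- sapply(sizes, time_matmul)

# One row per matrix size, for comparison across R builds.
data.frame(N = sizes, seconds = elapsed)
```

Running the same script under the reference BLAS and under a tuned BLAS gives a like-for-like comparison.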

So for users with lots of linear algebra calculations interested in parallel processing, I'd recommend always starting with (re-)compiling a customized BLAS, if they have not done so already. ATLAS and GOTO are the two most common BLAS implementations that I know of.

As for true parallel processing, I have not yet tried the aforementioned R packages, but I did write an internal package for running very large simulations in parallel, in which a simple script is re-run on multiple data sets. I stored each data set in a separate numbered directory. The R script goes through the directories in order, looking for a flag.txt file. If a directory has no such file, the process creates a flag.txt there, marking the directory as in use, and starts processing its data. This lets multiple processors or computers work on a very large simulation in parallel without duplicating work. At one point I was able to muster 15-20 CPUs from spare Windows and Linux boxes and cut the simulation time from days to hours. Such a system would also be easy to re-create without setting up MPI/PVM if your simulation or project can be divided up in a similar way.
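The flag-file scheme described above could be sketched roughly as follows. This is not the internal package itself: the function names claim_dir and run_worker, and the process callback, are hypothetical, and the directories are assumed to have purely numeric names. Note that checking for the flag and then creating it is not atomic, so two workers could in principle claim the same directory at the same moment. For long-running simulations that window is small, but it is there.

```r
# Hypothetical helper: try to claim one data directory by creating flag.txt.
# Returns TRUE if this worker claimed it, FALSE if it was already claimed.
claim_dir <- function(dir) {
  flag <- file.path(dir, "flag.txt")
  if (file.exists(flag)) return(FALSE)  # another worker got here first
  file.create(flag)                     # TRUE when the flag was written
}

# Walk the numbered directories under `root` in numeric order and run
# `process` on each one this worker manages to claim.
run_worker <- function(root, process) {
  dirs <- list.dirs(root, recursive = FALSE)
  dirs <- dirs[order(as.integer(basename(dirs)))]  # assumes numeric names
  claimed <- character(0)
  for (d in dirs) {
    if (claim_dir(d)) {
      process(d)
      claimed <- c(claimed, d)
    }
  }
  claimed  # paths of the directories this worker handled
}
```

Each machine would run the same script against a shared file system, for example run_worker("/shared/sim", my_analysis), where both the path and my_analysis are placeholders.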

Cheers,

Robert

-----Original Message-----
From: r-help-bounces@stat.math.ethz.ch
[mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of Martin Morgan
Sent: Thursday, May 25, 2006 1:17 PM
To: mb7312@libero.it
Cc: r-help
Subject: Re: [R] parallel computing

Hi Moreno --

snow provides an easy interface to simple kinds of parallel calculation (e.g., lapply in parallel). I quickly wanted more direct control over how the parallel computations were carried out, and have been using Rmpi. Though in principle snow and Rmpi are 'easy' to use, I found that they actually require a certain amount of understanding of R objects and evaluation, and of the underlying communication library (MPI or PVM).
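As a rough illustration of "lapply in parallel", here is a minimal sketch. It uses the parallel package bundled with current versions of R, whose makeCluster and parLapply functions are modelled on snow's; that substitution is an assumption made so the example runs without extra installs. The two-worker local cluster and the toy squaring function are illustrative only.

```r
library(parallel)

# Start two local worker processes (a socket cluster on this machine).
cl <- makeCluster(2)

# Apply a function to each element in parallel; like lapply, the result
# is a list in the same order as the input.
res <- parLapply(cl, 1:4, function(i) i^2)

# Always shut the workers down when finished.
stopCluster(cl)

unlist(res)
```

The function passed to parLapply is evaluated on the workers, so any data or packages it relies on must be made available there as well, which is part of the understanding of R evaluation mentioned above.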

Hope that helps,

Martin

"mb7312@libero.it" <mb7312@libero.it> writes:

> Dear R users,
>
> I have access to a Sun cluster with multiple processors, a lot of
> RAM, and RedHat installed. I want to take advantage of its
> power for a very time-consuming R routine.
>
> Which package should I use? I know there are snow, snowFT, and
> other packages. Which is best for my purpose? Does anyone have
> experience with this?
>
> Thanks in advance.
>
> Moreno

Received on Fri May 26 04:04:28 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.1.8, at Fri 26 May 2006 - 06:10:22 EST.
