Re: [R] Rmpi performance

From: Martin Morgan <mtmorgan_at_fhcrc.org>
Date: Fri 13 Oct 2006 - 16:40:51 GMT

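clusterCall invokes the same function, with the same arguments, on all three nodes, so each node repeats the full computation on its own copy of XX and no work is divided. What you have basically discovered is the communication cost of performing the calculation in parallel: a 3000 x 3000 matrix of doubles is about 72 MB, and it has to be sent to every node.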
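
To make that concrete, here is a minimal sketch. It assumes the base 'parallel' package (which provides the snow-style functions) and two local socket workers rather than your MPI cluster, and it uses a much smaller matrix so it runs quickly; op_mat is rewritten here only so the example is self-contained.

```r
library(parallel)  # assumption: stands in for snow + Rmpi on a single machine

# Same calculation as in the original post: determinant and trace of the inverse.
op_mat <- function(mat) {
  inv <- solve(mat)
  c(det = det(inv), tr = sum(diag(inv)))
}

set.seed(1)
XX <- matrix(rnorm(100 * 100), 100, 100)  # small stand-in for the 3000 x 3000 matrix

cl <- makeCluster(2)                 # two local workers (hypothetical setup)
res <- clusterCall(cl, op_mat, XX)   # XX is copied to every worker
stopCluster(cl)

length(res)                          # one result per worker
all.equal(res[[1]], res[[2]])        # every worker computed the same answer

3000 * 3000 * 8 / 1e6                # MB of doubles per copy of the original XX
```

Every worker returns the same answer, so the extra copies of XX buy nothing; they only add transfer time on top of the solve itself.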

You'll get the easiest gains from snow (and other parallel packages in R) with 'embarrassingly parallel' problems, where the same algorithm is applied to different data sets / slices of data. For performance gains from a single call to op_mat, you'd have to do some serious parallel algorithm development to distribute the data and computations effectively.
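
As a rough illustration of the 'embarrassingly parallel' case, the sketch below applies op_mat to several different matrices, one task per matrix. It again assumes the base 'parallel' package with local socket workers; the list name mats and the matrix sizes are made up for the example.

```r
library(parallel)  # assumption: stands in for snow + Rmpi on a single machine

op_mat <- function(mat) {
  inv <- solve(mat)
  c(det = det(inv), tr = sum(diag(inv)))
}

set.seed(1)
# Four independent data sets (hypothetical); each is a separate task.
mats <- replicate(4, matrix(rnorm(100 * 100), 100, 100), simplify = FALSE)

cl <- makeCluster(2)
par_res <- parLapply(cl, mats, op_mat)  # each worker receives only its share of mats
stopCluster(cl)

seq_res <- lapply(mats, op_mat)         # sequential reference
all.equal(par_res, seq_res)
```

Here each worker receives only the matrices it will actually process, so the communication cost is spread over useful work. Whether this beats the sequential lapply still depends on how expensive each task is relative to the cost of sending its data.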

Hope that helps,

Martin

Michela Cameletti <michela.cameletti@unibg.it> writes:

> Dear R users,
> we are trying to do some parallel computing using library(snow).
> In particular we have a cluster with 3 nodes
>
>>cl <- makeCluster(3, type = "MPI")
> 3 slaves are spawned successfully. 0 failed.
>
>
> and we want to compute the function op_mat (see below) first with the
> master and then with the cluster using system.time for checking the
> computational performance.
>
> op_mat = function(mat) {
>
> + inv = solve(mat)
> + det_inv = det(inv)
> + tr_inv = sum(diag(inv))
> + return(list(c(det=det_inv,tr=tr_inv)))
> + }
>
>>nn = 3000
>>XX = matrix(rnorm(nn*nn),nn,nn)
> # with the master
>> system.time(op_mat(XX))
> [1] 42.283 1.883 44.168 0.000 0.000
> # with the cluster
>> system.time(clusterCall(cl,op_mat,XX))
> [1] 11.523 12.612 71.562 0.000 0.000
>
> You can see that using the master it takes 44.168 seconds for computing
> the function on matrix XX while it takes 71.562 seconds (more time!!!)
> with the cluster. Can you give us some advice in order to understand why
> the cluster is slower than the master?
> Thank you very much in advance,
> bye
> Michela and Marco
> Ps: we have a gigabit ethernet between the master and the nodes
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Martin T. Morgan
Bioconductor / Computational Biology
http://bioconductor.org

Received on Sat Oct 14 03:37:04 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Fri 13 Oct 2006 - 18:30:09 GMT.