Re: [R] More than doubling performance with snow

From: Hesen Peng <hesen.peng_at_emory.edu>
Date: Tue, 25 Nov 2008 15:27:34 -0500

I see. Thank you very much.

On Mon, Nov 24, 2008 at 10:12 AM, Stefan Evert <stefan.evert_at_uos.de> wrote:
>
>> I'm sorry, but I don't quite understand what "not running solve() in
>> this process" means. I updated the code and it does show that the results
>> from clusterApply() are identical to the results from lapply(). Could
>> you please explain more about this?
>
> The point is that a parallel processing framework like Snow or PVM does not
> execute the operation in your (interactive) R session, but rather starts
> separate computing processes that carry out the actual calculation (while
> your R session is just waiting for the results to become available). These
> separate processes can either run on different computers in a network, or on
> your local machine (in order to make use of multiple CPU cores).
>
>>>> user system elapsed
>>>> 0.584 0.144 4.355
>
>>>> user system elapsed
>>>> 4.777 0.100 4.901
>
>
> If you take a close look at your timing results, you can see that the total
> processing time ("elapsed") is only slightly shorter with parallelisation
> (4.35 s) than without (4.9 s). You've probably been looking at "user" time,
> i.e. the amount of CPU time your interactive R session consumed. Since with
> parallel processing, the R session itself doesn't perform the actual
> calculation (as explained above), it is mostly waiting for results to become
> available and "user" time is therefore reduced drastically. In short, when
> measuring performance improvements from parallelisation, always look at the
> total "elapsed" time.
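
To make the comparison concrete, here is a small sketch in R that uses only the two sets of timings quoted above. The variable names are made up for illustration, and it assumes the first timing is the clusterApply() run and the second the lapply() run, as the elapsed values cited in the explanation suggest.

```r
# Timings quoted above, in seconds (assumed order: parallel run first, serial run second)
parallel_run <- c(user = 0.584, system = 0.144, elapsed = 4.355)
serial_run   <- c(user = 4.777, system = 0.100, elapsed = 4.901)

# Real speedup: ratio of wall-clock ("elapsed") times
speedup_elapsed <- serial_run[["elapsed"]] / parallel_run[["elapsed"]]

# Misleading figure: ratio of "user" times, which only counts CPU time
# spent in the interactive R session, not in the worker processes
speedup_user <- serial_run[["user"]] / parallel_run[["user"]]

cat("speedup by elapsed time:", round(speedup_elapsed, 2), "\n")
cat("apparent speedup by user time:", round(speedup_user, 2), "\n")
```

By elapsed time the gain is only about 1.13x, while the "user" column alone suggests roughly 8x, which is where a "more than doubling" impression can come from.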
>
> So why isn't parallel processing twice as fast as performing the calculation
> in a single thread? Perhaps the advantage of using both CPU cores was eaten
> up by the communication overhead. You should also take into account that a
> lot of other processes (terminals, GUI, daemons, etc.) are running on your
> computer at the same time, so even with parallel processing you will not
> have both cores fully available to R. In my experience, there is little
> benefit in parallelisation as long as you just have two CPU cores on your
> computer (rather than, say, 8 cores).
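
For anyone who wants to repeat the comparison, here is a minimal sketch, not the code from this thread (which is not quoted in this message). It assumes the "parallel" package that ships with R 2.14 and later, which provides the same makeCluster() / clusterApply() / stopCluster() interface as snow. The matrix sizes and the number of workers are arbitrary choices for illustration.

```r
library(parallel)  # assumption: R >= 2.14; snow offers the same three calls

set.seed(1)
# Hypothetical workload: a few random square matrices to invert with solve()
mats <- replicate(4, matrix(rnorm(200 * 200), 200, 200), simplify = FALSE)

cl <- makeCluster(2)  # start two separate worker processes on the local machine

# Serial: solve() runs inside this R session
t_serial <- system.time(res_serial <- lapply(mats, solve))

# Parallel: solve() runs in the workers; this session ships the data and waits
t_parallel <- system.time(res_parallel <- clusterApply(cl, mats, solve))

stopCluster(cl)  # always shut the workers down

# The results should agree; compare wall-clock time, not "user" time
same <- isTRUE(all.equal(res_serial, res_parallel))
cat("results agree:", same, "\n")
cat("serial elapsed:  ", t_serial[["elapsed"]], "\n")
cat("parallel elapsed:", t_parallel[["elapsed"]], "\n")
```

With a workload this small, the cost of sending the matrices to the workers and collecting the results can easily outweigh the gain from the second core, so the parallel run is not guaranteed to be faster.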

>
> Hope this clarifies things a bit (and is reasonably accurate, since I don't
> have much experience with parallelisation),
> Stefan
>
> [ stefan.evert@uos.de | http://purl.org/stefan.evert ]
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
彭河森 Hesen Peng
http://hesen.peng.googlepages.com/
Received on Tue 25 Nov 2008 - 20:41:26 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 25 Nov 2008 - 21:30:27 GMT.
