Re: [Rd] How to safely using OpenMP pragma inside a .C() function?

From: Michael Lachmann <lachmann_at_eva.mpg.de>
Date: Thu, 01 Sep 2011 17:52:05 +0200

This is probably obvious, but I just wanted to say that it should be possible to turn off multithreading even on a machine with multiple cores. One reason is that you run on a cluster and are given just one core for yourself. Another is a setup with trivial parallelization (i.e. you run almost the same task 100 times), where a 100-fold speedup on 100 cores is easy to get by running the tasks side by side, and multithreading within each task would only slow you down.
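
For package code, one way to offer such a switch without touching num_threads() is OpenMP's if clause, which runs the region on a single thread when its condition is false. A minimal sketch, assuming a hypothetical .C() entry point sum_squares and a hypothetical use_threads flag passed in from R (neither name comes from this thread):

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical .C() entry point: .C() passes every argument as a pointer.
 * use_threads is an illustrative flag: 0 forces serial execution,
 * nonzero lets OpenMP use its default thread count (OMP_NUM_THREADS). */
void sum_squares(double *x, int *n, int *use_threads, double *result)
{
    double total = 0.0;
    int len = *n;
    int flag = *use_threads;

    /* Without OpenMP support the pragma is skipped and the loop is serial. */
#ifdef _OPENMP
    #pragma omp parallel for if(flag) reduction(+:total)
#endif
    for (int i = 0; i < len; i++)
        total += x[i] * x[i];

    *result = total;
}
```

From R this could be called along the lines of .C("sum_squares", as.double(x), as.integer(length(x)), 0L, result = double(1)). With a nonzero flag the number of threads is still whatever OMP_NUM_THREADS says, so the session-wide setting is left alone.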

Michael

On 1 Sep 2011, at 8:34AM, Prof Brian Ripley wrote:

> Note that currently R internals do not actually use multiple threads in OpenMP, and there is no documented way to make them do so.
>
> The main issue is that there is insufficient knowledge of where they are worthwhile (which is both OS and platform-dependent: we don't even have reliable cross-platform ways to decide a reasonable number of threads, and the number of virtual cores on a multi-user platform definitely is not reasonable). Luke Tierney reported that the crossover point for a speed-up on Mac OS X was much larger matrices than on Linux, for example, and there is currently no OpenMP support in the Windows toolchain.
>
> The current implementation is a trial: there are more places planned to use OpenMP as and when the uncertainties are resolved.
>
> This will change at some point: given the current instability in thread support in the MinGW-w64 project this may or may not be before R 2.14.0.
>
> On Wed, 31 Aug 2011, Simon Urbanek wrote:
>
>> Pawel,
>> 
>> On Aug 31, 2011, at 4:46 PM, pawelm wrote:
>> 
>>> I just found this (performance improvement of the "dist" function when using
>>> openmp):
>
> You failed to describe the platform! See the posting guide (which asked you to do so 'at a minimum').
>
>>> .Internal(setMaxNumMathThreads(1)); .Internal(setNumMathThreads(1))
>>> m <- matrix(rnorm(810000),900,900); system.time(d <- dist(m))
>>>
>>>    user  system elapsed
>>>   3.510   0.013   3.524
>>>
>>> .Internal(setMaxNumMathThreads(5)); .Internal(setNumMathThreads(5))
>>> m <- matrix(rnorm(810000),900,900); system.time(d <- dist(m))
>>>
>>>    user  system elapsed
>>>   3.536   0.007   1.321
>>> 
>>> Works great! The remaining question is whether it is good practice to use
>>> "R_num_math_threads" in external packages?
>
> Most definitely not: it is never good practice to use undocumented non-API variables. See 'Writing R Extensions'.
>
>> Normally you don't need to mess with all this and I would recommend not to do so. The R internals use a different strategy since they need to cope with the fall-back case, but packages should not worry about that. The default number of threads is defined by the OMP_NUM_THREADS environment variable and that is the documented way in OpenMP, so my recommendation would be to not mess with num_threads() which is precisely why I did not use it in the example I gave you.
>
> I'd be cautious there. OMP_NUM_THREADS affects all the OpenMP code in the R session, and possibly others which use it (some parallel BLAS do too).
>
>> 
>> That said, R-devel has new facilities for parallelization so things may change in the future.
>> 
>> Cheers,
>> Simon
>
> --
> Brian D. Ripley, ripley@stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
>
> ______________________________________________
> R-devel_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


Received on Thu 01 Sep 2011 - 16:00:45 GMT
