[Rd] naive question regarding running parallel C code from R

From: tyler <tyler.smith_at_mail.mcgill.ca>
Date: Fri, 18 Apr 2008 13:03:25 -0300


I have only the vaguest notion of what parallel programming is, but I think I have a situation where it might be of use to me, or at least give me the opportunity to learn more about it. Before I invest in figuring out the nuts and bolts, can anyone confirm that this is a sane approach, or suggest alternatives that I could pursue?

I'm running stochastic simulations, with the actual simulation in C code, with an R interface to set up the parameters, format the output, and save the resulting objects periodically through the run. The basic layout is:

R function sets up the run
R for(c = 0; c < CYCLES; c++)
  call C function
  C for(i = 0; i < TIME; i++)
    immigration loop adds individuals to the recruit vector
    birth loop adds individuals to the recruit vector
    recruit vector is added to the community vector
    death loop removes excess individuals
  Return results to R, which processes and saves the objects
  Repeat

A typical run has 20 cycles, each with 500 time steps, and takes about an hour. The immigration and birth loops are independent of each other, and so could run simultaneously. They both add to the recruit vector, but the order of the additions doesn't matter so long as both finish before the recruit vector is added to the community vector. The immigration, birth, and death loops each iterate over arrays in such a way that the outcome at one location is independent of the outcome at any other. That is, the effect of the birth vector on recruit vector position 0 has no influence on what the birth vector does to recruit vector position 1.

What I'm thinking of doing is running the birth and immigration loops as separate threads, and possibly running each of those threads as a group of threads - so a thread for a birth loop that iterates over the first N positions, another thread for the second N positions and so on.
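
The plan above could be sketched with OpenMP roughly as follows. This is only a minimal sketch under stated assumptions, not the actual simulation code: `recruit_step`, `births_at`, and `immigrants_at` are hypothetical names, and the per-site rules are deterministic placeholders standing in for the real stochastic draws. It splits each loop across positions instead of running the birth and immigration loops as two separate threads, and relies on the implicit barrier at the end of each `omp for` so that both loops finish before the recruit vector is added to the community vector.

```c
#include <assert.h>

/* Hypothetical per-site rules: placeholders for the real stochastic draws. */
static int births_at(const int *community, long i) { return community[i] / 2; }
static int immigrants_at(long i) { return (int)(i % 3); }

/* Recruitment phase of one time step (hypothetical function name).
 * Each position i is handled independently, so every loop can be
 * divided among threads without two threads touching the same element. */
void recruit_step(int *community, int *recruit, long n)
{
    assert(n >= 0);

    #pragma omp parallel
    {
        /* Birth loop: starts this step's recruit vector. */
        #pragma omp for
        for (long i = 0; i < n; i++)
            recruit[i] = births_at(community, i);
        /* Implicit barrier: all births are written before the next loop. */

        /* Immigration loop: adds to the same recruit vector. */
        #pragma omp for
        for (long i = 0; i < n; i++)
            recruit[i] += immigrants_at(i);
        /* Implicit barrier: the recruit vector is now complete. */

        /* Add the recruit vector to the community vector. */
        #pragma omp for
        for (long i = 0; i < n; i++)
            community[i] += recruit[i];
    }
}
```

With gcc this would be built with `-fopenmp`; without that flag the pragmas are ignored and the same code runs sequentially. A real version would also need a random number generator that is safe to use from several threads, which this sketch does not address.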

I'm keen to learn about parallel programming, but I don't understand enough yet to make sense of the information in the R extensions manual and the various discussions on this list about R being thread-safe. Does it matter if R is thread-safe if the actual simulation is being computed in separate, shared C code?

I'm running my current, sequential code on a cluster that supports both OpenMP and MPI, should I figure out how to use them.

Thanks for your patience,


Don't learn the tricks of the trade; learn the trade.

R-devel_at_r-project.org mailing list
Received on Fri 18 Apr 2008 - 16:19:26 GMT


Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel. Please read the posting guide before posting to the list.
