Re: [Rd] Rmpi_0.5-4 and OpenMPI questions

From: Luke Tierney <luke_at_stat.uiowa.edu>
Date: Thu, 4 Oct 2007 06:37:24 -0500 (CDT)

On Thu, 4 Oct 2007, Dirk Eddelbuettel wrote:

>
> On 4 October 2007 at 01:11, Hao Yu wrote:
> | Hi Dirk,
> |
> | Thanks for pointing out additional flags needed in order to compile Rmpi
> | correctly. Those flags can be added in configure.ac once openmpi dir is
> | detected. BTW -DMPI2 flag was missed in your Rmpi since the detection of
> | openmpi was not good. It should be
> | ####
> | if test -d ${MPI_ROOT}/lib/openmpi; then
> | echo "Found openmpi dir in ${MPI_ROOT}/lib"
> | MPI_DEPS="-DMPI2"
> | fi
> | ####
>
> I don't follow. From my build log:
>
> * Installing *source* package 'Rmpi' ...
> [...]
> checking for gcc option to accept ISO C89... none needed
> I am here /usr
> Try to find mpi.h ...
> Found in /usr/include
> Try to find libmpi or libmpich ...
> Found libmpi in /usr/lib
> Found openmpi dir in /usr/lib <---------- found openmpi
> [...]
> ** libs
> make[1]: Entering directory `/tmp/buildd/rmpi-0.5-4/src'
> gcc-4.2 -std=gnu99 -I/usr/share/R/include -I/usr/share/R/include -DPACKAGE_NAME=\"\" -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\" -DPACKAGE_BUGREPORT=\"\" -I/usr/include -DMPI2 -fPIC -fpic -g -O2 -c RegQuery.c -o RegQuery.o
> [...]
>
> so -DMPI2 is used.
>
> Because I build this in a chroot / pbuilder environment, neither LAM nor
> MPICH2 are installed and Open MPI is detected.
>
> | I tried to run Rmpi under snow and got the same error message. But after
> | checking makeMPIcluster, I found that n=3 was a wrong argument. After
> | makeMPIcluster finds that count is missing,
>
> Yes, my bad. But it also hangs with argument count=3 (which I had tried, but
> my mail was wrong.)

Any chance the snow workers are picking up another version of Rmpi, e.g. a LAM one? That might happen if you have R_SNOW_LIB set and an Rmpi installed there. Otherwise, starting with outfile=something may help. Let me know what you find out -- I'd like to make the snow configuration process more bullet-proof.
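
One rough way to check for a stray Rmpi is to walk the library paths and list every Rmpi directory found. This is only a sketch: the helper name find_rmpi is made up for illustration, and it assumes R_SNOW_LIB and R_LIBS hold colon-separated directory lists.

```shell
#!/bin/sh
# Sketch: print every "<dir>/Rmpi" that exists on a colon-separated path list.
# find_rmpi is a hypothetical helper, not part of snow or Rmpi.
find_rmpi() (
    # Runs in a subshell, so the IFS and globbing changes do not leak out.
    IFS=:
    set -f
    for dir in $1; do
        # Skip empty entries such as the one produced by "a::b".
        [ -n "$dir" ] || continue
        if [ -d "$dir/Rmpi" ]; then
            echo "$dir/Rmpi"
        fi
    done
)

# Example call (more than one line of output suggests competing installs):
#   find_rmpi "${R_SNOW_LIB:-}:${R_LIBS:-}"
```

If two directories show up, the workers may be loading a different build than the master.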

>
> | count=mpi.comm.size(0)-1 is used. If you start R alone, this will return
> | count=0 since there is only one member (master). I do not know why snow
> | did not use count=mpi.universe.size()-1 to find total nodes available.
>
> How would it know total nodes ? See below re hostfile.
>
> | Anyway after using
> | cl=makeMPIcluster(count=3),
> | I was able to run parApply function.
> |
> | I tried
> | R -> library(Rmpi) -> library(snow) -> c1=makeMPIcluster(3)
> |
> | Also
> | mpirun -host hostfile -np 1 R --no-save
> | library(Rmpi) -> library(snow) -> c1=makeMPIcluster(3)
> |
> | Hao
> |
> | PS: hostfile contains all nodes info so in R mpi.universe.size() returns
> | right number and will spawn to remote nodes.
>
> So we depend on a correct hostfile? As I understand Open MPI, this is
> deprecated:
>
> # This is the default hostfile for Open MPI. Notice that it does not
> # contain any hosts (not even localhost). This file should only
> # contain hosts if a system administrator wants users to always have
> # the same set of default hosts, and is not using a batch scheduler
> # (such as SLURM, PBS, etc.).
>
> I am _very_ interested in running Open MPI and Rmpi under slurm (which we
> added to Debian as source package slurm-llnl) so it would be nice if this
> could be rewritten to not require a hostfile as this seems to be how upstream is
> going.

To work better with batch scheduling environments where spawning might be technically or politically problematic, I have been trying to improve the RMPISNOW script, which can be used with LAM as

     mpirun -np 3 RMPISNOW

and then either

     cl <- makeCluster() # no argument

or

     cl <- makeCluster(2) # MPI comm size - 1 (or fewer, I believe)

(the default type for makeCluster becomes MPI in this case). This seems to work reasonably well in LAM and I think I can get it to work similarly in OpenMPI -- will try in the next day or so. Both LAM and OpenMPI provide environment variables so shell scripts can determine the MPI rank, which is useful for getting --slave and output redirection to the workers. I haven't figured out anything analogous for MPICH/MPICH2 yet.
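
A minimal sketch of that rank detection, not the actual RMPISNOW code. The variable names LAMRANK (LAM) and OMPI_COMM_WORLD_RANK (Open MPI) are assumptions that should be checked against the installed MPI, since the names have differed between releases. Both function names are made up for illustration.

```shell
#!/bin/sh
# Sketch: read the MPI rank from the launcher's environment.
# LAMRANK and OMPI_COMM_WORLD_RANK are assumed variable names; verify locally.
mpi_rank() {
    if [ -n "${LAMRANK:-}" ]; then
        echo "$LAMRANK"
    elif [ -n "${OMPI_COMM_WORLD_RANK:-}" ]; then
        echo "$OMPI_COMM_WORLD_RANK"
    else
        echo "unknown"
    fi
}

# Choose R flags by rank: rank 0 (the master) and an undetected rank stay
# interactive, all other ranks are workers and get --slave as well.
r_flags_for_rank() {
    case "$1" in
        0|unknown) echo "--no-save" ;;
        *)         echo "--no-save --slave" ;;
    esac
}
```

A wrapper script could then start R with the flags from r_flags_for_rank "$(mpi_rank)" and, for worker ranks, redirect output to a per-rank log file.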

Best,

luke

>
> | Rmpi under Debian 3.1 and openmpi 1.2.4 seems OK. I did find some missing
> | lib under Debian 4.0.
>
> Can you be more specific? I'd be glad to help.
>
> Thanks!
>
> Dirk
>
>
> |
> |
> | Dirk Eddelbuettel wrote:
> | >
> | > Many thanks to Dr Yu for updating Rmpi for R 2.6.0, and for starting to
> | > make
> | > the changes to support Open MPI.
> | >
> | > I have just built the updated Debian package of Rmpi (i.e. r-cran-rmpi)
> | > under
> | > R 2.6.0 but I cannot convince myself yet whether it works or not. Simple
> | > tests work. E.g. on my Debian testing box, with Rmpi installed directly
> | > using Open MPI 1.2.3-2 (from Debian) and using 'r' from littler:
> | >
> | > edd_at_ron:~> orterun -np 3 r -e 'library(Rmpi); print(mpi.comm.rank(0))'
> | > [1] 0
> | > [1] 1
> | > [1] 2
> | > edd_at_ron:~>
> | >
> | > but I basically cannot get anything more complicated to work yet. R /
> | > Rmpi
> | > just seem to hang, in particular snow and getMPIcluster() just sit
> | > there:
> | >
> | >> cl <- makeSOCKcluster(c("localhost", "localhost"))
> | >> stopCluster(cl)
> | >> library(Rmpi)
> | >> cl <- makeMPIcluster(n=3)
> | > Error in makeMPIcluster(n = 3) : no nodes available.
> | >>
> | >
> | > I may be overlooking something simple here, in particular the launching of
> | > apps appears to be different for Open MPI than it was with LAM/MPI (or
> | > maybe
> | > I am just confused because I also look at LLNL's slurm for use with Open
> | > MPI ?)
> | >
> | > Has anybody gotten Open MPI and Rmpi to work on simple demos? Similarly,
> | > is
> | > anybody using snow with Rmpi and Open MPI yet?
> | >
> | > Also, the Open MPI FAQ is pretty clear on their preference for using mpicc
> | > for compiling/linking to keep control of the compiler and linker options
> | > and
> | > switches. Note that e.g. on my Debian system
> | >
> | > edd_at_ron:~> mpicc --showme:link
> | > -pthread -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl
> | > -lutil -lm -ldl
> | >
> | > whereas Rmpi built with just the default from R CMD:
> | >
> | > gcc-4.2 -std=gnu99 -shared -o Rmpi.so RegQuery.o Rmpi.o conversion.o
> | > internal.o -L/usr/lib -lmpi -lpthread -fPIC -L/usr/lib/R/lib -lR
> | >
> | > Don't we need libopen-rte and libopen-pal as the MPI FAQ suggests?
> | >
> | > Many thanks, Dirk
> | >
> | > --
> | > Three out of two people have difficulties with fractions.
> | >
> |
> |
> | --
> | Department of Statistics & Actuarial Sciences
> | Fax Phone#:(519)-661-3813
> | The University of Western Ontario
> | Office Phone#:(519)-661-3622
> | London, Ontario N6A 5B7
> | http://www.stats.uwo.ca/faculty/yu
>
>

-- 
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:      luke_at_stat.uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Thu 04 Oct 2007 - 11:46:55 GMT
