[Rd] Bug in parallel / mclapply

From: Bernd Bischl <bernd_bischl_at_gmx.net>
Date: Fri, 16 Nov 2012 23:48:13 +0100


Hi,

there seem to be some (small) bugs in the mclapply function in parallel. I discovered this in the current R release version, and I checked that it is still present in R-devel.

I think it only occurs in the part of the code corresponding to argument option mc.preschedule = FALSE.

Here are two examples:

a)
library(parallel)
mclapply(list(), identity, mc.preschedule=FALSE) Error in sum(sapply(res, inherits, "try-error")) :

   invalid 'type' (list) of argument

Possible reason / fixes:

The relevant portion of the code is this:

if (!mc.preschedule) {

   FUN <- match.fun(FUN)
   if (length(X) <= cores) {

       jobs <- lapply(seq_along(X), function(i) mcparallel(FUN(X[[i]],
           ...), name = names(X)[i], mc.set.seed = mc.set.seed,
           silent = mc.silent))
       res <- mccollect(jobs)
       if (length(res) == length(X))
           names(res) <- names(X)
       has.errors <- sum(sapply(res, inherits, "try-error"))
   }

If there are 0 jobs, mccollect returns NULL, sapply on this returns the empty list, leads to an error in sum.
Possible fix might be to use vapply with FUN.VALUE instead.

b)

library(parallel)
foo <- function(i) NULL
mclapply(1, foo, mc.preschedule=FALSE)

Error in sum(sapply(res, inherits, "try-error")) :

   invalid 'type' (list) of argument

mclapply(1:2, foo, mc.preschedule=FALSE) $`17222`
NULL # should be list(NULL, NULL)

Possible reason / fixes:

The bug here is more complicated I think, and occurs in mccollect.

By using debug, I ended up in this line in mccollect:

res[[which(pid == pids)]] <- unserialize(r)

Here, the result of foo (=NULL) is not entered in the result list "res", because "<- NULL" is sementically interpreted as "delete list element". (which has btw trapped me quite often, too...) Therefore mccollect does not return a list of NULLs as it should, but only list(), which leads to the same error as in a). In the case mclapply(1:2, foo, mc.preschedule=FALSE) something even more complicated happens, as on.exit we jump into a "cleanup" function which again calls mccollect. Anyway, the final result is wrong, too.

Possible fix:
Actually I am not sure what the best way here is, except for an extra if-case to avoid this when the assigment happens?

Best,

Bernd



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel Received on Fri 16 Nov 2012 - 22:51:46 GMT

This quarter's messages: by month, or sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

All messages

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sun 18 Nov 2012 - 14:20:50 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel. Please read the posting guide before posting to the list.

list of date sections of archive