[Rd] Bug in mclapply?

From: Winston Chang <winstonchang1_at_gmail.com>
Date: Mon, 10 Dec 2012 22:30:19 -0600


I've been using mclapply and have encountered situations where it gives errors or returns incorrect results. Here's a minimal example, which gives the error on R 2.15.2 on Mac and Linux:

library(parallel)
f <- function(x) NULL
mclapply(1, f, mc.preschedule = FALSE, mc.cores = 1)
# Error in sum(sapply(res, inherits, "try-error")) :
# invalid 'type' (list) of argument

I believe it happens when the following are true:

- The function returns NULL
- mc.preschedule = FALSE
- mc.cores >= length of the input data


Here are some examples I used to track down the problem.

library(parallel)
f <- function(x) NULL

# Error when mc.preschedule=FALSE and mc.cores >= length(x)

mclapply(1, f, mc.preschedule = FALSE, mc.cores = 1)    # Error
mclapply(1, f, mc.preschedule = FALSE, mc.cores = 2)    # Error
mclapply(1:2, f, mc.preschedule = FALSE, mc.cores = 1)  # OK

# In the following 2 cases, I get an error about 10-20% of the time.
# The other times, the result is worse: it returns a list with only one
# element, not two!

mclapply(1:2, f, mc.preschedule = FALSE, mc.cores = 2)  # Error
mclapply(1:2, f, mc.preschedule = FALSE, mc.cores = 3)  # Error

# When mc.preschedule=TRUE, always works

mclapply(1, f, mc.preschedule = TRUE, mc.cores = 1)    # OK
mclapply(1:2, f, mc.preschedule = TRUE, mc.cores = 1)  # OK
mclapply(1:2, f, mc.preschedule = TRUE, mc.cores = 2)  # OK

# lapply() always works

lapply(1, f)    # OK
lapply(1:2, f)  # OK
lapply(1:2, f)  # OK


# If the function returns a non-NULL value, it works
g <- function(x) 0

mclapply(1, g, mc.preschedule = FALSE, mc.cores = 1)    # OK
mclapply(1:2, g, mc.preschedule = FALSE, mc.cores = 1)  # OK
mclapply(1:2, g, mc.preschedule = FALSE, mc.cores = 2)  # OK
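
Since the failure is sometimes silent (a result list shorter than the input), a length check on the result can at least turn that case into an error. The following is only a sketch; `checked_mclapply` is a hypothetical helper name, not part of the parallel package.

```r
library(parallel)

# Hypothetical guard: run mclapply() and stop if the result has
# fewer (or more) elements than the input.
checked_mclapply <- function(X, FUN, ...) {
  res <- mclapply(X, FUN, ...)
  if (length(res) != length(X)) {
    stop("mclapply() returned ", length(res),
         " results for ", length(X), " inputs")
  }
  res
}
```

Arguments such as mc.preschedule and mc.cores are passed through `...` to mclapply() unchanged.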



Digging around in mclapply(), I think it happens because mccollect(jobs) is returning an empty list. But when I use options(error=recover) and debug the function, I find that when I call mccollect(jobs) again, it returns a list with values -- it's as though mccollect() is returning too early. This will illustrate:

> mclapply(1, f, mc.preschedule = FALSE, mc.cores = 1)
Error in sum(sapply(res, inherits, "try-error")) :
  invalid 'type' (list) of argument

Enter a frame number, or 0 to exit

  1. mclapply(1, f, mc.preschedule = FALSE, mc.cores = 1)

Selection: 1
Called from: top level
Browse[1]> res
named list()
Browse[1]> res <- mccollect(jobs)
Browse[1]> res
$`12348`
NULL

The error happens on line 63 of mclapply.r, which is after `res <- mccollect(jobs)` is called, on line 61. At this point, res should be a named list with values filled in, but it's empty. When I run `res <- mccollect(jobs)` again, it gives the correct values.

Is there a good way to work around this issue for now?
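
One possible stopgap, based only on the observation above that functions returning a non-NULL value did not trigger the error: wrap every result in a list so that no child process returns a bare NULL, then unwrap afterwards. This is a sketch under that assumption and has not been verified against R 2.15.2; `mclapply_nullsafe` is a hypothetical name.

```r
library(parallel)

# Hypothetical wrapper: each worker returns list(value) instead of value,
# so a NULL result travels back as list(NULL), which is not NULL itself.
mclapply_nullsafe <- function(X, FUN, ...) {
  wrapped <- function(x, ...) list(FUN(x, ...))
  res <- mclapply(X, wrapped, ...)
  # Unwrap each element; leave try-error objects from failed workers as-is.
  lapply(res, function(r) if (inherits(r, "try-error")) r else r[[1L]])
}

f <- function(x) NULL
res <- mclapply_nullsafe(1:2, f, mc.preschedule = FALSE, mc.cores = 1)
```

Where prescheduling is acceptable, setting mc.preschedule = TRUE is the simpler option, since it worked in every case tested above.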

-Winston




R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Tue 11 Dec 2012 - 04:35:16 GMT


Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 11 Dec 2012 - 17:32:46 GMT.
