Re: [R] For->lapply->parallel apply

From: Steve Lianoglou <mailinglist.honeypot_at_gmail.com>
Date: Sun, 10 Apr 2011 20:15:36 -0400

Hi,

On Sat, Apr 9, 2011 at 5:03 AM, Alaios <alaios_at_yahoo.com> wrote:
> Dear all,
> I would like to ask your help understand the subsequent steps for making my program faster.
>
> The following code:
> Gauslist<-array(data=NA,dim=c(dimx,dimy,dimz))
> for (i in c(1:dimz)){
>    print(sprintf('Creating the %d map',i));
>    Gauslist[,,i]<-f <- GaussRF(x=x, y=y, model=model, grid=TRUE,param=c(mean,variance,nugget,scale,Whit.alpha))
> }
>
>
> creates 100 GaussMaps (each map is of 256*256 dim) and stores them in a matrix called Gauslist.
>
> This process takes too long, so I was wondering if you could help me understand what I should do to make it run in parallel (at work there is a system with 16 cores).
>
> There is mclapply (the parallel version of lapply). If I can make my code run with lapply, then I will be able to run it with mclapply as well (they have the same syntax).
> If I understand correctly, the sequence for doing that is the following:
>
> for loop -> lapply -> mclapply
>
> Can you please help me understand if my for loop can be converted to lapply or not?

Your loop can be converted quite easily.

The lapply function takes an object to iterate over as its first argument (this can be a list of things, a vector of things, etc.) and, as its second argument, a function to apply to each element. lapply returns a list holding the value your function returned for each element.

A simple example is to iterate over the words in a character vector and return how many characters are in each word.

R> words <- c('cat', 'dogs', 'people')
R> sizes <- lapply(words, function(x) nchar(x))
R> sizes

[[1]]
[1] 3

[[2]]
[1] 4

[[3]]
[1] 6
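
To make the correspondence with a for loop explicit, here is a small sketch of the same computation written both ways. The variable names (sizes_loop, sizes_apply) are just for illustration.

```r
words <- c("cat", "dogs", "people")

# for-loop version: preallocate a list, then fill it one element at a time
sizes_loop <- vector("list", length(words))
for (i in seq_along(words)) {
  sizes_loop[[i]] <- nchar(words[i])
}

# lapply version: the function is called once per element and the
# results are collected into a list for you
sizes_apply <- lapply(words, nchar)

# both approaches should produce the same list
same <- identical(sizes_loop, sizes_apply)
```

The loop body becomes the function you hand to lapply, and the bookkeeping (preallocating, indexing, assigning) goes away.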

So in your example:

> for (i in c(1:dimz)){
>    print(sprintf('Creating the %d map',i));
>    Gauslist[,,i]<-f <- GaussRF(x=x, y=y, model=model,
>                                grid=TRUE,param=c(mean,variance,nugget,scale,Whit.alpha))
> }

Could be something like:

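gauslist <- lapply(1:dimz, function(i) {
  GaussRF(x=x, y=y, model=model, grid=TRUE,
          param=c(mean, variance, nugget, scale, Whit.alpha))
})

Note that gauslist is then a list of dimz maps rather than a dimx x dimy x dimz array.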
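
Since lapply hands back a list and your original code filled a 3-D array, you may want to stack the results afterwards. A minimal sketch, assuming every map is a dimx-by-dimy matrix; make_map() is a hypothetical stand-in for your GaussRF(...) call so that the example runs without RandomFields, and the small dimensions are only for illustration:

```r
dimx <- 4; dimy <- 3; dimz <- 5

# make_map() is a hypothetical stand-in for the GaussRF(...) call
make_map <- function() matrix(rnorm(dimx * dimy), nrow = dimx, ncol = dimy)

# one matrix per iteration, collected in a list
maps <- lapply(seq_len(dimz), function(i) make_map())

# unlist() concatenates the matrices column by column, in list order,
# so refilling a dimx x dimy x dimz array puts map k in slice [, , k]
Gauslist <- array(unlist(maps), dim = c(dimx, dimy, dimz))
```

After this, Gauslist[, , k] should hold the same values as maps[[k]].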

Using mclapply would be exactly the same, except that you replace lapply with mclapply.
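
A rough sketch of the parallel version follows. It assumes the parallel package that ships with current versions of R (mclapply originally lived in the multicore package); make_map() is again a hypothetical stand-in for GaussRF(...), and the core count of 2 is a placeholder you would raise on the 16-core machine. mclapply works by forking, so on Windows it only runs with mc.cores = 1. I am also assuming the simulation draws from R's random number generator, in which case it is worth selecting the L'Ecuyer-CMRG generator so each child process gets its own stream.

```r
library(parallel)

dimx <- 4; dimy <- 3; dimz <- 8

# make_map() is a hypothetical stand-in for the GaussRF(...) call
make_map <- function() matrix(rnorm(dimx * dimy), nrow = dimx, ncol = dimy)

# forking is unavailable on Windows, where mc.cores must be 1;
# the value 2 is a placeholder, not a recommendation
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# L'Ecuyer-CMRG lets mclapply give each child its own random stream
RNGkind("L'Ecuyer-CMRG")
set.seed(42)

# same call shape as lapply, plus the number of worker processes
maps <- mclapply(seq_len(dimz), function(i) make_map(), mc.cores = n_cores)
```

The result is a list of dimz matrices, just as with lapply.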

Actually, is it correct that you aren't doing anything different in the iterations of the for loop -- I mean, nothing in your code really depends on your value for `i`, right?
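
If that is the case, replicate() may be the most direct serial translation, since it simply evaluates the same expression a given number of times. A minimal sketch, once more with the hypothetical make_map() standing in for the GaussRF(...) call:

```r
dimx <- 4; dimy <- 3; dimz <- 5

# make_map() is a hypothetical stand-in for the GaussRF(...) call
make_map <- function() matrix(rnorm(dimx * dimy), nrow = dimx, ncol = dimy)

# replicate() re-evaluates the expression dimz times;
# simplify = "array" stacks the resulting matrices into a 3-D array
Gauslist <- replicate(dimz, make_map(), simplify = "array")

result_dim <- dim(Gauslist)
```

This gives you the dimx x dimy x dimz array directly, without an index variable that is never used.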

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Mon 11 Apr 2011 - 00:20:07 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 11 Apr 2011 - 01:40:29 GMT.
