Re: [R] simulations with very large number of iterations (1 billion)

From: Viechtbauer Wolfgang (STAT) <wolfgang.viechtbauer_at_maastrichtuniversity.nl>
Date: Fri, 15 Apr 2011 10:41:06 +0200


We do not know the details of the kinds of computations you intend to do within each iteration, but if, let's say, each iteration takes around 1 second, then your simulation will run for the next 30+ years (on a single core). Even if each iteration only takes a fraction of a second, you are still looking at years here. If you can parallelize things, you may be able to make this work within a realistic time frame, but this assumes access to dozens of cores.
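The estimate above is simple arithmetic, and it can be redone for other per-iteration costs. The sketch below is only an illustration: the function name `runtime_years`, the per-iteration timings, and the core count of 48 are hypothetical, and it assumes perfect parallel scaling, which real jobs rarely achieve. The same arithmetic works unchanged in R.

```python
# Back-of-envelope wall-clock estimate for a long simulation.
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year


def runtime_years(n_iterations, seconds_per_iteration, n_cores=1):
    """Estimated wall-clock years, assuming perfect scaling across cores."""
    total_seconds = n_iterations * seconds_per_iteration / n_cores
    return total_seconds / SECONDS_PER_YEAR


N_ITERATIONS = 1_000_000_000  # from the original question

# Hypothetical per-iteration costs and core counts, for illustration only.
for secs in (1.0, 0.1, 0.001):
    for cores in (1, 48):
        years = runtime_years(N_ITERATIONS, secs, cores)
        print(f"{secs:>6} s/iter on {cores:>2} core(s): "
              f"{years:8.3f} years ({years * 365.25:9.1f} days)")
```

Timing a few thousand iterations of the real code and plugging the measured cost into `runtime_years` gives a rough figure before committing to the full run.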

Good luck!

Best,

--
Wolfgang Viechtbauer
Department of Psychiatry and Neuropsychology
School for Mental Health and Neuroscience
Maastricht University, P.O. Box 616
6200 MD Maastricht, The Netherlands
Tel: +31 (43) 368-5248
Fax: +31 (43) 368-8689
Web: http://www.wvbauer.com



-----Original Message-----
From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org] On Behalf Of Brian J Mingus
Sent: Friday, April 15, 2011 08:29
To: Marion Dumas
Cc: r-help_at_r-project.org
Subject: Re: [R] simulations with very large number of iterations (1 billion)


On Thu, Apr 14, 2011 at 7:41 PM, Marion Dumas <mariouka_at_gmail.com> wrote:


> Hello R-help list
> I'm trying to run 1 billion iterations of a code with calls to random
> distributions to implement a data generating process and subsequent
> computation of various estimators that are recorded for further
> comparison of performance. I have two questions about how to achieve
> this: 1. the most important: on my laptop, R gives me an error message
> saying that it cannot allocate sufficient space for the matrix that is
> meant to record the results (a 1 billion by 4 matrix). Is this
> computer-specific? Are there ways to circumvent this limit? Or is it
> hopeless to run 1 billion iterations in one batch? (the alternative
> being to run, for example, 1000 iterations of a 1 million iteration
> process that spits out output files that can then be combined
> manually). 2. secondly: when I profile the code on a smaller number of
> iterations, it says that colSums is the function that has the longest
> self time. I am using this to compute stratum-specific treatment
> effects. My thinking was that the fastest way to compute mean outcome
> conditional on treatment for each stratum would be to combine all
> strata in one matrix and apply colSums-type functions on it. Maybe I
> am wrong and there are better ways?
>
> Thank you in advance for any help you may provide.
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
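The memory side of the quoted question can be checked with a few lines of arithmetic. This is a rough sketch, assuming each result is stored as an 8-byte double (as R does for numeric matrices) and ignoring any per-object overhead; the helper `matrix_bytes` and the constant names are made up for illustration. It compares the full 1 billion by 4 matrix with the chunked alternative of 1000 runs of 1 million iterations each.

```python
# Memory needed to hold simulation results, full run versus one chunk.
BYTES_PER_DOUBLE = 8  # assumption: results stored as 8-byte doubles
GIB = 2 ** 30
MIB = 2 ** 20


def matrix_bytes(n_rows, n_cols, bytes_per_value=BYTES_PER_DOUBLE):
    """Raw storage for a dense n_rows x n_cols numeric matrix."""
    return n_rows * n_cols * bytes_per_value


N_ITERATIONS = 1_000_000_000  # rows in the full results matrix
N_COLUMNS = 4                 # estimators recorded per iteration
CHUNK_ROWS = 1_000_000        # iterations per batch in the chunked plan

full = matrix_bytes(N_ITERATIONS, N_COLUMNS)
chunk = matrix_bytes(CHUNK_ROWS, N_COLUMNS)
n_chunks = N_ITERATIONS // CHUNK_ROWS

print(f"full matrix : {full / GIB:.1f} GiB")
print(f"one chunk   : {chunk / MIB:.1f} MiB ({n_chunks} chunks in total)")
```

Under these assumptions the full matrix needs about 30 GiB before any copies are made, while a single chunk needs about 30 MiB, so writing each batch to disk and combining the files afterwards keeps the working set small.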
The first thing you need to do is estimate the amount of memory that is going to be needed. Then, estimate the amount of time it's going to take. You probably need a 64-bit computer and 4-8 GB of memory at least. You may not want to use R, instead opting for C code and the GNU Scientific Library. If you can't write C code, Lua is pretty easy to learn, and GSL has been exposed through it in the GSL Shell: http://www.nongnu.org/gsl-shell/

--
Brian Mingus
Graduate student
Computational Cognitive Neuroscience Lab
University of Colorado at Boulder
Received on Fri 15 Apr 2011 - 08:43:38 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 15 Apr 2011 - 09:00:30 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.
