Re: [Rd] R scripts slowing down after repeated called to compiled code

From: Michael Braun <braunm_at_MIT.EDU>
Date: Sat, 26 May 2007 09:16:22 -0400

Oleg:

No, I'm not using any temp files. The only external library I use is the GSL library, and I have counted, and re-counted, my gsl_matrix(vector)_alloc and gsl_matrix(vector)_free statements to be sure that they balance.

In cases where they weren't balanced, memory usage would increase very rapidly. That does not seem to be happening here.
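
For anyone who wants to make that kind of balance check mechanical rather than counting statements by hand, here is a minimal sketch in plain C. The names tracked_alloc, tracked_free, report_live_blocks and one_iteration are hypothetical and are not part of GSL or R; the same pattern could presumably be wrapped around gsl_matrix_alloc/gsl_matrix_free and gsl_vector_alloc/gsl_vector_free. It only counts blocks; it does not detect fragmentation.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical bookkeeping: blocks currently allocated and total allocations. */
static long live_blocks = 0;
static long total_allocs = 0;

/* Wrapper around malloc that counts each successful allocation. */
static void *tracked_alloc(size_t nbytes)
{
    void *p = malloc(nbytes);
    if (p != NULL) {
        live_blocks++;
        total_allocs++;
    }
    return p;
}

/* Wrapper around free that counts each release of a non-NULL pointer. */
static void tracked_free(void *p)
{
    if (p == NULL)
        return;
    assert(live_blocks > 0);   /* more frees than allocs indicates a bug */
    live_blocks--;
    free(p);
}

/* Print the counters and return the number of blocks not yet freed. */
static long report_live_blocks(const char *label)
{
    fprintf(stderr, "%s: %ld allocations, %ld still live\n",
            label, total_allocs, live_blocks);
    return live_blocks;
}

/* Stand-in for one call into the compiled routine: allocate, work, free. */
static int one_iteration(size_t n)
{
    double *work = tracked_alloc(n * sizeof *work);
    if (work == NULL)
        return -1;
    for (size_t i = 0; i < n; i++)
        work[i] = (double) i;
    tracked_free(work);
    return 0;
}
```

Calling report_live_blocks() after every few hundred iterations should print a live count that returns to the same value each time if the allocations really do balance; a steadily growing count would point to the call site that leaks.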

What about setting the minimum size of the vector heap to something very large (say, 4GB)? Might that help? I really don't understand how that works, or what the output of gc() means, well enough to diagnose the problem.

Thanks,

Michael

-----Original Message-----
From: Oleg Sklyar [mailto:osklyar_at_ebi.ac.uk]
Sent: Saturday, May 26, 2007 7:42 AM
To: braunm_at_MIT.EDU
Cc: r-devel_at_r-project.org
Subject: Re: [Rd] R scripts slowing down after repeated called to compiled code

I work with images, with a lot of the processing done in C code. Quite often I allocate up to several gigabytes of memory there, in chunks of 10-15 MB each, plus hundreds of protected dims, names, etc. I had a similar problem only once, when, due to some erroneous use of an external library, internally created objects were not freed correctly. After correcting this, I have never seen any slowdown with a large number of objects created and manipulated. That memory leak was so difficult to track down that I would really suggest double- and triple-checking all the memory allocations.

Your code does not use any temp files? That could be a real pain.

Oleg

Dirk Eddelbuettel wrote:
> On 25 May 2007 at 19:12, Michael Braun wrote:
> | So I'm stuck. Can anyone help?
>
> It sounds like a memory issue. Your memory may just get fragmented.
> One tool that may help you find leaks is valgrind -- see the 'R
> Extensions' manual. I can also recommend the visualisers like kcachegrind
> (part of KDE).
>
> But it may not be a leak. I found that R just doesn't cope well with
> many large memory allocations and releases -- I often loop over data
> requests that I subset and process. This drives my 'peak' memory use to
> 1.5 or 1.7gb on a 32bit/multicore machine with 4gb, 6gb or 8gb (but
> 32bit leading to the hard 3gb per process limit). And I just can't
> loop over many such tasks. So I now use the littler frontend to script
> this, dump the processed chunks as Rdata files and later re-read the
> pieces. That works reliably.
>
> So one thing you could try is to dump your data in 'gsl ready' format
> from R, quit R, leave it out of the equation and then see what
> happens if you do the iterations in only GSL and your code.
>
> Hth, Dirk
>
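
As a rough illustration of that last suggestion, a standalone driver could read the dumped data with plain C I/O and run the iterations with R out of the picture. The sketch below assumes a text layout that is not from this thread: two integers (rows, then columns) followed by the values in row-major order. read_matrix and matrix_sum are hypothetical names, and matrix_sum merely stands in for the real computation. With GSL itself, gsl_matrix_fscanf could probably fill a preallocated gsl_matrix from the same kind of whitespace-separated values.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read a matrix from a text stream.
   ASSUMED layout: "rows cols" first, then rows*cols values, row-major.
   Returns a malloc'd array the caller must free, or NULL on any error. */
static double *read_matrix(FILE *fp, size_t *rows, size_t *cols)
{
    unsigned long r, c;
    double *data;
    size_t i, n;

    assert(fp != NULL && rows != NULL && cols != NULL);

    if (fscanf(fp, "%lu %lu", &r, &c) != 2 || r == 0 || c == 0)
        return NULL;

    n = (size_t) r * (size_t) c;
    data = malloc(n * sizeof *data);
    if (data == NULL)
        return NULL;

    for (i = 0; i < n; i++) {
        if (fscanf(fp, "%lf", &data[i]) != 1) {
            free(data);          /* truncated or malformed file */
            return NULL;
        }
    }
    *rows = (size_t) r;
    *cols = (size_t) c;
    return data;
}

/* Stand-in for the real per-iteration work: sum of all entries. */
static double matrix_sum(const double *data, size_t rows, size_t cols)
{
    double s = 0.0;
    size_t i;
    for (i = 0; i < rows * cols; i++)
        s += data[i];
    return s;
}
```

If the slowdown disappears when the loop runs this way, that would suggest the problem lies in how R handles the repeated allocations rather than in the compiled code.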

--
Dr Oleg Sklyar | EBI-EMBL, Cambridge CB10 1SD, UK | +44-1223-494466

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Sat 26 May 2007 - 13:19:51 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.