Re: [R] Reasons to Use R

From: Johann Hibschman <johannh_at_gmail.com>
Date: Sun 08 Apr 2007 - 19:07:25 GMT

On 4/6/07, Wilfred Zegwaard <wilfred.zegwaard@gmail.com> wrote:

> I'm not a programmer, but I have the experience that R is good for
> processing large datasets, especially in combination with specialised
> statistics.

This I find a little surprising, but maybe it's just a sign that I'm not experienced enough with R yet.

I can't use R for big datasets. At all. Big datasets take forever to load with read.table, R frequently runs out of memory, and nlm or gnlm never seem to actually converge to answers. By comparison, I can point SAS and NLIN at this data without a problem. (Of course, SAS is running on a pretty powerful dedicated machine with a big RAM disk, so that may account for part of the difference.)
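For reference, the read.table help page suggests that loading goes faster when the column types and row count are declared up front and comment scanning is turned off. A minimal sketch, assuming a tab-delimited file with three numeric columns; the file, its layout, and the names `path`, `dat`, and `big` are made up for illustration and are not the data discussed here:

```r
# Hypothetical stand-in data: a tab-delimited file with three numeric columns.
path <- tempfile(fileext = ".txt")
n <- 10000
dat <- data.frame(x = runif(n), y = rnorm(n), z = seq_len(n))
write.table(dat, path, sep = "\t", row.names = FALSE, quote = FALSE)

# Declaring colClasses and nrows, and disabling comment scanning,
# spares read.table from guessing types and repeatedly growing its buffers.
big <- read.table(path, header = TRUE, sep = "\t",
                  colClasses = c("numeric", "numeric", "integer"),
                  nrows = n, comment.char = "")
str(big)
```

How much this helps depends on the file and the machine, so it is worth timing both variants with system.time() on the real data.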

R's pass-by-value semantics also make it harder than it should be to deal with cases where it's crucial that you not make a copy of the data frame, for fear of running out of memory. Pass-by-reference would make implementing data transformations so much easier that I don't really understand how pass-by-value became the standard. (If there's a trick to doing in-place transformations, I've not found it.)
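One candidate for such a trick is an environment, which is the standard R object that does have reference semantics: a function that receives an environment changes the caller's object rather than a copy. A minimal sketch, where `store` and `add_log_column` are hypothetical names; whether R still makes a temporary internal copy of the data frame during the assignment depends on the R version, so this is not a guarantee of truly in-place modification:

```r
# An environment is passed by reference, unlike a data frame or list.
store <- new.env()
store$df <- data.frame(x = c(1, 2, 3))

# Hypothetical helper: adds a column to the data frame held in `env`.
# Nothing is returned; the change is visible through the caller's `store`.
add_log_column <- function(env) {
  env$df$log_x <- log(env$df$x)
  invisible(NULL)
}

add_log_column(store)
names(store$df)
```

The caller never reassigns the result, which is the pattern pass-by-value normally forces (`df <- transform_it(df)`).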

Right now, I'm considering starting on a project involving some big Monte Carlo integrations over the complicated posterior parameter distributions of a nonlinear regression model, and I have the strong feeling that R will just choke.
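As a point of reference for the kind of computation meant here, plain Monte Carlo integration estimates an expectation by averaging a function over random draws. A toy sketch, assuming a standard normal in place of the real posterior and a hypothetical integrand `g`; it is not the nonlinear regression model itself:

```r
set.seed(1)

# Toy stand-in for posterior draws: theta ~ N(0, 1).
n_draws <- 100000
theta <- rnorm(n_draws)

# Hypothetical integrand; E[theta^2] = 1 under N(0, 1).
g <- function(t) t^2

# Vectorized evaluation keeps the work in compiled code rather than an R loop.
estimate <- mean(g(theta))
std_error <- sd(g(theta)) / sqrt(n_draws)
c(estimate = estimate, std_error = std_error)
```

Memory use here scales with the number of draws, not with the size of the dataset, so the cost in a real model would be dominated by evaluating the posterior at each draw.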

R's great for small projects, but as soon as you have even a few hundred megs of data, it seems to break down.

If I'm doing things wrong, please tell me. :-) SAS is a beast to work with.



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Mon Apr 09 05:12:31 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Mon 09 Apr 2007 - 02:30:55 GMT.
