Re: [Rd] data frame subset patch, take 2

From: Marcus G. Daniels <mgd_at_santafe.edu>
Date: Tue 12 Dec 2006 - 17:32:14 GMT

Hi Martin,

Conventions for optimizing away long, useless row name vectors sound very useful. Nice timings too!
I've noticed that before and not been quite sure what to do. For example, the hdf5 module just gives up past a certain threshold, as the long vectors cause performance problems and HDF5 doesn't allow giant attributes anyway. The common case for me is no row names except numbers.

> Note however that some of these changes are backward
> incompatible. I do hope that the changes gaining efficiency
> for such large data frames are worth some adaption of
> current/old R source code..

On numerous occasions I've used 64-bit Altix systems, e.g. with a terabyte of RAM, for loading and preprocessing data, just so I can zip around in the image once it is done (either on that system or another). R works great for big datasets, even though it has a few of these rough edges.

Marcus



R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Wed Dec 13 23:42:02 2006

