Re: [Rd] arbitrary size data frame or other structs, curious about issues involved.

From: Jay Emerson <jayemerson_at_gmail.com>
Date: Mon, 20 Jun 2011 15:12:56 -0400


Mike,

Neither bigmemory nor ff is a "drop in" solution -- though useful, they are primarily for data storage and management, and for convenient access to subsets of the data. Direct analysis of the full objects via most R functions is not possible. There are many issues that could be discussed here (and have been, previously), including the use of 32-bit integer indexing. There is a nice section "Future Directions" in the R Internals manual that you might want to look at.
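
As a small illustration of the 32-bit integer indexing point, here is a sketch in base R only. It is not taken from bigmemory or ff, and the 50000 x 50000 matrix size is a made-up example:

```r
# Illustration only; base R, no extra packages.
int_max <- .Machine$integer.max      # largest value of R's 32-bit integer type
stopifnot(int_max == 2^31 - 1)

# Integer arithmetic past the limit yields NA (with a warning),
# not a larger integer.
overflowed <- suppressWarnings(int_max + 1L)
stopifnot(is.na(overflowed))

# Doubles hold whole numbers exactly up to 2^53, so a cell count beyond
# 2^31 - 1 has to be carried as a double rather than an integer.
n_cells <- 50000 * 50000             # hypothetical 50000 x 50000 matrix
stopifnot(is.double(n_cells), n_cells > int_max)
```

Any position that has to fit in an integer index therefore stops at 2^31 - 1, however much storage sits behind the object.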

Jay

> We keep getting questions on r-help about memory limits, and I was curious to know what issues are involved in making common classes like data.frame work with disk and intelligent swapping. That is, sure, you can always rely on the OS for virtual memory, but in theory it should be possible to make a data structure that somehow knows what pieces you will access next and can keep those somewhere fast. Now of course algorithms "should" act locally and be block oriented, but in any case they could communicate with data structures about upcoming access patterns, so the structure could see a few ms into the future and have the right stuff prefetched.
>
> I think things like "bigmemory" exist, but perhaps one issue was that this could not just drop in for data.frame, or does it already solve all the problems?
>
> Is memory management just a non-issue, or is there something that needs to be done to make large data structures work well?
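
For what "block oriented" can look like in practice, here is a minimal sketch in base R that computes a column mean one block of rows at a time, so only one block is held in memory at once. The temporary file, the column name "x" and the block size of 100 are all made up for the example, and this is not how bigmemory or ff work internally:

```r
# Illustration only: block-wise mean of one column using a base R connection.
# The file contents and the column name "x" are invented for this example.
path <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:1000), path, row.names = FALSE)

con <- file(path, open = "r")
header <- readLines(con, n = 1)          # consume the header line
col_names <- gsub('"', "", strsplit(header, ",")[[1]])

total <- 0
count <- 0
repeat {
  lines <- readLines(con, n = 100)       # next block of at most 100 rows
  if (length(lines) == 0) break          # end of file reached
  block <- read.csv(text = lines, header = FALSE, col.names = col_names)
  total <- total + sum(block$x)          # running sum over blocks
  count <- count + nrow(block)           # running row count
}
close(con)
unlink(path)

block_mean <- total / count
stopifnot(isTRUE(all.equal(block_mean, mean(1:1000))))
```

This only works because a mean can be accumulated block by block; functions that need the whole object at once cannot be handled this way.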

-- 
John W. Emerson (Jay)
Associate Professor of Statistics
Department of Statistics
Yale University
http://www.stat.yale.edu/~jay

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Mon 20 Jun 2011 - 19:15:29 GMT


Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 21 Jun 2011 - 11:40:21 GMT.
