Re: [Rd] slightly speeding up readChar()

From: Jeffrey Ryan <jeffrey.ryan_at_lemnica.com>
Date: Fri, 05 Aug 2011 09:37:24 -0500

Michael,

The mmap package currently provides an Ops method for comparisons. It returns something like what which(i == ...) returns in R, i.e. the matching indices, since a logical vector the same length as the mapped data would likely be too big to handle in many cases.
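To illustrate the size argument, here is a small sketch in base R only. It does not use the mmap package or its Ops method; the vector x and the value 42 are made up for the example. It just compares a full logical vector with the index vector that which() returns.

```r
# Base R only; an in-memory stand-in for data that would live on disk.
set.seed(1)
x <- sample.int(100L, 1e6, replace = TRUE)

hits_logical <- x == 42L             # one logical per element of x
hits_index   <- which(hits_logical)  # one integer per match only

# Compare the memory footprint of the two representations.
print(object.size(hits_logical), units = "Kb")
print(object.size(hits_index), units = "Kb")

# Both select the same elements.
stopifnot(identical(x[hits_index], x[hits_logical]))
```

When matches are sparse, the index vector is far smaller than the logical vector. When most elements match, the saving shrinks.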

At some point I'll implement mmap-to-mmap operations, though for vectorized ops that produce non-logical output (e.g. numeric), I haven't yet decided how that should be implemented. My thinking so far is a results buffer on disk or in memory, but anyone with additional (better!) suggestions, please feel free to send me ideas off-list.

I'll look to add some basic summary statistics as well.

Note that you need to have a binary representation on disk (via fwrite in C, or writeBin or as.mmap in R) for this to work. The package currently supports something like 16 data types, including bit logicals, fixed-width character strings (\0-delimited vectors), 4-byte floats, and 64-bit ints. The vignette covers a lot of the details.
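As a rough sketch of what a "binary representation on disk" looks like, the following uses only base R (writeBin and readBin), not the mmap API. The temporary file, the helper name chunk_sum, and the chunk size are assumptions made for illustration. It writes raw 8-byte doubles and then sums them one chunk at a time, so the whole file never has to be in memory.

```r
# Base R only: shows the flat binary layout, not the mmap package itself.
path <- tempfile(fileext = ".bin")   # hypothetical file for the example
x <- as.double(1:100000)

# Write raw 8-byte doubles: no header, no R serialization.
writeBin(x, path, size = 8L)
stopifnot(file.size(path) == 8 * length(x))

# Hypothetical helper: read fixed-size chunks and accumulate a sum.
chunk_sum <- function(path, chunk = 10000L) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  total <- 0
  repeat {
    buf <- readBin(con, what = "double", n = chunk, size = 8L)
    if (length(buf) == 0L) break     # end of file reached
    total <- total + sum(buf)
  }
  total
}

total <- chunk_sum(path)
stopifnot(isTRUE(all.equal(total, sum(x))))
unlink(path)
```

A file written this way is just the values back to back, which is the kind of fixed layout a memory map needs.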

Additionally, if you have struct-style data (think row-oriented, with varying types), you can use the struct() feature. This maps to an R list, but allows for very fast access if you are pulling complete rows.

example(mmap)
example(types)
example(struct)

The R-Forge version is ahead of the CRAN version at the moment, but I'll be pushing a new one to CRAN soon.

Jeff

On Fri, Aug 5, 2011 at 8:22 AM, Michael Lachmann <lachmann_at_eva.mpg.de> wrote:

>
> On 5 Aug 2011, at 1:20AM, Dirk Eddelbuettel wrote:
>
> > When you know the (fixed) structure of the data, the CRAN package mmap can be
> > a huge winner.
>
> Thanks! I didn't know that.
>
> Is there a package that provides methods for mmap, like sum(x) or maybe
> even y=x+z, where x and z are mmaps?
>
> I assume that once you mmap to a huge file, you do operations on it by
> working on chunks at a time... are there packages for that, or do I have to
> write my own code?
>
> Thanks!
>
> Michael
> ______________________________________________
> R-devel_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Jeffrey Ryan
jeffrey.ryan_at_lemnica.com

www.lemnica.com
www.esotericR.com

Received on Fri 05 Aug 2011 - 14:54:34 GMT


Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 05 Aug 2011 - 18:40:15 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel. Please read the posting guide before posting to the list.
