Re: [Rd] idea for "virtual matrix/array" class

From: Barry Rowlingson <B.Rowlingson_at_lancaster.ac.uk>
Date: Tue 24 Aug 2004 - 18:54:12 EST

Thomas Lumley wrote:
> On Mon, 23 Aug 2004, Tony Plate wrote:
>
>> One idea I was thinking about was to have a new class of object that
>> referred to data in a file on disk, and which had all the standard methods
>> of matrices and arrays, i.e., subsetting ("["), dim, dimnames, etc.
>
> This is what RPgSql does with proxy dataframes and what I did (read-only)
> for netCDF access. It's a good idea if you have a data format for which
> random access is fairly fast. I'm not sure that the standard serialized
> binary format satisfies this. Fixed-format text files would work, but
> free-format ones wouldn't -- seek() only helps when you can work out where
> to seek without reading all the data.
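
  To make the seek() point concrete, here is a minimal sketch of random access into a fixed-format file. It assumes every record occupies exactly the same number of bytes (including the newline), so the offset of record i can be computed without reading anything before it. The helper name read_record and the demo file are made up for illustration.

```r
# Hypothetical helper: fetch record i from a file of fixed-width records.
# 'width' is the record size in bytes, including the trailing newline.
read_record <- function(path, i, width) {
  con <- file(path, "rb")
  on.exit(close(con))
  # The offset is pure arithmetic, so no earlier record has to be read.
  seek(con, where = (i - 1) * width, origin = "start")
  bytes <- readBin(con, what = "raw", n = width)
  sub("\n$", "", rawToChar(bytes))
}

# Demo file: three records, each 5 characters plus a newline (6 bytes).
path <- tempfile()
writeBin(charToRaw("aaaaa\nbbbbb\nccccc\n"), path)
rec <- read_record(path, 3, width = 6)
```

  With a free-format file the multiplication above has nothing to work with, since record lengths vary and the only way to find record i is to scan from the start.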

  Just to join in on the 'done it' threads here, this is what my Rmap package does with DBF files (they are the database component of ESRI Shapefile maps). I use the dbf library from shapelib to access a DBF file just like a data frame.

  My dbf objects keep track of which rows and columns of the database file are selected, so it's possible to do:

  db1 = db[1:10,]

  and db1 is still a proxy object to the same DBF file as db, but with attributes that tell it that it only has rows 1 to 10 in it. If you really want a data frame, you just as.data.frame() it.
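
  For anyone who wants to try the idea, the bookkeeping described above can be sketched with a small S3 class. This is not the Rmap code: the class name fileproxy is invented, the backing file is a CSV rather than a DBF, and as.data.frame() here simply reads the whole file before subsetting, where a real implementation would fetch only the selected records.

```r
# Hypothetical S3 class "fileproxy": holds a path plus the selected rows/cols.
fileproxy <- function(path) {
  header <- read.csv(path, nrows = 1)       # peek at the column names
  nrows  <- length(readLines(path)) - 1     # data rows, excluding the header
  structure(list(path = path, rows = seq_len(nrows), cols = names(header)),
            class = "fileproxy")
}

# Subsetting only updates the stored selection; no data is loaded here.
"[.fileproxy" <- function(x, i, j) {
  if (!missing(i)) x$rows <- x$rows[i]
  if (!missing(j)) x$cols <- if (is.character(j)) j else x$cols[j]
  x
}

dim.fileproxy <- function(x) c(length(x$rows), length(x$cols))

# Materialise the selection as an ordinary data frame.
# Assumption: the whole file is read here, purely to keep the sketch short.
as.data.frame.fileproxy <- function(x, ...) {
  full <- read.csv(x$path)
  full[x$rows, x$cols, drop = FALSE]
}

# Demo with a throwaway CSV file.
path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:20, val = (1:20)^2), path, row.names = FALSE)
db  <- fileproxy(path)
db1 <- db[1:10, ]            # still a proxy onto the same file
df1 <- as.data.frame(db1)    # now a real data frame with 10 rows
```

  Selections compose, so db1[2:3, "val"] is again a proxy, and nothing is pulled from the file until as.data.frame() is called.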

  If you wanted to do this sort of thing for space-saving reasons you'd have to be very careful, since for some operations R might slurp it all into memory.

Baz

http://www.maths.lancs.ac.uk/Software/Rmap/



R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Tue Aug 24 18:56:51 2004