Re: [R] compress data on read, decompress on write

From: Christos Hatzis <christos.hatzis_at_nuverabio.com>
Date: Thu, 28 Feb 2008 13:49:04 -0500

Ramon,

If you are looking for a solution to your specific application (as opposed to a general compression/decompression mechanism), it might be worth checking out the Matrix package, which has facilities for storing and manipulating sparse matrices. Its sparse matrix classes store only the non-zero elements (for example, in the triplet representation: row index, column index and value), and this can greatly reduce memory use, depending on the size and degree of sparseness of the matrix.
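
As a rough illustration of the idea, here is a minimal sketch that builds a sparse matrix from triplets, compares its memory footprint with the dense equivalent, and recovers the triplets again. The dimensions, the number of non-zero entries and the random values are made up for the example and merely stand in for whatever your C code produces; it also assumes the non-zero entries arrive as (row, column, value) triplets.

```r
library(Matrix)  # recommended package, normally installed alongside R

set.seed(1)
n <- 2000L       # hypothetical matrix dimension
k <- 4000L       # hypothetical number of non-zero entries

# Made-up stand-in for the C output: triplets (row, column, value).
idx <- sample.int(n * n, k)        # distinct cells, so no duplicated positions
i <- (idx - 1L) %% n + 1L          # 1-based row indices
j <- (idx - 1L) %/% n + 1L         # 1-based column indices
x <- runif(k, min = 1, max = 2)    # strictly non-zero values

# Only the non-zero entries are stored.
sp <- sparseMatrix(i = i, j = j, x = x, dims = c(n, n))

# The same data as an ordinary dense matrix, for comparison only.
dense <- as.matrix(sp)
print(object.size(dense), units = "MB")
print(object.size(sp), units = "KB")

# Triplet form, e.g. to hand the data back to C.
# Note that the i and j slots are 0-based.
trip <- as(sp, "TsparseMatrix")
ti <- trip@i + 1L
tj <- trip@j + 1L
tx <- trip@x
```

The sparse object can then be kept as an ordinary list component, and only the index and value vectors need to be passed to the C function.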

-Christos

> -----Original Message-----
> From: r-help-bounces_at_r-project.org
> [mailto:r-help-bounces_at_r-project.org] On Behalf Of Ramon Diaz-Uriarte
> Sent: Thursday, February 28, 2008 1:18 PM
> To: r-help_at_stat.math.ethz.ch
> Subject: [R] compress data on read, decompress on write
>
> Dear All,
>
> I'd like to be able to have R store (in a list component) a
> compressed data set, and then write it out uncompressed.
> gzcon and gzfile work in exactly the opposite direction. What
> would be a good way to handle this?
>
> Details:
> ----------
>
> We have a package that uses C; part of the C output is a
> large sparse matrix. This is never manipulated directly by R,
> but always by the C code. However, we need to store that data
> somewhere (inside an R
> object) for further calls to the functions in our package.
> We'd like to store that matrix as part of the R object (say,
> as an element of a list). Ideally, it would be stored in as
> compressed a way as possible.
> Then, when we need to use that information, it would be
> decompressed and passed to the C function.
>
> I guess one way to do it is to have C deal with the
> compression and uncompression (e.g., using zlib or the bzip2
> libraries) and then use readBin, etc, from R. But, if I can,
> I'd like to avoid our C code having to call zlib, etc, so as
> to make our package easily portable.
>
>
> Thanks,
>
> R.
>
> --
> Ramon Diaz-Uriarte
> Statistical Computing Team
> Structural Biology and Biocomputing Programme Spanish
> National Cancer Centre (CNIO) http://ligarto.org/rdiaz
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Received on Thu 28 Feb 2008 - 18:48:24 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 28 Feb 2008 - 23:30:18 GMT.
