Re: [Rd] Compressing data for package builds

From: Simon Urbanek <>
Date: Thu, 16 Aug 2012 20:48:16 -0400

On Aug 16, 2012, at 5:08 PM, steven mosher wrote:

> Hi,
> I have two .rda files that I need to include in a package. I've placed
> them both in a data directory; after save() they are around 150Kb each.
> When I try to check the package I get the following warning
> Warning: large data file(s) saved inefficiently:
>                  size ASCII compress
>   zagoskin.rda  137Kb FALSE     none
> Note: significantly better compression could be obtained
>       by using R CMD build --resave-data
>                 old_size new_size compress
>   modpoll.rda      124Kb     78Kb       xz
>   zagoskin.rda     137Kb      6Kb    bzip2
> Both of these files modpoll.rda and zagoskin.rda have already been
> compressed from megabytes down to Kb.
> Also, the instruction "R CMD build --resave-data" doesn't do anything
> that I can see, so I must be using it wrong.

R CMD build is the preferred way to create your package tarball, so you simply add the --resave-data argument to your existing R CMD build call, the one that creates the tarball from your source directory. Can you elaborate on "doesn't do anything that I can see"? In what sense: no output, or no compression?
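
For concreteness, here is a minimal sketch of such a call. The directory name mypkg and the helper function build_with_resave are placeholders I made up for illustration, not anything from your setup. As far as I understand it, the re-saved data files end up in the tarball while the source directory is left as it was, so the check should then be run on the tarball rather than on the directory.

```shell
#!/bin/sh
# Hypothetical source directory name; replace with your own package directory.
PKG_DIR="${PKG_DIR:-mypkg}"

# Build the tarball and let R re-save the data files as compactly as it can
# (--resave-data is shorthand for --resave-data=best).
build_with_resave() {
    R CMD build --resave-data "$1"
}

if command -v R >/dev/null 2>&1 && [ -d "$PKG_DIR" ]; then
    build_with_resave "$PKG_DIR"
    # The tarball is written to the current directory; run
    # "R CMD check" on that file, not on the source directory.
    ls -l ./*.tar.gz
else
    echo "skipping: R or directory '$PKG_DIR' not found"
fi
```

If the warning still appears after that, it would help to know whether it came from checking the tarball or the directory.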


> Is there a piece of the puzzle I am missing or instructions better than
> these: I tried LazyDataCompression and my
> data.rdb is 90Kb.
> "Package *tools* has a couple of functions to help with data images:
> checkRdaFiles reports on the way the image was saved, and resaveRdaFiles will
> re-save with a different type of compression, including choosing the best
> type for that particular image.
> Some packages using 'LazyData' will benefit from using a form of
> compression other than gzip in the installed lazy-loading database. This
> can be selected by the --data-compress option to R CMD INSTALL or by using
> the 'LazyDataCompression' field in the DESCRIPTION file. Useful values are
> bzip2, xz and the default, gzip. The only way to discover which is best is
> to try them all and look at the size of the pkgname/data/Rdata.rdb file."
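
The two approaches in the quoted documentation could be tried roughly as follows. This is only a sketch under assumptions: the paths mypkg and mypkg/data and the helper try_lazy_compression are hypothetical, the package is assumed to have LazyData enabled in DESCRIPTION, and I believe a LazyDataCompression field in DESCRIPTION would take precedence over the command-line option, so it should be absent for the comparison. Note that resaveRdaFiles overwrites the files in place, so work on a copy if in doubt.

```shell
#!/bin/sh
# Hypothetical locations; replace with your own package and data directory.
PKG_DIR="${PKG_DIR:-mypkg}"
DATA_DIR="${DATA_DIR:-mypkg/data}"

# R expressions handed to Rscript; the directory arrives via commandArgs(TRUE).
CHECK_EXPR='print(tools::checkRdaFiles(commandArgs(TRUE)))'
RESAVE_EXPR='tools::resaveRdaFiles(commandArgs(TRUE), compress = "auto")'

# Install into a throwaway library once per compression type and show the
# size of the resulting lazy-load database (pkgname/data/Rdata.rdb).
try_lazy_compression() {
    for method in gzip bzip2 xz; do
        lib=$(mktemp -d)
        R CMD INSTALL --library="$lib" --data-compress="$method" "$1" >/dev/null 2>&1
        echo "$method:"
        ls -l "$lib"/*/data/Rdata.rdb
        rm -rf "$lib"
    done
}

if command -v Rscript >/dev/null 2>&1 && [ -d "$DATA_DIR" ]; then
    Rscript -e "$CHECK_EXPR" "$DATA_DIR"    # how each image is currently saved
    Rscript -e "$RESAVE_EXPR" "$DATA_DIR"   # re-save in place, best type per file
    Rscript -e "$CHECK_EXPR" "$DATA_DIR"    # confirm the new sizes
    try_lazy_compression "$PKG_DIR"
else
    echo "skipping: Rscript or directory '$DATA_DIR' not found"
fi
```

The first part addresses the size of the .rda files in the source package; the loop addresses the size of the installed lazy-load database, which is a separate question.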
Received on Fri 17 Aug 2012 - 00:50:28 GMT
