Re: [Rd] Compressing data for package builds

From: steven mosher <moshersteven_at_gmail.com>
Date: Thu, 16 Aug 2012 22:24:16 -0700

" R CMD build is how you preferably should be creating your package tar ball, so you simply add the --resave-data argument to your already existing R CMD build call which creates the tar ball from your source directory. So can you elaborate on "doesn't do anything I can see"? In what sense? No output? No compression? "

My tarball builds with > R CMD build mattools

where mattools is the name of the package, and I get a warning on R CMD check.

Things I tried:

R CMD build --resave-data
R CMD build mattools --resave-data
R CMD build --resave-data mattools

The first does nothing; the second and third both fail on unknown options. So I looked up the help for R CMD.

Now that I have figured out how to display the help for R CMD build, I see that

--resave-data must include a specification of the type of compression,

--resave-data="best" for example.

I ran that and got the same warning, indicating that the rda file had not been compressed.

* checking data for non-ASCII characters ... OK
* checking data for ASCII and uncompressed saves ... WARNING
  Warning: large data file(s) saved inefficiently:

                size ASCII compress
  zagoskin.rda 137Kb FALSE     none

  Note: significantly better compression could be obtained
        by using R CMD build --resave-data
               old_size new_size compress
  modpoll.rda     124Kb     78Kb       xz
  zagoskin.rda    137Kb      6Kb    bzip2
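
The same kind of report can be produced outside of R CMD check with the tools functions mentioned in the documentation quoted further down. The following is only a minimal sketch: the data frame and the temporary file are made-up stand-ins for zagoskin.rda, whose contents are not shown in this thread.

```r
library(tools)

# Stand-in data (an assumption for illustration, not the real zagoskin object).
demo <- data.frame(id = rep(1:100, 50), value = rep(c(0.5, 1.5), 2500))

# Save without compression, roughly the state the check complains about.
rda <- file.path(tempdir(), "demo.rda")
save(demo, file = rda, compress = FALSE)

# Report how the file was saved: size, ASCII, compress, version.
before <- checkRdaFiles(rda)
print(before)

# Re-save in place, letting R choose among gzip, bzip2 and xz.
resaveRdaFiles(rda, compress = "auto")

after <- checkRdaFiles(rda)
print(after)
```

Pointing checkRdaFiles at the real files in the package's data directory should show whether they are still reported as "none" after a given step.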

I am building under Windows, so I wonder if I am missing a system file required to do the compression.
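
One way to narrow this down, whatever the platform, is to look at the .rda files inside the built tarball. As far as I understand, R CMD build prepares the tarball from a copy of the sources, so any re-saving would show up there rather than in the source directory. A rough sketch follows; check_tarball_data is a hypothetical helper name and the tarball file name in the usage comment is assumed, not taken from the actual build.

```r
library(tools)

# Unpack a source tarball into a scratch directory and report how each
# .rda file under a data/ directory inside it was saved.
check_tarball_data <- function(tarball) {
  exdir <- tempfile("unpacked_")
  dir.create(exdir)
  untar(tarball, exdir = exdir)
  rda <- list.files(exdir, pattern = "\\.rda$", recursive = TRUE,
                    full.names = TRUE)
  # Keep only files whose parent directory is named "data".
  rda <- rda[basename(dirname(rda)) == "data"]
  checkRdaFiles(rda)
}

# Hypothetical file name; use whatever R CMD build actually wrote.
# check_tarball_data("mattools_0.1.tar.gz")
```

If the compress column still reads "none" for the files in the tarball, the re-save step did not run; if it reads bzip2 or xz, the warning may be coming from checking the source directory instead of the tarball.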

On Thu, Aug 16, 2012 at 5:48 PM, Simon Urbanek <simon.urbanek_at_r-project.org> wrote:

>
> On Aug 16, 2012, at 5:08 PM, steven mosher wrote:
>
> > Hi,
> >
> > I have two .rda files that I need to include in a package. I've placed
> > them both in a data directory
> > after save() they are around 150Kb each.
> >
> > When I try to check the package I get the following warning
> >
> > Warning: large data file(s) saved inefficiently:
> >                 size ASCII compress
> >   zagoskin.rda 137Kb FALSE     none
> >
> > Note: significantly better compression could be obtained
> >       by using R CMD build --resave-data
> >                old_size new_size compress
> >   modpoll.rda     124Kb     78Kb       xz
> >   zagoskin.rda    137Kb      6Kb    bzip2
> >
> > Both of these files modpoll.rda and zagoskin.rda have already been
> > compressed from megabytes down to Kb.
> >
> > Also, the instruction "R CMD build --resave-data" doesn't do anything
> > that I can see, so I must be using it wrong.
>
> R CMD build is how you preferably should be creating your package tar
> ball, so you simply add the --resave-data argument to your already existing
> R CMD build call which creates the tar ball from your source directory. So
> can you elaborate on "doesn't do anything I can see"? In what sense? No
> output? No compression?
>
> Cheers,
> Simon
>
>
> > Is there a piece of the puzzle I am missing or instructions better than
> > these: I tried LazyDataCompression and my
> > data.rdb is 90Kb.
> >
> > "Package *tools* has a couple of functions to help with data images:
> > checkRdaFiles reports on the way the image was saved, and resaveRdaFiles
> > will re-save with a different type of compression, including choosing the
> > best type for that particular image.
> >
> > Some packages using ‘LazyData’ will benefit from using a form of
> > compression other than gzip in the installed lazy-loading database. This
> > can be selected by the --data-compress option to R CMD INSTALL or by using
> > the ‘LazyDataCompression’ field in the DESCRIPTION file. Useful values are
> > bzip2, xz and the default, gzip. The only way to discover which is best is
> > to try them all and look at the size of the pkgname/data/Rdata.rdb file."
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-devel_at_r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
>

        [[alternative HTML version deleted]]



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Fri 17 Aug 2012 - 05:33:59 GMT
