Re: [Rd] R CMD build --resave-data

From: Hervé Pagès <hpages_at_fhcrc.org>
Date: Tue, 12 Apr 2011 17:53:32 -0700

Hi Uwe,

On 11-04-11 08:13 AM, Uwe Ligges wrote:
>
>
> On 11.04.2011 02:47, Hervé Pagès wrote:
>> Hi,
>>
>> More about the new --resave-data option
>>
>> As mentioned previously here
>>
>> https://stat.ethz.ch/pipermail/r-devel/2011-April/060511.html
>>
>> 'R CMD build' and 'R CMD INSTALL' handle this new option
>> inconsistently. The former does --resave-data="gzip" by default.
>> The latter doesn't seem to support the --resave-data= syntax:
>> the --resave-data flag must either be present or not. And by
>> default 'R CMD INSTALL' won't resave the data.
>>
>> Also, because now 'R CMD build' is resaving the data, shouldn't it
>> reinstall the package in order to be able to do this correctly?
>>
>> Here is why. There is this new warning in 'R CMD check' that complains
>> about files not of a type allowed in a 'data' directory:
>>
>>
>> http://bioconductor.org/checkResults/2.8/bioc-LATEST/Icens/lamb1-checksrc.html
>>
>>
>>
>> The Icens package also has .R files under data/ with things like:
>>
>> bet <- matrix(scan("CMVdata", quiet=TRUE),nc=5,byr=TRUE)
>>
>> i.e. the R code needs to access some of the text files located
>> in the data/ folder. So in order to get rid of this warning I
>> tried to move those text files to inst/extdata/ and I modified
>> the code in the .R file so it does:
>>
>> CMVdata_filepath <- system.file("extdata", "CMVdata", package="Icens")
>> bet <- matrix(scan(CMVdata_filepath, quiet=TRUE),nc=5,byr=TRUE)
>>
>> But now 'R CMD build' fails to resave the data because the package
>> was not installed first and the CMVdata file could not be found.
>>
>> Unfortunately, for a lot of people that means that the safe way to
>> build a source tarball now is with
>>
>> R CMD build --keep-empty-dirs --no-resave-data
>
>
> Hervé,
>
> actually it makes some sense to have these defaults from a CRAN
> maintainer's point of view:
>
> --keep-empty-dirs:
> we found many packages containing empty dirs unnecessarily and the idea
> is to exclude them at the build stage rather than at the later
> installation stage. Note that the package maintainer is supposed to run
> build (and knows if the empty dirs are to be included, the user who runs
> INSTALL does not).
>
> --no-resave-data:
> We found many packages with insufficiently compressed data. This should
> be fixed when building the package, not later when installing it, since
> the reduced size is useful in the source tarball already.
>
> So it does make some sense to have different defaults in build as
> opposed to INSTALL from my point of view (although I could live with
> different, though).

If you deliberately ignore the fact that 'R CMD INSTALL' is also used by developers to install from the *package source tree* (as opposed to end users, who install from a *source tarball*, even if they don't call it directly), then you have a point. So maybe I should have been more explicit about how much of a problem it can be for the *developer* when 'R CMD build' and 'R CMD INSTALL' behave differently by default.

Of course I'm not suggesting that 'R CMD INSTALL' should behave differently (by default) depending on whether it's used on a source tarball (mode 1) or a package source tree (mode 2).

I'm suggesting that, by default, the 3 commands (R CMD build + R CMD INSTALL in mode 1 and 2) behave consistently.

With the latest changes, and by default, 'R CMD INSTALL' is still doing the right thing, but not 'R CMD build' anymore.

I perfectly understand the intention behind those new flags, which is to try to "optimize" the resulting source tarball, but what would you think if 'gcc' had some optimization flags that could generate broken executables (under some circumstances) and those flags were enabled by default?

Note that I would have no problem with 'R CMD build' trying to resave the data by default if the current implementation of that feature was working properly, but unfortunately it's broken (see my previous email for the details).
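
For reference, here is a rough sketch of the three invocations I have in mind, using the flags from my previous email. It is only an illustration: the 'run' dry-run wrapper, the RUN variable and the tarball file name are placeholders I made up for this example, not something R provides. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: prints each command unless RUN=1 is set in the environment.
# 'run', RUN and the tarball name below are hypothetical placeholders.
pkg_dir="Icens"                 # package source tree discussed in this thread
tarball="Icens_x.y.z.tar.gz"    # placeholder name; the real version is not filled in

run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"                    # actually execute the command
    else
        printf '%s\n' "$*"      # dry run: just show what would be executed
    fi
}

# Build with the two new defaults switched off (the "safe" build from my
# previous email).
run R CMD build --keep-empty-dirs --no-resave-data "$pkg_dir"

# Mode 1: install from the source tarball.
run R CMD INSTALL "$tarball"

# Mode 2: install directly from the package source tree.
run R CMD INSTALL "$pkg_dir"
```

Setting RUN=1 would execute the commands for real, which of course requires R and the package sources to be present.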

Thanks,
H.

>
> If you need further arguments for the discussion: I also tend to use
> --no-vignettes nowadays if my code does not change considerably. ;-)
>
> Best wishes,
> Uwe
>
>
>
>> I hope the list of options/flags that we need to use to "fix" 'R CMD
>> build' (and make it consistent with R CMD INSTALL) is not going to
>> grow too much ;-)
>>
>> Thanks,
>> H.
>>
>>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages_at_fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Wed 13 Apr 2011 - 00:56:20 GMT

