Re: [Rd] CRAN policies

From: Paul Gilbert <pgilbert902_at_gmail.com>
Date: Thu, 29 Mar 2012 23:39:02 -0400

On 12-03-29 09:29 PM, Mark.Bravington_at_csiro.au wrote:
> I'm concerned this thread is heading the wrong way, towards
> techno-fixes for imaginary problems. R package-building is already
> encumbered with a huge set of complicated rules, and more
> instructions/rules eg for metadata would make things worse not better.
>
> RCMD CHECK on the 'mvbutils' package generates over 300 Notes about
> "no visible binding...", which inevitably I just ignore. They arise
> because RCMD CHECK is too "stupid" to understand one of my preferred
> coding idioms (I'm not going to explain what-- that's beside the
> point).

Actually, I think that is the point. If your code is generating that many notes then I think you should explain your idiom, so the checks can be made to accommodate it if it really is good. Otherwise, I'd be worried about the quality of your code.
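
To make the discussion concrete, here is a minimal sketch of how a perfectly functional idiom can produce these notes. It is a hypothetical illustration only: Mark has not described his idiom, and the function resid_df and the variables n_obs and n_par are invented for the example. It assumes the codetools package, which R CMD check uses for these checks, is available.

```r
library(codetools)

# Hypothetical idiom: variables are injected into the function's frame at
# run time, so a static code walker cannot see where they are defined.
resid_df <- function(pars) {
  list2env(pars, envir = environment())  # creates n_obs and n_par locally
  n_obs - n_par
}

# The code itself runs correctly: 10 - 2 gives 8.
resid_df(list(n_obs = 10, n_par = 2))

# checkUsage() cannot know that n_obs and n_par will exist at run time,
# so it reports a "no visible binding" message for each of them.
notes_idiom <- character()
checkUsage(resid_df, name = "resid_df",
           report = function(msg) notes_idiom <<- c(notes_idiom, msg))
cat(notes_idiom, sep = "")
```

If an idiom like this were explained, the checker could in principle be taught to recognise it; without an explanation, the notes look the same as those from a genuine mistake.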

> And RCMD CHECK always will be too "stupid" to understand everything
> that a rich language like R might quite reasonably cause experienced
> coders to do.

Possibly the interpreter is too stupid to understand it too?

> It should not be CRAN's business how I write my code, or even whether
> my code does what it is supposed to. It might be CRAN's business to
> try to work out whether my code breaks CRAN's policies, eg by causing
> R to crash horribly-- that's presumably what Warnings are for (but
> see below). And maybe there could be circumstances where an automatic
> check might be "worried" enough to alert the CRANia and require manual
> explanation and emails etc from a developer, but even that seems
> doomed given the growing deluge of packages.
>
> RCMD CHECK currently functions both as a "sanitizer" for CRAN, and as
> a developer-tool. But the fact that the one program does both things
> seems accidental to me, and I think this dual-use is muddying the
> discussion. There's a big distinction between (i) code-checks that
> developers themselves might or might not find useful-- which should
> be left to the developer, and will vary from person to person--

I think this is a case of two heads being better than one. I did lots of checks before the CRAN checks existed, but the CRAN checks still found bugs in code that I considered very mature, including bugs in code that has been running without noticeable problems for over 15 years. Despite all the noise today, most of us are only talking about a small inconvenience around the intended meaning of "note", not about whether quality control is a bad thing. I've found the errors and warnings are always valid, even though I do not always like having to fix the bugs, and the notes are most often valid too. There are a few false positives, so the checks that give notes are not yet reliable enough to give warnings or errors. They should be eventually, so one should usually consider fixing the package code.
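
As a rough sketch of the kind of bug I mean, assume a function with a misspelled variable on a branch that is almost never taken. The function safe_mean and the typo totl are made up for illustration; only codetools::checkUsage() is a real call here.

```r
library(codetools)

# Hypothetical function: the misspelled 'totl' sits on a branch that is
# only reached for empty input, so ordinary use never trips over it.
safe_mean <- function(x) {
  total <- sum(x)
  if (length(x) == 0) {
    return(totl)  # typo for 'total'
  }
  total / length(x)
}

# Typical input works, which is why the bug can go unnoticed for years.
safe_mean(1:4)

# The static check flags the undefined name without running that branch.
notes_typo <- character()
checkUsage(safe_mean, name = "safe_mean",
           report = function(msg) notes_typo <<- c(notes_typo, msg))
cat(notes_typo, sep = "")
```

The message has the same "no visible binding" form as a false positive, which is why it seems worth reading the notes rather than ignoring them wholesale.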

> and (ii) code-checks that CRAN enforces for its own peace-of-mind.

I think of this as being for the peace-of-mind of your package users.

> Maybe it's convenient to have both functions in the same place, and
> it'd be fine to use Notes for one and Warnings for the other, but the
> different purposes should surely be kept clear.
>
> Personally, in building over 10 packages (only 2 on CRAN), I haven't
> found RCMD CHECK to be of any use, except for the code-documentation
> and example-running bits. I know other people have different
> opinions, but that's the point: one-size-does-not-fit-all when it
> comes to coding tools.
>
> And wrto the Warnings themselves: I feel compelled to point out that
> it's logically impossible to fully check whether R code will do bad
> things. One has to wonder at what point adding new checks becomes
> futile or counterproductive. There must be over 2000 people who have
> written CRAN packages by now; every extra check and non-back-
> compatible additional requirement runs the risk of generating false-
> negatives and incurring many extra person-hours to "fix"
> non-problems.
> Plus someone needs to document and explain the check (adding to the
> rule mountain), plus there is the time spent in discussions like
> this..!

Bugs in your packages will require users to waste a lot of time too, and possibly reach faulty results with much more serious consequences. Just because perfection may never be attained does not mean that progress should not be attempted, in small steps. Compared to Statlib, which basically followed your recommended approach, CRAN is a vast improvement.

Paul
>
> Mark
>
> Mark Bravington
> CSIRO CMIS
> Marine Lab
> Hobart
> Australia
> ________________________________________
> From: r-devel-bounces_at_r-project.org [r-devel-bounces_at_r-project.org] On Behalf Of Hadley Wickham [hadley_at_rice.edu]
> Sent: 30 March 2012 07:42
> To: William Dunlap
> Cc: r-devel_at_stat.math.ethz.ch; Spencer Graves
> Subject: Re: [Rd] CRAN policies
>
>> Most of that stuff is already in codetools, at least when it is
>> checking functions with checkUsage(). E.g., arguments of ~ are not
>> checked. The expr argument to with() will not be checked if you add
>> skipWith=TRUE to the call to checkUsage.
>>
>> > library(codetools)
>>
>> > checkUsage(function(dataFrame) with(dataFrame, {Num/Den ; Resp ~ Pred}))
>> <anonymous>: no visible binding for global variable 'Num' (:1)
>> <anonymous>: no visible binding for global variable 'Den' (:1)
>>
>> > checkUsage(function(dataFrame) with(dataFrame, {Num/Den ; Resp ~ Pred}), skipWith=TRUE)
>>
>> > checkUsage(function(dataFrame) with(DataFrame, {Num/Den ; Resp ~ Pred}), skipWith=TRUE)
>> <anonymous>: no visible binding for global variable 'DataFrame'
>>
>> The only part that I don't see is the mechanism to add code-walker
>> functions to the environment in codetools that has the standard list
>> of them for functions with nonstandard evaluation:
>> > objects(codetools:::collectUsageHandlers, all=TRUE)
>> [1] "$" "$<-" ".Internal"
>> [4] "::" ":::" "@"
>> [7] "@<-" "{" "~"
>> [10] "<-" "<<-" "="
>> [13] "assign" "binomial" "bquote"
>> [16] "data" "detach" "expression"
>> [19] "for" "function" "Gamma"
>> [22] "gaussian" "if" "library"
>> [25] "local" "poisson" "quasi"
>> [28] "quasibinomial" "quasipoisson" "quote"
>> [31] "Quote" "require" "substitute"
>> [34] "with"
> It seems like we really need a standard way to add metadata to functions:
>
> attr(with, "special_args")<- "expr"
> attr(lm, "special_args")<- c("formula", "weights", "subset")
>
> This would be useful because it could automatically contribute to the
> documentation.
>
> Similarly,
>
> attr(my.new.method, "s3method")<- c("my.new", "method")
>
> could be useful.
>
> Hadley
>
>
> --
> Assistant Professor / Dobelman Family Junior Chair
> Department of Statistics / Rice University
> http://had.co.nz/
>
> ______________________________________________
> R-devel_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Fri 30 Mar 2012 - 03:41:58 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 30 Mar 2012 - 05:50:37 GMT.
