Re: [Rd] Wish list

From: Robert Gentleman
Date: Mon 01 Jan 2007 - 18:37:09 GMT

Gabor Grothendieck wrote:
> On 1/1/07, Duncan Murdoch wrote:

>> A few comments thrown in, and some general comments at the bottom.
>> On 1/1/2007 1:28 AM, Gabor Grothendieck wrote:
>>> This is my 2007 New Year wishlist for R features:
>>> 1. Matrix Multiplication
>>>    Enhance matrix multiplication to work with multidimensional
>>>    arrays such that the last dimension of the first multiplicand
>>>    must equal the first dimension of the second. See:
>>> 2. Grid
>>>    - logical-valued function as first arg of grid.edit
>>>    - transparency under Windows (not sure if this involves grid
>>>      or just the Windows graphics device)
>>>    - shading patterns
>>>    - more interactivity features
>>>    - safe way to get name of a grid object, e.g.
>>>         names.vpPath <- names.viewport <- function(x) x$name
>>>    - safe way to get children of a grid object
>>>         getChildren.viewport <- function(x) x$children
>>>      and the order; see:
>>>    - facility for using a name, viewport or vpPath interchangeably
>>>      so that, for example, any of them can be specified in
>>>      print.trellis(..., or draw.key(..., vp=...)
>>> 3. Lattice.
>>>    - make panel functions generic
>>>    - allow print.trellis args to be specified in xyplot, etc.
>>>    - shading patterns (once grid implements them)
>>>    - safe way to access lattice:::getStatus and lattice:::updateList
>>>    - allow name, viewport or vpPath to be specified in
>>>      arg of print.trellis (and vp= arg of draw.key?)
>>>    - document parameters, i.e. those output from trellis.par.get()
>>>    - support for groups in histogram
>>> 4. Higher level Windows clipboard functions.
>>>    Since R 2.3.0 R can handle non-text objects on the Windows
>>>    clipboard.  We now need some higher level functionality that
>>>    makes use of that to read in non-text from the clipboard.  For
>>>    example, one can select a table on an HTML page in Internet
>>>    Explorer and invoke copy and it will copy it to the clipboard
>>>    in a non-text format.  If one invokes paste in Excel, Excel will
>>>    automatically detect the non-text format and copy it in the
>>>    expected way so that it appears in Excel one table cell per
>>>    Excel cell.  However, R does not currently support this level
>>>    of integration. (Current workaround is to paste it into Excel
>>>    and then copy it back out of Excel.  Excel will insert tabs
>>>    between text that is so copied.)
>> R doesn't have HTML parsing built in, so this would be a fairly major
>> addition.  It's a much better idea to write a package to do this.  If
>> the R clipboard support is missing something that such a package would
>> need, that would be a reasonable addition to R.
>>> 6. Allow attributes to be associated with an environment
>>> variable without having them associated with the environment
>>> itself.  This would allow more powerful inheritance in
>>> the case of subclasses of environment.
>>> See:
>>> and subsequent postings in that thread.  Any package
>>> that uses the list(env = whatever) idiom to define
>>> objects could make use of this.
>> As I said in that thread, this is not a good suggestion.

> Yes, but I disagree with that assessment and I am not the
> only one.

   Nor is Duncan alone in his.

   best wishes


>>> 7. documentation standards for packages
>>>    - NEWS/ChangeLog (also should be accessible from CRAN page for package
>>>      and should be included in built version of package)
>>>    - package?mypackage
>> I don't understand the second part of this.  We already support a
>> package?mypackage topic, and recommend that people put it in.  Are you
>> saying packages should be rejected if they don't?  That's an awful lot
>> of work you're asking other people to do.

> There should be some guidelines as to what goes into mypackage-package.Rd .
>>> 8. Need to be able to distinguish between ordinary missing values
>>> and structurally missing ones.
>> I think this is something that you need to do in a different way.  There
>> are tons of possible semantics for what NA should mean.  I don't think
>> this should be made more complicated for everyone.

> Although one does not want to overcomplicate things, the fact is that
> there are two issues here, structural and non-structural, and trying to
> force them into a single construct is not simplifying -- rather it
> fails to adequately model what is required.
>>> 9. bidirectional pipes in Windows
>>> 10. Create a log updated at a regular frequency (daily or real time)
>>> that tracks all changes on CRAN, e.g.
>>>       Date(GMT)           Package Version Action
>>>       2006-09-20 21:22:01 mypkg   1.0.1   new
>>>       2006-09-20 22:00:23 mypkg2  0.2.1   updated
>>> 11. make integrate generic.  Ryacas could use that.
>>> 12. Remove all R CMD dependencies on the find.exe command.  find is a
>>>     built-in command in Windows and having find.exe on my path causes
>>>     problems with other programs.
>> A simpler fix for this would be for you to define a wrapper for R CMD
>> that installs the R tools path before executing, and uninstalls it
>> afterwards.  But this is unnecessary for most people, because
>> Microsoft's find.exe is pretty rarely used.

> Anyone who uses batch files will use it quite a bit. It certainly causes
> me problems on an ongoing basis and is an unacceptable conflict in
> my opinion.
> I realize that it's not entirely of R's doing, but it would be best if R did not
> make it worse by requiring the use of find.
>>> 13. Make upper/lower case of simplify/SIMPLIFY consistent on all
>>>     apply commands and add a simplify= arg to by.
>> It would have been good not to introduce the inconsistency years ago,
>> but it's too late to change now.

> It's not too late to add it to by().
> Also note that the gsubfn package does have a workaround for this. In gsubfn
> one can preface any R function with fn$ and if that is done then the function
> can have a simplify= argument which fn$ intercepts and processes, e.g.
> library(gsubfn)
> fn$by(CO2[4:5], CO2[2], x ~ coef(lm(uptake ~ ., x)), simplify = rbind)
> fn$ can also interpret formulas as functions (and does quasi-Perl interpolation
> in strings), so the formula in the third argument is treated as equivalent to
> the anonymous function: function(x) coef(lm(uptake ~ ., x)) .
> More examples are in the gsubfn vignette.
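
For reference, the inconsistency raised in item 13 is visible in base R:
sapply() spells the argument simplify= while mapply() spells it SIMPLIFY=.
The sketch below shows both, plus one common base-R way to bind the pieces
returned by by() into a matrix without gsubfn. It assumes the built-in CO2
dataset used in the thread; the object names are illustrative only.

```r
# sapply() uses lower-case simplify=
sq      <- sapply(1:3, function(i) i^2)                    # numeric vector
sq_list <- sapply(1:3, function(i) i^2, simplify = FALSE)  # list

# mapply() uses upper-case SIMPLIFY=
sums      <- mapply(function(a, b) a + b, 1:3, 4:6)                    # numeric vector
sums_list <- mapply(function(a, b) a + b, 1:3, 4:6, SIMPLIFY = FALSE)  # list

# by() has no simplify= argument; rbind the per-group results by hand.
# Each group yields the coefficients of uptake regressed on conc.
fits <- by(CO2[4:5], CO2[2], function(x) coef(lm(uptake ~ ., x)))
coef_mat <- do.call(rbind, fits)  # one row per level of Type
```

This is only an illustration of the argument names and of a manual
workaround, not a proposal for how by() itself should be changed.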
>>> 14. better reporting of location of errors and warnings in R CMD check.
>> This is in the works, but probably not for 2.5.x.

> Great. This will be very welcome.
>>> 15. tcl tile library (needs tcl 8.5 or to be compiled in with 8.4)
>>> 16. extend aggregate to allow vector valued functions:
>>>     aggregate(CO2[4:5], CO2[1:2], function(x) c(mean = mean(x), sd = sd(x)))
>>>     [summaryBy in doBy package and cast in reshape package can already
>>>     do similar things but this seems sufficiently fundamental that it
>>>     ought to be in the base of R]
>>> 17. All OSes should support input= arg of system.
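
As a rough illustration of what item 16 asks for, the sketch below applies
a vector-valued summary function per group using only split() and sapply().
The helper name agg_vec is hypothetical and not part of base R; it handles a
single numeric column, and it assumes the built-in CO2 dataset from the
example in item 16.

```r
# Hypothetical helper: apply a vector-valued FUN to x within each group.
# 'by' is a list of grouping factors; unused combinations are dropped.
agg_vec <- function(x, by, FUN) {
  groups <- split(x, by, drop = TRUE)  # one numeric vector per group
  t(sapply(groups, FUN))               # one row per group, one column per statistic
}

# Mean and standard deviation of uptake for each Plant/Type combination.
res <- agg_vec(CO2$uptake,
               list(CO2$Plant, CO2$Type),
               function(x) c(mean = mean(x), sd = sd(x)))
```

The result is a plain matrix whose row names identify the groups, rather
than a data frame with separate grouping columns as aggregate() returns.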
>>> My previous New Year wishlists are here:
>> To anyone still reading:
>> Many of the suggestions above would improve R, but they're unlikely to
>> happen unless someone volunteers to do them.  I'd suggest picking
>> whichever item from this or some other list you think is the highest
>> priority, and posting a specific proposal to this list about how to do it.
>>  If you get a negative response or no response, move on to the next
>> one, or put it into a contributed package instead.

> I think it works best when contributors develop their software in
> contributed packages since it avoids squabbles with the core group.
> The core group can then integrate these into R itself if it seems warranted.
>> When you make the proposal, consider how much work you're asking other
>> people to do, and how much you're volunteering to do yourself.  If
>> you're asking others to do a lot, then the suggestion had better be
>> really valuable to *them*.

> The implementation effort should not be a significant consideration in
> generating wish lists. What should be considered is what is really needed.
> It's better to know what you need and then later decide whether to implement
> it or not than to suppress articulating the need. Otherwise the development
> is driven by what is easy to do rather than what is needed.
> ______________________________________________
> mailing list
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024

Received on Tue Jan 02 05:40:53 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.