Date: Sun, 28 Nov 2010 11:44:56 +0100

On 28/11/10 11:30, Prof Brian Ripley wrote:

>> On 28/11/10 10:30, Prof Brian Ripley wrote:
>>
>>> Is sequence used enough to warrant this? As the help page says
>>> Note that ‘sequence <- function(nvec) unlist(lapply(nvec,
>>> seq_len))’ and it mainly exists in reverence to the very early
>>> history of R.
>> I don't know. Would it be used more if it were more efficient?
> It is for you to make a compelling case for others to do work (maintain
> changed code) for your wish.
No trouble. The patch is there; if anyone finds it interesting or compelling, they will speak up, I suppose.

Otherwise it is fine for me if it ends up in no man's land. I have the code; if I want to use it, I can squeeze it into a package.

>>> I regard it as unsafe to assume that NA_INTEGER will always be negative,
>>> and bear in mind that at some point not so far off R integers (or at
>>> least lengths) will need to be more than 32-bit.
>> Sure. Updated and dressed up as a patch.
>> I've made it a .Call because I'm not really comfortable with
>> .Internal, etc ...
>> Do you mean that I should also use something else instead of "int" and
>> "int*"? Is there some future-proof typedef or macro for the type
>> associated with INTSXP?
> Not yet. I was explaining why NA_INTEGER might change.
Sure, thanks for the reminder.
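
For illustration only, the two points raised above (testing for NA explicitly instead of relying on its sign, and not assuming the total length fits in 32 bits) could be sketched in plain C, outside the R API. This is not the submitted patch: `NA_INT_SENTINEL` is a stand-in for R's NA_INTEGER, and `seq_total` and `seq_fill` are hypothetical helper names. Unlike the inline version quoted below, this sketch accepts zero entries.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for R's NA_INTEGER (assumption: some reserved int value). */
#define NA_INT_SENTINEL INT_MIN

/* Validate x[0..n-1] and return the total output length as a 64-bit value.
 * Returns -1 if any entry is NA or negative, or if the sum would overflow. */
static int64_t seq_total(const int *x, size_t n)
{
    assert(x != NULL || n == 0);
    int64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        if (x[i] == NA_INT_SENTINEL) return -1; /* explicit NA test, independent of sign */
        if (x[i] < 0) return -1;                /* negative counts are rejected */
        if (total > INT64_MAX - x[i]) return -1; /* guard the 64-bit accumulator */
        total += x[i];
    }
    return total;
}

/* Write 1..x[i] for each i into out; out must hold seq_total(x, n) ints.
 * Assumes x has already been validated by seq_total. */
static void seq_fill(const int *x, size_t n, int *out)
{
    assert(x != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        for (int j = 0; j < x[i]; j++)
            *out++ = j + 1;
}
```

With the input {3, 0, 2}, `seq_total` gives 5 and `seq_fill` writes 1 2 3 1 2; the zero entry contributes nothing, as with seq_len(0).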

On Sun, 28 Nov 2010, Romain Francois wrote:

>>>> Hello,
>>>> Based on yesterday's R-help thread (help: program efficiency), and
>>>> following Bill's suggestions, it appeared that sequence:
>>>>
>>>> > sequence
>>>> function (nvec)
>>>> unlist(lapply(nvec, seq_len))
>>>> <environment: namespace:base>
>>>> could benefit from being written in C to avoid unnecessary memory
>>>> allocations.
>>>> I made this version using inline:
>>>>
>>>> require( inline )
>>>> sequence_c <- local( {
>>>>     fx <- cfunction( signature( x = "integer"), '
>>>>         int n = length(x) ;
>>>>         int* px = INTEGER(x) ;
>>>>         int x_i, s = 0 ;
>>>>         /* error checking */
>>>>         for( int i=0; i<n; i++){
>>>>             x_i = px[i] ;
>>>>             /* this includes the check for NA */
>>>>             if( x_i <= 0 ) error( "needs non negative integer" ) ;
>>>>             s += x_i ;
>>>>         }
>>>>         SEXP res = PROTECT( allocVector( INTSXP, s ) ) ;
>>>>         int * p_res = INTEGER(res) ;
>>>>         for( int i=0; i<n; i++){
>>>>             x_i = px[i] ;
>>>>             for( int j=0; j<x_i; j++, p_res++)
>>>>                 *p_res = j+1 ;
>>>>         }
>>>>         UNPROTECT(1) ;
>>>>         return res ;
>>>>     ' )
>>>>     function( nvec ){
>>>>         fx( as.integer(nvec) )
>>>>     }
>>>> })
>>>>
>>>> And here are some timings:
>>>>
>>>> > x <- 1:10000
>>>> > system.time( a <- sequence(x ) )
>>>>    user  system elapsed
>>>>   0.191   0.108   0.298
>>>> > system.time( b <- sequence_c(x ) )
>>>>    user  system elapsed
>>>>   0.060   0.063   0.122
>>>> > identical( a, b )
>>>> [1] TRUE
>>>> > system.time( for( i in 1:10000) sequence(1:10) )
>>>>    user  system elapsed
>>>>   0.119   0.000   0.119
>>>> > system.time( for( i in 1:10000) sequence_c(1:10) )
>>>>    user  system elapsed
>>>>   0.019   0.000   0.019
>>>> I would write a proper patch if someone from R-core is willing to push
>>>> it.
>>>> Romain
