[Rd] faster base::sequence

From: Romain Francois <romain_at_r-enthusiasts.com>
Date: Sun, 28 Nov 2010 09:45:35 +0100


Hello,

Based on yesterday's R-help thread (help: program efficiency), and following Bill's suggestions, it appeared that sequence:

> sequence

function (nvec)
unlist(lapply(nvec, seq_len))
<environment: namespace:base>

could benefit from being written in C to avoid unnecessary memory allocations.

I made this version using inline:

require( inline )
sequence_c <- local( {

     fx <- cfunction( signature( x = "integer"), '
         int n = length(x) ;
         int* px = INTEGER(x) ;
         int x_i, s = 0 ;
         /* error checking */
         for( int i=0; i<n; i++){
             x_i = px[i] ;
             /* this includes the check for NA */
             if( x_i < 0 ) error( "needs non negative integer" ) ;
             s += x_i ;
         }

         SEXP res = PROTECT( allocVector( INTSXP, s ) ) ;
         int * p_res = INTEGER(res) ;
         for( int i=0; i<n; i++){
             x_i = px[i] ;
             for( int j=0; j<x_i; j++, p_res++)
                 *p_res = j+1 ;
         }
         UNPROTECT(1) ;
         return res ;
     ' )
     function( nvec ){
         fx( as.integer(nvec) )
     }

})
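
For readers without R at hand, the same two-pass idea (sum the counts first, allocate once, then fill) can be sketched in plain C. This is only an illustration under stated assumptions, not the proposed patch: sequence_fill is a hypothetical name, malloc stands in for allocVector, and a NULL return stands in for error(). Testing for a negative count also covers NA on the R side, since NA_INTEGER is INT_MIN, and it lets zero counts through, as sequence() does. The running total is kept in a long long so that an overflowing sum is detected instead of wrapping around.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper, not part of R: writes 1..nvec[0], 1..nvec[1], ...
 * into one freshly allocated array. Returns NULL on a negative count,
 * on a total that does not fit in an int, or on allocation failure.
 * On success *out_len receives the total length; the caller frees the result. */
int *sequence_fill(const int *nvec, int n, int *out_len)
{
    assert(out_len != NULL);
    long long total = 0;

    /* Pass 1: validate the counts and compute the output length. */
    for (int i = 0; i < n; i++) {
        if (nvec[i] < 0) return NULL;       /* negative count (or INT_MIN) */
        total += nvec[i];
        if (total > INT_MAX) return NULL;   /* length would overflow an int */
    }

    /* One allocation for the whole result; request at least one element
     * because malloc(0) is allowed to return NULL. */
    int *res = malloc((total > 0 ? (size_t)total : 1) * sizeof *res);
    if (res == NULL) return NULL;

    /* Pass 2: fill each run 1..nvec[i] in place. */
    int *p = res;
    for (int i = 0; i < n; i++)
        for (int j = 1; j <= nvec[i]; j++)
            *p++ = j;

    *out_len = (int)total;
    return res;
}
```

Called with the counts {3, 0, 2}, this sketch yields the five values 1 2 3 1 2, matching what sequence(c(3, 0, 2)) returns in R.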

And here are some timings:

> x <- 1:10000
> system.time( a <- sequence(x ) )

   user  system elapsed
  0.191   0.108   0.298

> system.time( b <- sequence_c(x ) )
   user  system elapsed
  0.060   0.063   0.122

> identical( a, b )

[1] TRUE
> system.time( for( i in 1:10000) sequence(1:10) )
   user  system elapsed
  0.119   0.000   0.119

> system.time( for( i in 1:10000) sequence_c(1:10) )
   user  system elapsed
  0.019   0.000   0.019


I would write a proper patch if someone from R-core is willing to push it.

Romain

-- 
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/9VOd3l : ZAT! 2010
|- http://bit.ly/c6DzuX : Impressionnism with R
`- http://bit.ly/czHPM7 : Rcpp Google tech talk on youtube

Received on Sun 28 Nov 2010 - 08:48:21 GMT