[Rd] suggestion how to use memcpy in duplicate.c

From: Matthew Dowle <mdowle_at_mdowle.plus.com>
Date: Wed, 21 Apr 2010 16:54:01 +0100


From copyVector in duplicate.c:

void copyVector(SEXP s, SEXP t)
{
    int i, ns, nt;

    nt = LENGTH(t);
    ns = LENGTH(s);
    switch (TYPEOF(s)) {
...
    case INTSXP:
        for (i = 0; i < ns; i++)
            INTEGER(s)[i] = INTEGER(t)[i % nt];
        break;
...

Could that be replaced with:

    case INTSXP:
        for (i = 0; i < ns/nt; i++)
            memcpy((char *)DATAPTR(s) + i*nt*sizeof(int), (char *)DATAPTR(t), nt*sizeof(int));
        break;

and similarly for the other types in copyVector? This won't help regular vector copies, since those seem to be done by the DUPLICATE_ATOMIC_VECTOR macro (see the next suggestion below), but it should help copyMatrix, which calls copyVector; scan.c, which calls copyVector on three lines; dcf.c (once); and dounzip.c (once).
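One detail worth checking in the proposal above: a loop bounded by ns/nt copies only whole multiples of t, so when ns is not a multiple of nt the trailing ns % nt elements would not be written, whereas the existing i % nt loop fills them. A minimal standalone sketch of a recycling copy that also handles the remainder is below. The helper name recycle_copy_int and its plain-pointer signature are hypothetical, for illustration only, and are not part of R's sources.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not in duplicate.c): fill dst, of length ns, by
   recycling src, of length nt, giving the same result as
   dst[i] = src[i % nt] for every i < ns.  dst and src must not overlap. */
void recycle_copy_int(int *dst, size_t ns, const int *src, size_t nt)
{
    if (nt == 0)
        return;                     /* nothing to recycle from */

    size_t full = ns / nt;          /* number of complete copies of src */
    size_t rest = ns % nt;          /* length of the trailing partial copy */

    for (size_t i = 0; i < full; i++)
        memcpy(dst + i * nt, src, nt * sizeof(int));

    if (rest > 0)
        memcpy(dst + full * nt, src, rest * sizeof(int));
}
```

With ns = 8 and nt = 3 this performs two full copies and then one partial copy of two elements. Whether copyVector's callers ever pass lengths that are not exact multiples would need checking against the sources.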

For the DUPLICATE_ATOMIC_VECTOR macro there is already a comment next to it:

    <FIXME>: surely memcpy would be faster here?

which seems to refer to the for loop:

    else { \
    int __i__; \
    type *__fp__ = fun(from), *__tp__ = fun(to); \
    for (__i__ = 0; __i__ < __n__; __i__++) \
      __tp__[__i__] = __fp__[__i__]; \
  } \

Could that loop be replaced by the following?

    else { \
    memcpy((char *)DATAPTR(to), (char *)DATAPTR(from), __n__*sizeof(type)); \
  } \

In the data.table package, dogroups.c uses this technique, so the principle is tested and works well so far.
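To illustrate the principle outside of R, here is a small standalone sketch of a type-parameterised copy macro built on memcpy. The name COPY_ELEMENTS and its argument list are hypothetical and only loosely mirror the shape of DUPLICATE_ATOMIC_VECTOR; it assumes the two buffers do not overlap and that both hold at least n elements of the given type.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical macro (not R's DUPLICATE_ATOMIC_VECTOR): copy n elements
   of the given type from 'from' to 'to' with a single memcpy.
   Assumes non-overlapping buffers; does nothing when n is zero. */
#define COPY_ELEMENTS(type, to, from, n) \
    do { \
        if ((n) > 0) \
            memcpy((to), (from), (size_t)(n) * sizeof(type)); \
    } while (0)
```

For plain int or double buffers this gives the same result as the element-by-element loop, since both copy the same bytes. Any speed difference would have to be measured.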

Are there any roadblocks preventing this change, or is anyone already working on it? If not, then I'll try to test it (on 32-bit Ubuntu) and submit a patch with timings, as before. Comments and pointers much appreciated.

Matthew



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Wed 21 Apr 2010 - 16:01:33 GMT

