[Rd] 1.6x speedup for requal() function (in R/src/main/unique.c)

From: Hervé Pagès <hpages_at_fhcrc.org>
Date: Thu, 01 Dec 2011 17:40:34 -0800


FWIW:

/* Taken from R/src/main/unique.c */
static int requal(SEXP x, int i, SEXP y, int j)
{
     if (i < 0 || j < 0) return 0;
     if (!ISNAN(REAL(x)[i]) && !ISNAN(REAL(y)[j]))
         return (REAL(x)[i] == REAL(y)[j]);
     else if (R_IsNA(REAL(x)[i]) && R_IsNA(REAL(y)[j])) return 1;
     else if (R_IsNaN(REAL(x)[i]) && R_IsNaN(REAL(y)[j])) return 1;
     else return 0;
}

/* Between 1.34x and 1.37x faster on my 64-bit Ubuntu laptop */
static int requal2(SEXP x, int i, SEXP y, int j)
{
     double xi, yj;

     if (i < 0 || j < 0) return 0;
     xi = REAL(x)[i];
     yj = REAL(y)[j];
     if (!ISNAN(xi) && !ISNAN(yj)) return xi == yj;
     if (R_IsNA(xi) && R_IsNA(yj)) return 1;
     if (R_IsNaN(xi) && R_IsNaN(yj)) return 1;
     return 0;
}

/* Another extra 1.18x speedup. So overall requal3() is about 1.6x
   faster than requal() for me. requal3() uses simpler logic than
   requal(), but this logic should be equivalent to the logic used
   by requal(), based on the following facts:

     (a) If *one* of xi or yj is a number (i.e. not NA or NaN),
         then xi and yj can be compared with xi == yj. They don't
         need to *both* be numbers for this comparison to be valid.
     (b) Otherwise (i.e. if each of them is not a number) then each
         of them is either NA or NaN (only 2 possible values for
         each), so comparing them with R_IsNA(xi) == R_IsNA(yj)
         should do the trick. */

static int requal3(SEXP x, int i, SEXP y, int j)
{
     double xi, yj;

     if (i < 0 || j < 0) return 0;
     xi = REAL(x)[i];
     yj = REAL(y)[j];
     if (!ISNAN(xi) || !ISNAN(yj)) return xi == yj;
     return R_IsNA(xi) == R_IsNA(yj);
}

The logic of the cequal() function (in the same file) could also be cleaned up in a similar way, probably for an even greater speedup.

This will benefit duplicated(), anyDuplicated() and unique() on numeric and complex vectors.
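
For anyone who wants to double-check facts (a) and (b) from the requal3() comment without building R, here is a minimal standalone sketch. It is not R code: make_nan(), low_word(), is_na(), is_nan_not_na(), eq1(), eq3() and count_mismatches() are hypothetical names of mine, and the two predicates only emulate R_IsNA() and R_IsNaN() under the assumption that NA is a NaN whose low 32 bits hold 1954 and that every other NaN is a plain NaN. eq1() mirrors the branching of requal() and eq3() mirrors requal3(), both on plain doubles.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Build a quiet NaN carrying the given low 32 bits (emulation helper). */
static double make_nan(uint32_t low)
{
    uint64_t bits = 0x7FF8000000000000ULL | low;
    double d;
    assert(sizeof d == sizeof bits);
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Extract the low 32 bits of a double's bit pattern. */
static uint32_t low_word(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return (uint32_t) (bits & 0xFFFFFFFFu);
}

/* Stand-ins for R_IsNA() and R_IsNaN() (assumption: NA payload is 1954). */
static int is_na(double d)         { return isnan(d) && low_word(d) == 1954; }
static int is_nan_not_na(double d) { return isnan(d) && low_word(d) != 1954; }

/* Same branching as requal(), applied to two plain doubles. */
static int eq1(double xi, double yj)
{
    if (!isnan(xi) && !isnan(yj)) return xi == yj;
    else if (is_na(xi) && is_na(yj)) return 1;
    else if (is_nan_not_na(xi) && is_nan_not_na(yj)) return 1;
    else return 0;
}

/* Same branching as requal3(), applied to two plain doubles. */
static int eq3(double xi, double yj)
{
    if (!isnan(xi) || !isnan(yj)) return xi == yj;
    return is_na(xi) == is_na(yj);
}

/* Count the pairs on which the two variants disagree. */
int count_mismatches(void)
{
    double vals[] = { 0.0, -0.0, 1.5, -2.0, INFINITY, -INFINITY,
                      make_nan(0) /* plain NaN */,
                      make_nan(1954) /* emulated NA */ };
    int n = (int) (sizeof vals / sizeof vals[0]);
    int i, j, mismatches = 0;

    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            if (eq1(vals[i], vals[j]) != eq3(vals[i], vals[j]))
                mismatches++;
    return mismatches;
}
```

Calling count_mismatches() from a small main() should return 0 if the two variants agree on every pair of test values. This only exercises a handful of representative values, so it is a sanity check rather than a proof.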


Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages_at_fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319

R-devel_at_r-project.org mailing list
Received on Fri 02 Dec 2011 - 01:47:46 GMT
