[Rd] Changing arguments inside .Call. Wise to encourage "const" on all arguments?

From: Paul Johnson <pauljohn32_at_gmail.com>
Date: Mon, 10 Dec 2012 00:51:10 -0600

I'm continuing my work on finding speedups in generalized inverse calculations in some simulations. It leads me back to .C and .Call, and some questions I've never been able to answer for myself. It may be I can push some calculations to LAPACK in or C BLAS, that's why I realized again I don't understand the call by reference or value semantics of .Call

Why aren't users of .Call encouraged to "const" their arguments, and why doesn't .Call do this for them (if we really believe in return by value)?

6qlink2">name="2466qlink2"> name="2467qlink2"><a name="2466qlink2">R Gentleman's R Programming for Bioinformatics is the most understandable treatment I've found on .Call. It appears to me .Call leaves "wiggle room" where there should be none. Here's Gentleman on p. 185. "For .Call and .External, the return value is an R object (the C functions must return a SEXP), and for these functions the values that were passed are typically not modified. If they must be modified, then making a copy in R, prior to invoking the C code, is necessary."

I *think* that means:

.Call allows return by reference, BUT we really wish users would not use it. Users can damage R data structures that are pointed to unless they really truly know what they are doing on the C side. ??

This seems dangerous. Why allow return by reference at all?

On p. 197, there's a similar comment "Any function that has been invoked by either .External or .Call will have all of its arguments protected already. You do not need to protect them. .... [T]hey were not duplicated and should be treated as read-only values."

"should be ... read-only" concerns me. They are "protected" in the garbage collector sense, but they are not protected from "return by reference" damage. Right?

6qlink5">name="2466qlink5"> name="2467qlink5"><a name="2466qlink5">Why doesn't the documentation recommend function writers to mark arguments to C functions as const? Isn't that what the return by value policy would suggest?

Here's a troublesome example in R src/main/array.c:

/* DropDims strips away redundant dimensioning information. */
/* If there is an appropriate dimnames attribute the correct */
/* element is extracted and attached to the vector as a names */
/* attribute. Note that this function mutates x. */
/* Duplication should occur before this is called. */

SEXP DropDims(SEXP x)

    SEXP dims, dimnames, newnames = R_NilValue;     int i, n, ndims;

   dims = getAttrib(x, R_DimSymbol);
[... SNIP ....]

    setAttrib(x, R_DimNamesSymbol, R_NilValue);
    setAttrib(x, R_DimSymbol, R_NilValue);
    setAttrib(x, R_NamesSymbol, newnames);
[... SNIP ....] return x;

Well, at least there's a warning comment with that one. But it does show .Call allows call by reference.

Why allow it, though? DropDims could copy x, modify the copy, and return it.

I wondered why DropDims bothers to return x at all. We seem to be using modify and return by reference there.

I also wondered why x is PROTECTED, actually. Its an argument, wasn't it automatically protected? Is it no longer protected after the function starts modifying it?

Here's an example with similar usage in Writing R Extensions, section 5.10.1 "Calling .Call". It protects the arguments a and b (needed ??), then changes them.

#include <R.h>
#include <Rdefines.h>

     SEXP convolve2(SEXP a, SEXP b)
         R_len_t i, j, na, nb, nab;
         double *xa, *xb, *xab;
         SEXP ab;

         PROTECT(a = AS_NUMERIC(a)); /* PJ wonders, doesn't this alter
"a"  in calling code*/
         PROTECT(b = AS_NUMERIC(b));
         na = LENGTH(a); nb = LENGTH(b); nab = na + nb - 1;
         PROTECT(ab = NEW_NUMERIC(nab));
         xa = NUMERIC_POINTER(a); xb = NUMERIC_POINTER(b);
         xab = NUMERIC_POINTER(ab);
         for(i = 0; i < nab; i++) xab[i] = 0.0;
         for(i = 0; i < na; i++)
              for(j = 0; j < nb; j++) xab[i + j] += xa[i] * xb[j];


        PROTECT(a = AS_NUMERIC(a));

have the alter the data structure "a" both inside the C function and in the calling R code as well? And, if a was PROTECTED automatically, could we do without that PROTECT()?


Paul E. Johnson
Professor, Political Science      Assoc. Director
1541 Lilac Lane, Room 504      Center for Research Methods
University of Kansas                 University of Kansas
http://pj.freefaculty.org               http://quant.ku.edu

R-devel_at_r-project.org mailing list
Received on Mon 10 Dec 2012 - 06:55:12 GMT

This quarter's messages: by month, or sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

All messages

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 10 Dec 2012 - 21:52:54 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel. Please read the posting guide before posting to the list.

list of date sections of archive