Re: [Rd] R-devel Digest, Vol 83, Issue 2

From: Romain Francois <romain.francois_at_dbmail.com>
Date: Sun, 03 Jan 2010 00:14:17 +0100

On 01/02/2010 11:41 PM, Romain Francois wrote:
>
> On 01/02/2010 11:12 PM, Duncan Murdoch wrote:
>>
>> On 02/01/2010 3:16 PM, Laurent Gautier wrote:
>>> On 1/2/10 8:53 PM, Duncan Murdoch wrote:
>>>> Simon Urbanek wrote:
>>>>> On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote:
>>>>>
>>>>>> On 1/2/10 5:56 PM, Duncan Murdoch wrote:
>>>>>>> On 02/01/2010 11:36 AM, Laurent Gautier wrote:
>>>>>>>> [Disclaimer: what is below reflects my understanding from reading
>>>>>>>> the
>>>>>>>> R source, others will correct where deemed necessary]
>>>>>>>>
>>>>>>>> On 1/2/10 12:00 PM, r-devel-request_at_r-project.org wrote:
>>>>>> (...)
>>>>>>
>>>>>>>>> I'd also be interested if there is some ideas on the relative
>>>>>>>>> efficiency
>>>>>>>>> of the preserve/release mechanism compared to PROTECT/UNPROTECT.
>>>>>>>> PROTECT/UNPROTECT is trading granularity for speed. It is a stack
>>>>>>>> with
>>>>>>>> only tow operations possible:
>>>>>>>> - push 1 object into the stack
>>>>>>>> - pull (unprotect) N last objects from the stack
>>>>>>> UNPROTECT_PTR is also possible, which does a linear search through
>>>>>>> the
>>>>>>> stack and unprotects something possibly deep within it. There is
>>>>>>> also
>>>>>>> REPROTECT which allows you to replace an entry within the stack.
>>>>>>>
>>>>>>> I would guess that UNPROTECT_PTR is more efficient than
>>>>>>> RecursiveRelease
>>>>>>> because it doesn't use so much stack space when it needs to go deep
>>>>>>> into
>>>>>>> the stack to release, but it is possible the compiler recognizes the
>>>>>>> tail recursion and RecursiveRelease is implemented efficiently. In
>>>>>>> that
>>>>>>> case it could be more efficient than UNPROTECT_PTR, which has to
>>>>>>> move
>>>>>>> all the other entries down to fill the newly vacated space. Really
>>>>>>> the
>>>>>>> only reliable way to answer efficiency questions like this is to try
>>>>>>> both ways and see which works better in your application.
>>>>>> Thanks. I did not know about UNPROTECT_PTR.
>>>>>> I had concerns over the stack usage, but so far it did not prove too
>>>>>> much of a problem. Still, why isn't the tail recursion explicitly
>>>>>> made an iteration then ? This would take the "may be the compiler
>>>>>> figures it out, may be not" variable out of the equation.
>>>>>>
>>>>> Careful - the protection stack (bookkeeping by R) has nothing to do
>>>>> with the C function call stack hence it has nothing to do with the
>>>>> compiler. The protection stack is global so usually you don't run out
>>>>> of it unless something goes horribly wrong (=infinite loop).
>>>> I think Laurent was referring to RecursiveRelease, which could use a
>>>> lot
>>>> of C stack if you choose to release something that is deep in the list,
>>>> since it checks the head, and if that doesn't match, calls itself again
>>>> on the rest of the list. (I checked, and at least one version of gcc
>>>> doesn't recognize the tail recursion: it really does generate a
>>>> recursive call.)
>>>>
>>>> Laurent asked why it isn't optimized to avoid the recursion: I think
>>>> the
>>>> answer is simply because it is so rarely used that nobody has bothered.
>>>
>>> Yes, I was referring to RecursiveRelease. Sorry if this was not clear.
>>>
>>> What are the chances for a patch to be accepted ? At first sight(*),
>>> making that tail recursion an iterative function is not a major
>>> undertaking, and reviewing the patch be fairly straightforward... but
>>> I can always use that time otherwise if the answer to the question is
>>> "nil".
>>
>> I don't think I would want to review such a patch (I don't know the
>> memory manager well, I don't know that there is really a case where it
>> matters enough to be worth doing), so I'd say if you don't get a message
>> from a core member volunteering to do so, you should assume it won't be
>> accepted. But that doesn't mean you shouldn't write the code for your
>> own internal use and edification, and if you can put together a demo
>> that shows it really makes a big difference in a realistic situation,
>> you might get a different response.
>>
>> Duncan Murdoch
>
> From what I understand, this has little to do with the memory manager,
> and resumes to the simple problem "how to remove an object from a list".
>
> Something like this, using the amazing inline and inspect packages:

>
> require( inline )
> require( inspect )
>
> remover <- cfunction(signature( list = "language", object =
> "environment" ), '
> if( !isNull( list ) ){
> SEXP x = list ;
> SEXP y ;
> while( CAR(x) != object && CADR(x) != R_NilValue ){
> y = x ;
> x = CDR(x) ;
> }
> if( CAR(x) == object ) SETCDR(y, CDR(x) ) ;
> }
> return list ;
> ', Rcpp=FALSE, verbose=FALSE )
>
> e <- new.env()
> call <- call( "foo", e, e, 1:10, 3 )
> call
> # inspect( call )
> result <- remover( call ,e )
> result
> # inspect( result )
>
> gives this :
>
> foo(10, <environment>, 0, <environment>, 1)
> @0x9f4e0d0 06 LANGSXP [NAM(2)]
> @0x9f4e204 01 SYMSXP [] "foo"
> @0xa907b78 14 REALSXP [NAM(2)] (len=1, tl=1763713056)
> @0x9f4d564 04 ENVSXP [NAM(1)]
> FRAME:
> @0x9e94a10 00 NILSXP [MARK,NAM(2)]
> ENCLOS:
> @0x9eb7b4c 04 ENVSXP [MARK,NAM(2),GP(0x8000)]
> HASHTAB:
> @0x9e94a10 00 NILSXP [MARK,NAM(2)]
> @0xa907b58 14 REALSXP [NAM(2)] (len=1, tl=1936941344)
> @0x9f4d564 04 ENVSXP [NAM(1)]
> FRAME:
> @0x9e94a10 00 NILSXP [MARK,NAM(2)]
> ENCLOS:
> @0x9eb7b4c 04 ENVSXP [MARK,NAM(2),GP(0x8000)]
> HASHTAB:
> @0x9e94a10 00 NILSXP [MARK,NAM(2)]
> @0xa907b38 14 REALSXP [NAM(2)] (len=1, tl=543908709)
> NULL
> ==================
> foo(10, 0, <environment>, 1)
> @0x9f4e0d0 06 LANGSXP [NAM(2)]
> @0x9f4e204 01 SYMSXP [] "foo"
> @0xa907b78 14 REALSXP [NAM(2)] (len=1, tl=1763713056)
> @0xa907b58 14 REALSXP [NAM(2)] (len=1, tl=1936941344)
> @0x9f4d564 04 ENVSXP [NAM(2)]
> FRAME:
> @0x9e94a10 00 NILSXP [MARK,NAM(2)]
> ENCLOS:
> @0x9eb7b4c 04 ENVSXP [MARK,NAM(2),GP(0x8000)]
> HASHTAB:
> @0x9e94a10 00 NILSXP [MARK,NAM(2)]
> @0xa907b38 14 REALSXP [NAM(2)] (len=1, tl=543908709)
> NULL
>
>
>
> so it boils down to this as a replacement to RecursiveRelease :
>
> /**
> * Removes the first instance of object with the list
> */
> static SEXP RemoveFromList( SEXP object, SEXP list){
> if( !isNull( list ) ){
> SEXP x = list ;
> SEXP y ;
> while( CAR(x) != object && CADR(x) != R_NilValue ){
> y = x ;
> x = CDR(x) ;
> }
> if( CAR(x) == object ) SETCDR(y, CDR(x) ) ;
> }
> return list ;
> }
>
>
> Romain

Going a bit further, attached is an R script that contains c code for both methods, generation of some "big" list and calling each method to remove the object in the worst case scenario (object to remove is at the end of the list) :

[romain_at_santorini tmp]$ Rscript release.R 10000 gcc -std=gnu99 -I/usr/local/lib/R/include -I/usr/local/include -fpic   -g -O2 -c release.c -o release.o
gcc -std=gnu99 -shared -L/usr/local/lib -o release.so release.o -L/usr/local/lib/R/lib -lR
gcc -std=gnu99 -I/usr/local/lib/R/include -I/usr/local/include -fpic   -g -O2 -c remove.c -o remove.o
gcc -std=gnu99 -shared -L/usr/local/lib -o remove.so remove.o -L/usr/local/lib/R/lib -lR

    user system elapsed

       0 0 0
    user system elapsed
   0.000 0.001 0.001

Then it starts to be "noticeable" :

[romain_at_santorini tmp]$ Rscript release.R 100000 gcc -std=gnu99 -I/usr/local/lib/R/include -I/usr/local/include -fpic   -g -O2 -c release.c -o release.o
gcc -std=gnu99 -shared -L/usr/local/lib -o release.so release.o -L/usr/local/lib/R/lib -lR
gcc -std=gnu99 -I/usr/local/lib/R/include -I/usr/local/include -fpic   -g -O2 -c remove.c -o remove.o
gcc -std=gnu99 -shared -L/usr/local/lib -o remove.so remove.o -L/usr/local/lib/R/lib -lR

    user system elapsed
   0.001 0.000 0.002
    user system elapsed
   0.007 0.004 0.013

then :

[romain_at_santorini tmp]$ Rscript release.R 500000 gcc -std=gnu99 -I/usr/local/lib/R/include -I/usr/local/include -fpic   -g -O2 -c release.c -o release.o
gcc -std=gnu99 -shared -L/usr/local/lib -o release.so release.o -L/usr/local/lib/R/lib -lR
gcc -std=gnu99 -I/usr/local/lib/R/include -I/usr/local/include -fpic   -g -O2 -c remove.c -o remove.o
gcc -std=gnu99 -shared -L/usr/local/lib -o remove.so remove.o -L/usr/local/lib/R/lib -lR

    user system elapsed
   0.008 0.000 0.008
Error: segfault from C stack overflow
Timing stopped at: 0.006 0.013 0.021
Execution halted

Romain

-- 
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://tr.im/IW9B : C++ exceptions at the R level
|- http://tr.im/IlMh : CPP package: exposing C++ objects
`- http://tr.im/HlX9 : new package : bibtex


______________________________________________ R-devel_at_r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Sat 02 Jan 2010 - 23:20:59 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sun 03 Jan 2010 - 00:30:10 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel. Please read the posting guide before posting to the list.

list of date sections of archive