Re: [Rd] memory misuse in subscript code when rep() is called in odd way

From: Seth Falcon <seth_at_userprimary.net>
Date: Tue, 03 Nov 2009 21:40:04 -0800

Hi,

On 11/3/09 2:28 PM, William Dunlap wrote:
> The following odd call to rep()
> gives somewhat random results:
>
>> rep(1:4, 1:8, each=2)

I've committed a fix for this to R-devel.

I admit that I had to reread the rep man page. At first I thought this was not a valid call to rep, since times (1:8) is longer than x (1:4), but a closer reading of the man page says:

   > If times is a vector of the same length as x (after replication
   > by each), the result consists of x[1] repeated times[1] times,
   > x[2] repeated times[2] times and so on.

So the expected result is the same as rep(rep(1:4, each=2), 1:8).
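To spell that reading out, here is a minimal standalone C sketch. It is not the code in seq.c, and the function name rep_each_times is made up for illustration. It treats element i of the each-expanded vector as x[i / each] and repeats it times[i] times. For x = 1:4, each = 2, times = 1:8 that gives 36 values: three 1s, seven 2s, eleven 3s and fifteen 4s.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not part of R: computes
 * rep(rep(x, each = each), times) for integer x, which is only
 * defined here when length(times) == nx * each.
 * Returns a malloc'd array (the caller frees it) and stores its
 * length in *nout. Returns NULL on mismatched lengths, a negative
 * count, or allocation failure. */
int *rep_each_times(const int *x, size_t nx, size_t each,
                    const int *times, size_t ntimes, size_t *nout)
{
    size_t total = 0, k = 0;
    int *out;

    *nout = 0;
    if (ntimes != nx * each)
        return NULL;

    /* The result length is sum(times). */
    for (size_t i = 0; i < ntimes; i++) {
        if (times[i] < 0)
            return NULL;
        total += (size_t) times[i];
    }

    out = malloc((total ? total : 1) * sizeof *out);
    if (out == NULL)
        return NULL;

    /* Element i of the each-expanded vector is x[i / each]. */
    for (size_t i = 0; i < ntimes; i++)
        for (int j = 0; j < times[i]; j++)
            out[k++] = x[i / each];

    assert(k == total);   /* every slot was written exactly once */
    *nout = total;
    return out;
}
```

Calling it with x = {1, 2, 3, 4}, nx = 4, each = 2 and times = {1, ..., 8} corresponds to the call in the report.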

> valgrind says that the C code is using uninitialized data:
>> rep(1:4, 1:8, each=2)
> ==26459== Conditional jump or move depends on uninitialised value(s)
> ==26459== at 0x80C557D: integerSubscript (subscript.c:408)
> ==26459== by 0x80C5EDC: Rf_vectorSubscript (subscript.c:658)

A little investigation suggests that the problem originates earlier, before the subscript code is reached. Debugging in do_rep (seq.c), I see the following:

 > rep(1:4, 1:8, each=2)

Breakpoint 1, do_rep (call=0x102de0068, op=<value temporarily unavailable, due to optimizations>, args=<value temporarily unavailable, due to optimizations>, rho=0x1018829f0)
    at /Users/seth/src/R-devel-all/src/main/seq.c:434
434         ans = do_subset_dflt(R_NilValue, R_NilValue, list2(x, ind), rho);
(gdb) p Rf_PrintValue(ind)

 [1]          1          1          1          2          2          2
 [7]          2          2          2          2          3          3
[13]          3          3          3          3          3          3
[19]          3          3          3          4          4          4
[25]          4          4          4          4          4          4
[31]          4          4          4          4          4          4
[37]   44129344          1   44129560          1   44129776          1
[43]   44129992          1   44099592          1   44099808          1
[49]   44100024          1   44100456          1    2724144    3801089
[55] -536870733          0   54857992          1   22275728          1
[61]    2724144          1         34          1   44100744          1
[67]   44100960          1   44101176          1   43652616          1
$2 = void
(gdb) c
Continuing.
Error: only 0's may be mixed with negative subscripts

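Only the first 36 elements of ind are filled in (36 = sum(1:8), the expected result length); the rest is uninitialized memory, which accounts for both the valgrind report and the random results. The patch I applied adjusts how the index vector length is computed when times has length greater than one.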
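For illustration only (the actual change to seq.c is not reproduced in this message), here is a sketch of sizing an index vector as sum(times) and filling it so that no slot is left unwritten. The function names are hypothetical, and the code assumes each >= 1 and one entry in times per element of the each-expanded x.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of 1-based indices needed for
 * rep(x, times, each = each) when length(times) == nx * each.
 * Returns 0 if the lengths do not match or a count is negative
 * (a real implementation would signal an error instead). */
size_t rep_index_length(size_t nx, size_t each,
                        const int *times, size_t ntimes)
{
    size_t total = 0;

    if (ntimes != nx * each)
        return 0;
    for (size_t i = 0; i < ntimes; i++) {
        if (times[i] < 0)
            return 0;
        total += (size_t) times[i];
    }
    return total;
}

/* Fills ind[0..len-1] with 1-based indices into x, where len is the
 * value returned by rep_index_length(). Assumes each >= 1.
 * Returns the number of slots actually written. */
size_t rep_fill_index(int *ind, size_t len, size_t each,
                      const int *times, size_t ntimes)
{
    size_t k = 0;

    for (size_t i = 0; i < ntimes; i++)
        for (int j = 0; j < times[i]; j++) {
            assert(k < len);                  /* never write past the buffer */
            ind[k++] = (int) (i / each) + 1;  /* 1-based index into x */
        }
    return k;
}
```

A caller would allocate rep_index_length() ints, fill them, and check that the count returned by rep_fill_index() equals the allocated length. If the two differ, some of the buffer is either overrun or left uninitialized.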

+ seth



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Wed 04 Nov 2009 - 05:48:09 GMT
