Re: [Rd] Change in the RNG implementation?

From: Duncan Murdoch <murdoch.duncan_at_gmail.com>
Date: Fri, 19 Oct 2012 19:26:39 -0400

On 12-10-19 7:04 PM, Hervé Pagès wrote:
> Hi,
>
> Looks like the implementation of random number generation changed in
> R-devel with respect to R-2.15.1.
>
> With R-2.15.1:
>
> > set.seed(33)
> > sample(49821115, 10)
> [1] 22217252 19661919 24099911 45779422 42043111 25774933 21778053 17098516
> [9]   773073  5878451
>
> With recent R-devel:
>
> > set.seed(33)
> > sample(49821115, 10)
> [1] 22217252 19661919 24099912 45779425 42043115 25774935 21778056 17098518
> [9]   773073  5878452
>
> This is on a 64-bit Ubuntu system.
>
> Is this change intended? I didn't see anything in the NEWS file.
>
> A potential problem with this is that it will break unit tests
> for algorithms that make use of RNG.
>
> Another more practical problem (at least for me) is the following:
> Bioconductor package maintainers are sometimes working hard on the
> development version of their package to improve the performance of
> some key functions. Comparing performance between BioC release
> (based on R-2.15) and devel (based on R-devel) often requires big
> input data that is randomly generated, because it's easier than
> working with real data. Typically a small script is written that
> takes care of loading the required packages, generating the input
> data, and running a simple analysis. The same script is sourced in
> R-2.15 and R-devel, and performance and results are compared.
>
> Not being able to generate exactly the same input in the script is
> a problem. It can be worked around by generating the input once,
> serializing it, and using load() in the script, but that makes things
> more complicated and the script is not a standalone script anymore
> (cannot be passed around without also passing around the big .rda
> file).
>
> Thanks,
> H.
>
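
For reference, the serialization workaround described above can be sketched roughly as follows. This is only an illustration: cached_input() and the file name are hypothetical, and saveRDS()/readRDS() are used here in place of the save()/load() pair mentioned in the message. It assumes the serialized file is readable by both R versions being compared.

```r
# Hypothetical helper: return the cached input if the file exists,
# otherwise generate it once and cache it for later runs.
cached_input <- function(path, generate) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  x <- generate()
  saveRDS(x, path)
  x
}

# Illustrative location only; a real script would use a shared path.
path <- file.path(tempdir(), "bench_input.rds")

# First call generates and stores the data.
input <- cached_input(path, function() {
  set.seed(33)
  sample(49821115, 10)
})

# Second call reads the stored copy; the generator is never invoked.
input_again <- cached_input(path, function() stop("generator should not run"))
```

The cost, as noted in the message, is that the cached file has to travel with the script.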

I think it was mentioned in the NEWS:

  \code{sample.int()} has some support for \eqn{n \ge 2^{31}}{n >= 2^31}: see its help for the limitations.

  A different algorithm is used for \code{(n, size, replace = FALSE, prob = NULL)} for \code{n > 1e7} and \code{size <= n/2}. This is much faster and uses less memory, but does give different results.

I don't think the old algorithm is available, but perhaps it could be made available by an optional parameter.
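
Until something like that exists, one possible way to keep a script standalone is to avoid sample() for the large case and draw the indices from runif() directly. The sketch below is not the old algorithm and not what sample.int() does internally; stable_sample() is a hypothetical helper, and it assumes that runif() yields the same stream for a given seed and RNG kind in both R versions, which should be checked before relying on it.

```r
# Hypothetical helper: sample 'size' distinct integers from 1..n using only
# runif(), so the result depends on the uniform stream rather than on the
# internals of sample(). Intended for size much smaller than n.
stable_sample <- function(n, size) {
  stopifnot(size <= n)
  out <- numeric(0)
  while (length(out) < size) {
    # Draw only as many candidates as are still missing.
    draw <- floor(n * runif(size - length(out))) + 1
    # unique() keeps first occurrences, so duplicates are simply redrawn.
    out <- unique(c(out, draw))
  }
  out
}

set.seed(33)
x <- stable_sample(49821115, 10)

set.seed(33)
y <- stable_sample(49821115, 10)
```

This is simple rejection of duplicates, so it gets slow when size approaches n, and its output will not match what sample() returns in either version.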

Duncan Murdoch



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Fri 19 Oct 2012 - 23:31:10 GMT
