From: Thomas Lumley <tlumley_at_uw.edu>

Date: Mon, 20 Feb 2012 08:04:28 +1300

On Sat, Feb 18, 2012 at 4:33 PM, Paul Johnson <pauljohn32_at_gmail.com> wrote:

> On Fri, Feb 17, 2012 at 5:06 PM, Petr Savicky <savicky@cs.cas.cz> wrote:

>> On Fri, Feb 17, 2012 at 02:57:26PM -0600, Paul Johnson wrote:
>> Hi.
>>
>> Some of the random number generators allow as a seed a vector,
>> not only a single number. This can simplify generating the seeds.
>> There can be one seed for each of the 1000 runs and then,
>> the rows of the seed matrix can be
>>
>> c(seed1, 1), c(seed1, 2), ...
>> c(seed2, 1), c(seed2, 2), ...
>> c(seed3, 1), c(seed3, 2), ...
>> ...
>>
> Yes, I understand.
>
> The seed things I'm using are the 6 integer values from the L'Ecuyer
> generator. If you run the example script, the verbose option causes
> some to print out. The first 3 seeds in a saved project seeds file
> look like:
>
> > projSeeds[[1]]
> [[1]]
> [1] 407 376488316 1939487821 1433925148 -1040698333 579503880
> [7] -624878918
>
> [[2]]
> [1] 407 -1107332181 854177397 1773099324 1774170776 -266687360
> [7] 816955059
>
> [[3]]
> [1] 407 936506900 -1924631332 -1380363206 2109234517 1585239833
> [7] -1559304513
>
> The 407 in the first position is an integer R uses to note the type of
> stream for which the seed is intended, in this case L'Ecuyer.
>
>> There could be even only one seed and the matrix can be generated as
>>
>> c(seed, 1, 1), c(seed, 1, 2), ...
>> c(seed, 2, 1), c(seed, 2, 2), ...
>> c(seed, 3, 1), c(seed, 3, 2), ...
>>
>> If the initialization using the vector c(seed, i, j) is done
>> with a good quality hash function, the runs will be independent.
>>
> I don't have any formal proof that a "good quality hash function"
> would truly create seeds from which independent streams will be drawn.

That is essentially the *definition* of a good quality hash function, at least in the cryptographic sense. It maps the inputs into numbers that are indistinguishable from uniform random except that the same input always gives the same output.

What's harder is to prove that you *have* a good quality hash function, but for these (non-adversarial) purposes even something like MD4 would be fine, and certainly the SHA family.
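
As a rough illustration of the c(seed, i, j) idea, here is a minimal sketch in Python (used only because its standard library ships SHA-256; nothing here is R's own seeding mechanism). The names derive_seed and run_stream are hypothetical, and truncating the digest to 64 bits is an assumption made for the example, not something prescribed in this thread.

```python
import hashlib
import random
import struct


def derive_seed(seed: int, i: int, j: int) -> int:
    """Hash the triple (seed, i, j) to a 64-bit integer (hypothetical helper)."""
    # Fixed-width big-endian packing means distinct triples always give
    # distinct byte strings, so no two (i, j) pairs collide before hashing.
    key = struct.pack(">qqq", seed, i, j)
    digest = hashlib.sha256(key).digest()
    # Assumption: the first 8 bytes of the digest are enough for a seed.
    return int.from_bytes(digest[:8], "big")


def run_stream(seed: int, i: int, j: int) -> random.Random:
    """Return a generator for run i, stream j, seeded from the hash."""
    return random.Random(derive_seed(seed, i, j))


if __name__ == "__main__":
    master = 12345  # arbitrary master seed, chosen for illustration
    a = run_stream(master, 1, 1)
    b = run_stream(master, 1, 2)
    # Same triple -> same stream; different triple -> unrelated-looking stream.
    print(a.random(), b.random())
```

The same input always reproduces the same stream, so a run can be restarted from (seed, i, j) alone without storing a seed matrix.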

-thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Sun 19 Feb 2012 - 19:06:17 GMT
