From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>

Date: Fri, 15 Aug 2008 16:40:39 +0100 (BST)

Date: Fri, 15 Aug 2008 16:40:39 +0100 (BST)

On Fri, 15 Aug 2008, Shengqiao Li wrote:

*>
**> Professor Ripley,
**>
**> Thank you for your solution. So the last paragraph of the Note in RNG help
**> page will be updated since Wichmann-Hill is different from other supplied
**> uniform generators in the number of distinct values?
*

Please read and follow the R posting guide, and check the current version *before* posting as you were asked to do.

*> Shengqiao Li
**>
**>
**> On Fri, 15 Aug 2008, Prof Brian Ripley wrote:
**>
*

>> Remember Wichmann-Hill is a composite generator. Its composition does take

*>> more than 2^32 distinct values.
**>>
**>> You still haven't identifed a problem here. The note is to warn that
**>> runif() does repeat within a cycle, because people wrote code assuming
**>> otherwise. It would be a poor use of runif() to rely on the low-order
**>> bits, and that's standard advice in the field.
**>>
**>> For a large sample of uniforms use something like the normal inversion
**>> does, e.g. 2^(-30) * (runif(N, 0, 2^30) %% 2^30 + runif(N))
**>>
**>> Please do leave R-bugs out of this: we already have 4 entries as a result
**>> of your misunderstandings and false claims.
**>>
**>> On Thu, 14 Aug 2008, shli_at_stat.wvu.edu wrote:
**>>
**>>> This message is in MIME format. The first part should be readable text,
**>>> while the remaining parts are likely unreadable without MIME-aware tools.
**>>>
**>>> ---559023410-851401618-1218751024=:15885
**>>> Content-Type: TEXT/PLAIN; charset=ISO-8859-1; format=flowed
**>>> Content-Transfer-Encoding: QUOTED-PRINTABLE
**>>>
**>>>
**>>> I didn't describe the problem clearly. It's about the number of
**>>> distinct=20
**>>> values. So just ignore cycle issue.
**>>>
**>>> My tests were:
**>>>
**>>> RNGkind(kind=3D"Knuth-TAOCP");
**>>> sum(duplicated(runif(1e7))); #return 46552
**>>>
**>>> RNGkind(kind=3D"Knuth-TAOCP-2002");
**>>> sum(duplicated(runif(1e7))); #return 46415
**>>>
**>>> #These collision frequency suggested there were 2^30 distinct values by=20
**>>> birthday problem.
**>>>
**>>>
**>>> RNGkind(kind=3D"Marsaglia-Multicarry");
**>>> sum(duplicated(runif(1e7))); #return 11682
**>>>
**>>> RNGkind(kind=3D"Super-Duper");
**>>> sum(duplicated(runif(1e7))); #return 11542
**>>>
**>>> RNGkind(kind=3D"Mersenne-Twister");
**>>> sum(duplicated(runif(1e7))); #return 11656
**>>>
**>>> #These indicated there were 2^32 distinct values, which agrees with the=20
**>>> help info.
**>>>
**>>> RNGkind(kind=3D"Wichmann-Hill");
**>>> sum(duplicated(runif(1e7))); #return 0
**>>>
**>>> #So for this method, there should be more than 2^32 distinct values.
**>>>
**>>> You may not get the exact numbers, but they should be close. So how to=20
**>>> explain above problem?
**>>>
**>>> I need generate a large sample without any ties, it seems to me=20
**>>> "Wichmann-Hill" is only choice right now.
**>>>
**>>> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
**>>> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
**>>> Shengqiao Li
**>>>
**>>> The Department of Statistics
**>>> PO Box 6330
**>>> West Virginia University
**>>> Morgantown, WV 26506-6330
**>>> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
**>>> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
**>>>
**>>> On Thu, 14 Aug 2008, Peter Dalgaard wrote:
**>>>
**>>>> Shengqiao Li wrote:
**>>>>> Hello all,
**>>>>> =20
**>>>>> I am generating large samples of random numbers. The RNG help page
**>>>>> says:=
**>>> =20
**>>>>> "All the supplied uniform generators return 32-bit integer values that
**>>>>> a=
**>>> re=20
**>>>>> converted to doubles, so they take at most 2^32 distinct values and
**>>>>> long=
**>>> =20
**>>>>> runs will return duplicated values." But I find that the cycles are not
**>>>>> =
**>>> the=20
**>>>>> same as the 32-bit integer.
**>>>>> =20
**>>>>> My test indicated that the cycles for Knuth's methods were 2^30 while=20
**>>>>> Wichmann-Hill's cycle was larger than 2^32! No numbers were duplicated
**>>>>> i=
**>>> n=20
**>>>>> 10M numbers generated by runif using Wichmann-Hill. The other three
**>>>>> meth=
**>>> ods=20
**>>>>> had cycle length of 2^32.
**>>>>> =20
**>>>>> So, anybody can explain this? And any improvement to the implementation
**>>>>> =
**>>> can=20
**>>>>> be made to increase the cycle length like the Wichmann-Hill method?
**>>>>> =20
**>>>> What test? These are not simple linear congruential generators. Just
**>>>> beca=
**>>> use=20
**>>>> you get the same value twice, it doesn't mean that the sequence is
**>>>> repeat=
**>>> ing.=20
**>>>> Perhaps you should read the entire help page rather than just the note.
**>>>>
**>>>> --=20
**>>>> O__ ---- Peter Dalgaard =D8ster Farimagsgade 5, Entr.B
**>>>> c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
**>>>> (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
**>>>> ~~~~~~~~~~ - (p.dalgaard_at_biostat.ku.dk) FAX: (+45) 35327907
**>>>>
**>>>>
**>>> ---559023410-851401618-1218751024=:15885--
**>>>
**>>> ______________________________________________
**>>> R-devel_at_r-project.org mailing list
**>>> https://stat.ethz.ch/mailman/listinfo/r-devel
**>>>
**>>
**>> --
**>> Brian D. Ripley, ripley_at_stats.ox.ac.uk
**>> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
**>> University of Oxford, Tel: +44 1865 272861 (self)
**>> 1 South Parks Road, +44 1865 272866 (PA)
**>> Oxford OX1 3TG, UK Fax: +44 1865 272595
**>>
**>
*

-- Brian D. Ripley, ripley_at_stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 ______________________________________________ R-devel_at_r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-develReceived on Fri 15 Aug 2008 - 18:58:32 GMT

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.2.0, at Fri 15 Aug 2008 - 20:38:11 GMT.

*
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel.
Please read the posting
guide before posting to the list.
*