From: Duncan Murdoch <murdoch_at_stats.uwo.ca>

Date: Tue 19 Sep 2006 - 10:45:11 GMT

> sample(t, 1) is a sample from 1:t, not 0:t.
>
> You need
>
> sample(t+1, 1, replace=TRUE) - 1
>
> which works in all cases up to INT_MAX-1, and beyond that you need to
> worry about the resolution of the RNG (and to use floor not as.integer).
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Tue Sep 19 20:56:23 2006


On 9/19/2006 4:41 AM, Prof Brian Ripley wrote:

> On Tue, 19 Sep 2006, Sean O'Riordain wrote:


>> Hi Duncan,
>>
>> Thanks for that.  In the light of what you've suggested, I'm now using
>> the following:
>>
>>   # generate a random integer from 0 to t (inclusive)
>>   if (t < 10000000) { # to avoid memory problems...
>>     M <- sample(t, 1)
>>   } else {
>>     while (M > t) {
>>       M <- as.integer(urand(1,min=0, max=t+1-.Machine$double.eps))
>>     }
>>   }

> sample(t, 1) is a sample from 1:t, not 0:t.

I wonder if it would be a worthwhile optimization to treat replace as TRUE whenever size=1 is requested.

- It would be a very cheap test in the C code, and would make a large difference to the size=1 run time when n is very large.
- On the other hand, using size=1 is usually not a very efficient way to program anything, so anyone who does it might not notice the gain...
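To make the trade-off above concrete, here is a small sketch contrasting one `sample()` call per draw with a single vectorised call. The values of `t_max` and `n_draws` are illustrative assumptions, and no timing results are claimed.

```r
t_max <- 1e8      # illustrative upper bound, on the "hundreds of millions" scale
n_draws <- 1000   # illustrative number of draws

set.seed(42)
# One call per draw. replace = TRUE is passed explicitly; for size = 1 it
# cannot change the distribution of the result, which is why the message
# suggests treating it as implied.
one_at_a_time <- vapply(
  seq_len(n_draws),
  function(i) sample(t_max + 1, 1, replace = TRUE) - 1,
  numeric(1)
)

# A single vectorised call producing all draws at once.
all_at_once <- sample(t_max + 1, n_draws, replace = TRUE) - 1
```

Wrapping each expression in `system.time()` is one way to compare the two patterns on a given machine and R version.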

Duncan Murdoch


> There is no such thing as urand in base R ....


>> cheers and Thanks,
>> Sean
>>
>> On 18/09/06, Duncan Murdoch <murdoch@stats.uwo.ca> wrote:
>>> On 9/18/2006 3:37 AM, Sean O'Riordain wrote:
>>>> Good morning,
>>>>
>>>> I'm trying to concisely generate a single integer from 0 to n
>>>> inclusive, where n might be of the order of hundreds of millions.
>>>> This will however be used many times during the general procedure, so
>>>> it must be "reasonably efficient" in both memory and time... (at some
>>>> later stage in the development I hope to go vectorized)
>>>>
>>>> The examples I've found through searching RSiteSearch() relating to
>>>> generating random integers say to use : sample(0:n, 1)
>>>> However, when n is "large" this first generates a large sequence 0:n
>>>> before taking a sample of one... this computer doesn't have the memory
>>>> for that!
>>> You don't need to give the whole vector: just give n, and you'll get
>>> draws from 1:n. The man page is clear on this.
>>>
>>> So what you want is sample(n+1, 1) - 1. (Use "replace=TRUE" if you want
>>> a sample bigger than 1, or you'll get sampling without replacement.)
>>>> When I look at the documentation for runif(n, min, max) it states that
>>>> the generated numbers will be min <= x <= max. Note the "<= max"...
>>> Actually it says that's the range for the uniform density. It's silent
>>> on the range of the output. But it's good defensive programming to
>>> assume that it's possible to get the endpoints.
>>>
>>>> How do I generate an x such that the probability of being (the
>>>> integer) max is the same as any other integer from min (an integer) to
>>>> max-1 (an integer) inclusive... My attempt is:
>>>>
>>>> urand.int <- function(n,t) {
>>>>   as.integer(runif(n,min=0, max=t+1-.Machine$double.eps))
>>>> }
>>>> # where I've included the parameter n to help testing...
>>> Because of rounding error, t+1-.Machine$double.eps might be exactly
>>> equal to t+1. I'd suggest using a rejection method if you need to use
>>> this approach: but sample() is better in the cases where as.integer()
>>> will work.
>>>
>>> Duncan Murdoch
>>>> is floor() "better" than as.integer?
>>>>
>>>> Is this correct? Is the probability of the integer t the same as the
>>>> integer 1 or 0 etc... I have done some rudimentary testing and this
>>>> appears to work, but power being what it is, I can't see how to
>>>> realistically test this hypothesis.
>>>>
>>>> Or is there a better way of doing this?
>>>>
>>>> I'm trying to implement an algorithm which samples into an array,
>>>> hence the need for an integer - and yes I know about sample() thanks!
>>>> :-)
>>>>
>>>> { incidentally, I was surprised to note that the maximum value
>>>> returned by summary(integer_vector) is "pretty" and appears to be
>>>> rounded up to a "nice round number", and is not necessarily the same
>>>> as max(integer_vector) where the value is large, i.e. of the order of
>>>> say 50 million }
>>>>
>>>> Is version etc relevant? (I'll want to be portable)
>>>> > version
>>>>                _
>>>> platform       i386-pc-mingw32
>>>> arch           i386
>>>> os             mingw32
>>>> system         i386, mingw32
>>>> status
>>>> major          2
>>>> minor          3.1
>>>> year           2006
>>>> month          06
>>>> day            01
>>>> svn rev        38247
>>>> language       R
>>>> version.string Version 2.3.1 (2006-06-01)
>>>>
>>>> Many thanks in advance for your help.
>>>> Sean O'Riordain
>>>> affiliation <- NULL
>>>>
>>>> ______________________________________________
>>>> R-help@stat.math.ethz.ch mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>> ______________________________________________
>> R-help@stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
______________________________________________
R-help@stat.math.ethz.ch mailing list
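The rejection approach mentioned in the quoted thread could look roughly like the sketch below. It assumes t is small enough that the resolution of runif() is not a concern, and the helper name `rint_reject` is hypothetical. The first lines check the rounding issue raised in the thread for an illustrative value of `t_max`.

```r
# For a value of this size, subtracting .Machine$double.eps has no effect in
# double precision, so the "max = t + 1 - eps" trick does not exclude t + 1.
t_max <- 1e8
eps_has_no_effect <- (t_max + 1 - .Machine$double.eps) == (t_max + 1)

# Hypothetical helper: n draws uniform on 0:t using floor(runif()) and
# redrawing any value that lands above t.
rint_reject <- function(n, t) {
  out <- floor(runif(n, min = 0, max = t + 1))
  bad <- out > t   # defensive: only possible if runif() returns its upper endpoint
  while (any(bad)) {
    out[bad] <- floor(runif(sum(bad), min = 0, max = t + 1))
    bad <- out > t
  }
  out
}

set.seed(7)
y <- rint_reject(1000, 5)
```

floor() is used rather than as.integer() so the result stays a double and is not limited to the integer range, in line with the advice quoted earlier in the thread.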


Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.1.8, at Tue 19 Sep 2006 - 11:30:05 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.