Re: [R] Question about random sampling in R

From: Marc Schwartz
Date: Thu 19 Oct 2006 - 18:10:20 GMT

On Thu, 2006-10-19 at 12:07 -0500, tom soyer wrote:
> Hi,
> I looked up the help file on sample(), but didn't find the info I was
> looking for.
> When sample() is used to resample from a distribution, e.g., bootstrap, how
> does it do it? Does it use a uniform distribution, e.g., runif(), or
> something else? And, when the help file says: "sample(x) generates a random
> permutation of the elements of x (or 1:x)", would I be correct if I
> translate the statement as follows: it means that the order of the
> sequence, which was generated from a uniform distribution, would look like a
> random normal distribution?

> Thanks,
> Tom

In the simplest case, where you have not specified a set of probability weights, sample() uses a uniform distribution, such that each element has an equal probability of being selected.
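As a rough empirical check of the equal-probability claim, you could draw a single element many times and tabulate the results. This is only a sketch: the seed, the population 1:5 and the number of repetitions are arbitrary choices made for illustration.

```r
# Illustrative check only; the seed, population size and repetition count
# are arbitrary assumptions.
set.seed(42)
draws <- replicate(20000, sample(5, 1))   # one element of 1:5 per repetition
freq  <- table(draws) / length(draws)     # observed proportion of each value
print(round(freq, 3))                     # each should be close to 1/5
```

With no weights given, the five proportions should all land near 0.2, differing only by simulation noise.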

In the case of sampling WITHOUT replacement (the default), each element remaining in the vector has an equal probability of being selected. Once selected, that element is removed from the sampling space and the process is repeated with the remaining elements until the requested number of elements (by default, all of them) has been selected.


> sample(10)

 [1]  3  8  5  9  7  1  4  2 10  6

yields a random permutation of 1:10.
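The select-and-remove procedure described above can be sketched by hand with runif(). Note that manual_permutation() is a hypothetical helper written only for this illustration; it is not how sample() is implemented internally, and it will not reproduce the same numbers as sample() for a given seed.

```r
# Hypothetical helper for illustration only; NOT the internal algorithm
# used by sample().
manual_permutation <- function(x) {
  out <- x                                  # same type and length as the input
  for (i in seq_along(out)) {
    j <- floor(runif(1) * length(x)) + 1    # uniform index into what remains
    out[i] <- x[j]                          # record the selected element
    x <- x[-j]                              # remove it from the sampling space
  }
  out
}

set.seed(1)                                 # arbitrary seed
p <- manual_permutation(1:10)
print(p)
print(identical(sort(p), 1:10))             # every element appears exactly once
```

Because each selected element is removed before the next draw, the result contains every value of 1:10 exactly once, which is what makes it a permutation.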

In the case of 'replace = TRUE', which is sampling WITH replacement, an element is retained in the sampling space after it has been selected, and thus can be selected multiple times.


> sample(10, replace = TRUE)

 [1] 1 4 1 8 7 8 6 7 5 9
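To connect this back to runif(): conceptually, sampling with replacement from 1:n behaves like turning independent uniform draws into integer indices. The sketch below shows the idea only. I am not claiming that the second approach is what sample() does internally or that the two produce the same numbers; the seed and n are arbitrary.

```r
set.seed(123)                               # arbitrary seed
n <- 10

with_repl <- sample(n, replace = TRUE)      # the usual way
print(with_repl)
print(anyDuplicated(with_repl) > 0)         # repeats are likely, not guaranteed

# Conceptual equivalent (an assumption for illustration): map uniform
# draws on (0, 1) onto the indices 1..n.
idx <- floor(runif(n) * n) + 1
print(idx)
```

In both cases every draw is made from the full set 1:10, so the same value can appear more than once.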

If you specify a set of probability weights for the elements of the sampling vector (the 'prob' argument), then the probability of each element being selected is adjusted accordingly.
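A small sketch of weighted sampling using the 'prob' argument follows. The labels, weights, seed and sample size are made up for illustration.

```r
set.seed(2006)                              # arbitrary seed
items <- c("a", "b", "c")                   # made-up labels
w     <- c(0.7, 0.2, 0.1)                   # made-up weights, one per element

draws <- sample(items, size = 20000, replace = TRUE, prob = w)
prop  <- table(factor(draws, levels = items)) / length(draws)
print(round(prop, 3))                       # should be close to the weights
```

The observed proportions should track the weights rather than being equal, up to simulation noise.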

In the case of bootstrapping, sampling WITH replacement is used. You might find the following post helpful in this scenario:
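As a small illustration of the bootstrap case, here is a sketch that estimates the standard error of a mean by resampling with replacement. The data are simulated, and the seed, sample size and number of resamples B are arbitrary assumptions.

```r
set.seed(99)                                # arbitrary seed
x <- rnorm(50, mean = 10, sd = 2)           # made-up data for illustration
B <- 2000                                   # number of bootstrap resamples

# Each resample has the same length as x and is drawn WITH replacement.
boot_means <- replicate(B, mean(sample(x, replace = TRUE)))

boot_se <- sd(boot_means)                   # bootstrap standard error of the mean
print(boot_se)
print(sd(x) / sqrt(length(x)))              # textbook estimate, for comparison
```

The two printed values should be of similar size. One thing to watch for: if x happens to be a single number of 1 or more, sample(x) samples from 1:x rather than from x itself, as the help file notes.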

If you want to investigate further, you can review the C source code for the relevant R functions in random.c in the R source tarball. The file will be in ../src/main.

HTH,

Marc Schwartz

Received on Fri Oct 20 04:21:07 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.