From: tom soyer <tom.soyer_at_gmail.com>

Date: Fri 20 Oct 2006 - 14:32:38 GMT

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Sat Oct 21 08:43:41 2006


Oh yes, you are right. It seems that the default distribution used for sampling is uniform. And whether the resampled values look like they came from a particular distribution depends on the values being sampled.

Thanks everyone for your help! I appreciate your support very much.

Tom

On 10/19/06, Ted Harding <Ted.Harding@nessie.mcc.ac.uk> wrote:


> On 19-Oct-06 tom soyer wrote:

> > Hi,
> >
> > I looked up the help file on sample(), but didn't find the
> > info I was looking for.
> >
> > When sample() is used to resample from a distribution, e.g.,
> > bootstrap, how does it do it? Does it use a uniform distribution,
> > e.g., runif(), or something else?
>
> I don't know the details of the algorithm, but since sample()
> has flexible options it may be helpful to describe the effect
> of sample() in different cases.
>
> 1. sample(x,r) where x is a vector of length n
>    In effect, the index values (1:n) of x are sampled from
>    without replacement (default) with a uniform probability
>    distribution over the available elements at all stages.
>    Hence, i1 is sampled from (1:n) with probability 1/n for
>    each possibility. Then i2 is sampled from the remainder
>    with probability 1/(n-1) for each, and so on until r items
>    (all distinct) have been sampled. If the resulting indices
>    are {i1,i2,...,ir} then the result is x[i1],x[i2],...,x[ir].
>    Thus, if some of the values in x[1],...,x[n] are equal,
>    you can get 2 or more items in the sample which are equal
>    even though the sampling is done without replacement (since
>    it is the indices which are sampled).
>    [NB I'm describing the *effect* here, not saying that this
>    is how the algorithm operates]
>
> 2. sample(x, replace=TRUE)
>    Similar to [1], except that the sampled index is returned
>    to the pool and is available to be sampled again, so at each
>    stage the probability of any value being chosen is 1/n.
>
> 3. sample(x, replace=TRUE, prob=p) where p is a vector of
>    probability weights (which must not all be 0, and none
>    negative).
>    First, p is converted into a probability distribution
>    (summing to 1) (in effect by dividing by the sum).
>    Then an index i1 is sampled from (1:n) with probability
>    p[i] that i is chosen. This is repeated (with previously
>    sampled i's still available) until r index values have been
>    sampled -- i1,...,ir. The result is x[i1],...,x[ir].
>
> 4. sample(x, prob=p) [without replacement]
>    First p is scaled to sum to 1, then i1 is sampled as in [3].
>    The remaining p-values are rescaled so as to sum to 1,
>    and i2 is sampled from the remaining i's; and so on.
>
> These are the essential variants of the use of sample().
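
The four cases above can be sketched in R roughly as follows. This only emulates the *effect* described, one draw at a time; it is not how sample() is implemented. The helper name sample_effect and its arguments are made up for illustration, and equal weights are assumed when p is not given.

```r
# Hypothetical helper: emulates the effect of sample() WITHOUT replacement.
# With equal weights this is case 1; with unequal weights it is case 4.
sample_effect <- function(x, r = length(x), p = rep(1, length(x))) {
  stopifnot(length(p) == length(x), all(p >= 0), r <= sum(p > 0))
  pool   <- seq_along(x)   # indices still available
  chosen <- integer(0)     # indices drawn so far
  for (k in seq_len(r)) {
    w <- p[pool] / sum(p[pool])          # rescale remaining weights to sum to 1
    u <- runif(1)                        # one uniform draw on (0, 1)
    # first position whose cumulative weight reaches u
    # (falls back to the last position if rounding leaves cumsum(w) just below u)
    j <- min(which(u <= cumsum(w)), length(pool))
    chosen <- c(chosen, pool[j])
    pool   <- pool[-j]                   # the drawn index leaves the pool
  }
  x[chosen]
}

x <- c(10, 20, 20, 30)
sample_effect(x)                          # case 1: equal weights, all n items
sample_effect(x, 2, p = c(1, 1, 1, 5))    # case 4: weighted, no replacement

# Cases 2 and 3 differ only in that nothing leaves the pool:
sample(x, 3, replace = TRUE)
sample(x, 3, replace = TRUE, prob = c(1, 1, 1, 5))
```

Because indices rather than values are drawn, the two 20s in x can both appear in a sample taken without replacement.
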
>
> runif() can be used to sample i1 from (1:n) with equal
> probabilities by selecting i if n*runif(1) is <= i and > (i-1)
> for i = 1:n.
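
As a small illustration of this equal-probability rule (a sketch only; the variable names and the choice n = 6 are arbitrary), the uniform draw is scaled by n and rounded up:

```r
n <- 6
u <- runif(10000)       # uniform draws on (0, 1)
i <- ceiling(n * u)     # i is the integer with i - 1 < n*u <= i, so i is in 1:n
table(i) / length(i)    # each proportion should be close to 1/n
```
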
>
> Similarly runif() can be used to sample i1 from (1:n)
> with probabilities p1,...,pn by selecting i if
>
>   p[1] + ... + p[i-1] < runif(1) <= p[1] + ... + p[i]
>
> [LHS=0 if i=1], since the probability of this happening is p[i].
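
The cumulative-sum rule can be sketched like this. The function name draw_index and the example weights are invented for illustration; this is one way to apply the rule, not a description of what sample() does internally.

```r
# Hypothetical helper: draw one index from 1:n with probabilities p,
# using a single uniform draw and the cumulative sums of p.
draw_index <- function(p) {
  p  <- p / sum(p)              # scale the weights to sum to 1
  cp <- cumsum(p)               # cp[i] = p[1] + ... + p[i]
  u  <- runif(1)
  # count how many of cp[1], ..., cp[n-1] lie strictly below u;
  # adding 1 gives the i with cp[i-1] < u <= cp[i]
  sum(u > cp[-length(cp)]) + 1
}

p <- c(0.2, 0.5, 0.3)
draws <- replicate(10000, draw_index(p))
table(draws) / length(draws)    # proportions should be close to p
```

Leaving the last cumulative sum out of the comparison keeps the result inside 1:n even if rounding leaves cp[n] slightly below 1.
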
>
> > And, when the help file
> > says:"sample(x) generates a random permutation of the elements
> > of x (or 1:x)",
>
> Since the default value of r (size of sample) is the length
> of x, say n, sample(x) (see [1] above) will sample n elements
> without replacement from the n elements of x with uniform
> probabilities at each stage. In effect, n elements i1,i2,...,in
> will be sampled without replacement from (1:n), giving a
> random permutation of (1:n), so the result x[i1],...,x[in]
> will be a random permutation of x[1],...,x[n] (though
> different random permutations may look identical if there
> are equal values in x[1],...,x[n]).
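
A short check of the permutation point, as a sketch with an arbitrary example vector: permuting the indices and then indexing x has the same effect as sample(x) when x has more than one element.

```r
x   <- c(3, 1, 4, 1, 5)
idx <- sample(length(x))      # a random permutation of 1:n
y   <- x[idx]                 # x in that shuffled order
identical(sort(y), sort(x))   # TRUE: same values, only the order changes
```

Since x contains two 1s, swapping their indices gives a different permutation of 1:n but an identical-looking result. Note also that when x is a single number >= 1, sample(x) permutes 1:x instead, which is the "(or 1:x)" in the help text.
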
>
> > would I be correct if I translate the statement
> > as follows: it means that the order of sequence, which was
> > generated from a uniform distribution, would look like a
> > random normal distribution.
>
> No. A normal distribution has nothing to do with it!
>
> *Unless* the values x[1],...,x[n] already looked like values
> which had already been sampled from a normal distribution (but
> were, say, in increasing order of size). Then sample(x) would
> shuffle them into random order so the result could then look
> like a real sample according to the order in which the data
> came in.
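
To make this concrete, here is a sketch with simulated data (the sample size of 100 is arbitrary). Shuffling changes only the order of the values, so whatever distribution they follow is left untouched.

```r
x <- sort(rnorm(100))     # normal-looking values, in increasing order
y <- sample(x)            # the same values in random order
identical(sort(y), x)     # TRUE: the set of values is unchanged

z <- sample(sort(runif(100)))   # uniform values stay uniform after shuffling
range(z)                        # still inside (0, 1)
```
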
>
> Hoping this helps!
> Ted.
>
> --------------------------------------------------------------------
> E-Mail: (Ted Harding) <Ted.Harding@nessie.mcc.ac.uk>
> Fax-to-email: +44 (0)870 094 0861
> Date: 19-Oct-06 Time: 19:34:13
> ------------------------------ XFMail ------------------------------
>



