From: Henrik Bengtsson <hb_at_biostat.ucsf.edu>

Date: Wed, 03 Nov 2010 10:54:18 -0700

R-devel_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Wed 03 Nov 2010 - 18:00:17 GMT

Hi, consider this one as an FYI, or a seed for further discussion.

sample(units, size=length(units));

where 'units' is an index (positive integer) vector. It works as expected (= as I expect) in all cases except for length(units) == 1. I know, it is well known. However, it got me wondering whether it is possible to use sample() to draw a single value from a set containing only one value. I don't think so, unless you draw from a value that is <= 1.

For instance, you can sample from c(10,10) by doing:

> sample(rep(10, times=2), size=2);

[1] 10 10

but you cannot sample from c(10) by doing:

> sample(rep(10, times=1), size=1);

[1] 9

unless you sample from a value <= 1, e.g.

> sample(rep(0.31, times=1), size=1);

[1] 0.31

> sample(rep(-10, times=1), size=1);

[1] -10

Note also the related issue of sampling from a double vector of length 1, e.g.

> sample(rep(1.2, times=2), size=2);

[1] 1.2 1.2

> sample(rep(1.2, times=1), size=1);

[1] 1

In the latter case, 1.2 is coerced to an integer.
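The cases above follow the rule documented in ?sample: a numeric 'x' of length one with x >= 1 is treated as the upper bound of the range 1:x rather than as the set to draw from. A minimal sketch of that test follows; the helper name takesIntegerShortcut() is hypothetical and not part of R.

```r
# Hypothetical helper (not part of R): TRUE when sample(x) would treat 'x'
# as the upper bound of 1:x, following the rule documented in ?sample.
takesIntegerShortcut <- function(x) {
  length(x) == 1L && is.numeric(x) && is.finite(x) && x >= 1
}

takesIntegerShortcut(10)        # TRUE:  sample(10, size=1) draws from 1:10
takesIntegerShortcut(1.2)       # TRUE:  sample(1.2, size=1) draws from 1:1
takesIntegerShortcut(0.31)      # FALSE: the value itself is returned
takesIntegerShortcut(-10)       # FALSE: the value itself is returned
takesIntegerShortcut(c(10, 10)) # FALSE: length two, so elements are sampled
```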

All of the above makes sense when one studies the code of sample(), but sample() is indeed dangerous, e.g. imagine how many bootstrap estimates out there are silently incorrect.
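To make the bootstrap concern concrete, here is a small sketch of resampling unit indices within strata. The data and the names 'strata', 'naive' and 'safe' are made up for illustration. The stratum holding a single unit is the one that goes wrong with a plain sample() call.

```r
set.seed(1)  # arbitrary seed, only to make the sketch repeatable

# Hypothetical unit indices grouped by stratum; stratum 'b' has one unit.
strata <- list(a = c(3L, 7L, 12L), b = 25L)

# Naive resampling: for stratum 'b' this draws from 1:25, not from {25}.
naive <- lapply(strata, function(units) {
  sample(units, size = length(units), replace = TRUE)
})

# Resampling positions instead of values keeps every draw inside 'units'.
safe <- lapply(strata, function(units) {
  units[sample.int(length(units), replace = TRUE)]
})
```

With the naive version, the value drawn for stratum 'b' can be any integer in 1:25, and nothing signals the problem.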

In order to cover all cases of length(units), including one, a solution is:

sampleFrom <- function(x, size=length(x), ...) {
  n <- length(x);
  if (n == 1L) {
    # A single element: return it as is, bypassing sample()
    res <- x;
  } else {
    res <- sample(x, size=size, ...);
  }
  res;
} # sampleFrom()

> sampleFrom(rep(10, times=2), size=2);

[1] 10 10

> sampleFrom(rep(10, times=1), size=1);

[1] 10

> sampleFrom(rep(0.31, times=1), size=1);

[1] 0.31

> sampleFrom(rep(-10, times=1), size=1);

[1] -10

> sampleFrom(rep(1.2, times=2), size=2);

[1] 1.2 1.2

> sampleFrom(rep(1.2, times=1), size=1);

[1] 1.2
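A variant of the same idea, given here only as a sketch and not as part of the proposal above, is to sample positions with sample.int() and index into 'x'. The name sampleFromIdx() is hypothetical. Under this approach a length-one 'x' needs no special case, and requests such as size=3 with replace=TRUE still return three values.

```r
# Hypothetical variant of sampleFrom(): sample positions, never values.
sampleFromIdx <- function(x, size = length(x), ...) {
  # sample.int(n, ...) draws from 1:n, so a length-one 'x' is never
  # reinterpreted as the upper bound of a range.
  x[sample.int(length(x), size = size, ...)]
}

sampleFromIdx(10, size = 1)                   # the only element, 10
sampleFromIdx(1.2, size = 3, replace = TRUE)  # three copies of 1.2
sampleFromIdx(c(3, 5, 7))                     # a permutation of 3, 5, 7
```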

I want to add sampleFrom() to the wishlist of functions to be available in default R. Alternatively, one can add an argument 'sampleFrom=FALSE' to the existing sample() function. Eventually such an argument can be made TRUE by default.

/Henrik

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.2.0, at Wed 03 Nov 2010 - 18:20:16 GMT.
