Re: [Rd] Using sample() to sample one value from a single value?

From: Henrique Dallazuanna <wwwhsd_at_gmail.com>
Date: Wed, 03 Nov 2010 16:02:57 -0200

Doesn't the resample() function in the Examples section of the sample() help page already do this?
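
For reference, here is a minimal sketch of that helper applied to the cases from the message quoted below. The one-line definition of resample() is the one given in the Examples of help("sample"); the calls underneath are only illustrations, not part of the help page.

```r
# resample() as defined in the Examples section of help("sample").
# It samples positions 1..length(x) and then indexes into x, so a
# length-one x is never reinterpreted as the range 1:x.
resample <- function(x, ...) x[sample.int(length(x), ...)]

resample(10, size = 1)     # always 10
resample(1.2, size = 1)    # always 1.2
resample(-10, size = 1)    # always -10
resample(c(10, 10))        # 10 10
resample(integer(0))       # integer(0)
```

Because the size and replace arguments are passed through to sample.int(), they are still honored when x has length one; for example, resample(10, size = 3, replace = TRUE) returns 10 10 10.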

On Wed, Nov 3, 2010 at 3:54 PM, Henrik Bengtsson <hb_at_biostat.ucsf.edu> wrote:

> Hi, consider this one as an FYI, or a seed for further discussion.
>
> I am aware that many traps with sample() have been reported over the
> years. I know that these are also documented in help("sample"). Still
> I got bitten by this while writing
>
> sample(units, size=length(units));
>
> where 'units' is an index (positive integer) vector. It works as I
> expect in all cases except for length(units) == 1. I know, it is
> well known. However, it made me wonder whether it is possible to use
> sample() to draw a single value from a set containing only one
> value. I don't think so, unless you draw from a value that is <= 1.
>
> For instance, you can sample from c(10,10) by doing:
>
> > sample(rep(10, times=2), size=2);
> [1] 10 10
>
> but you cannot sample from c(10) by doing:
>
> > sample(rep(10, times=1), size=1);
> [1] 9
>
> unless you sample from a value <= 1, e.g.
>
> > sample(rep(0.31, times=1), size=1);
> [1] 0.31
>
> > sample(rep(-10, times=1), size=1);
> [1] -10
>
> Note also the related issue of sampling from a double vector of length 1,
> e.g.
>
> > sample(rep(1.2, times=2), size=2);
> [1] 1.2 1.2
> > sample(rep(1.2, times=1), size=1);
> [1] 1
>
> In the latter case 1.2 is coerced to an integer.
>
> All of the above makes sense when one studies the code of sample(), but
> sample() is indeed dangerous, e.g. imagine how many bootstrap
> estimates out there quietly come out incorrect.
>
>
> In order to cover all cases of length(units), including one, a solution is:
>
> sampleFrom <- function(x, size=length(x), ...) {
>   n <- length(x);
>   if (n == 1L) {
>     res <- x;
>   } else {
>     res <- sample(x, size=size, ...);
>   }
>   res;
> } # sampleFrom()
>
> > sampleFrom(rep(10, times=2), size=2);
> [1] 10 10
>
> > sampleFrom(rep(10, times=1), size=1);
> [1] 10
>
> > sampleFrom(rep(0.31, times=1), size=1);
> [1] 0.31
>
> > sampleFrom(rep(-10, times=1), size=1);
> [1] -10
>
> > sampleFrom(rep(1.2, times=2), size=2);
> [1] 1.2 1.2
>
> > sampleFrom(rep(1.2, times=1), size=1);
> [1] 1.2
>
>
> I want to add sampleFrom() to the wishlist of functions to be
> available in default R. Alternatively, one can add an argument
> 'sampleFrom=FALSE' to the existing sample() function. Eventually such
> an argument can be made TRUE by default.
>
> /Henrik
>
> ______________________________________________
> R-devel_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


Received on Wed 03 Nov 2010 - 18:05:23 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 04 Nov 2010 - 16:20:17 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel. Please read the posting guide before posting to the list.
