Re: [R] Sampling

From: Peter Dalgaard <p.dalgaard_at_biostat.ku.dk>
Date: Wed, 06 Feb 2008 20:43:41 +0100

Tim Hesterberg wrote:
>> values <- sapply(1:1000, function(i) sample(1:3, size=2, prob = c(.5, .25, .25)))
>> table(values)
>>
> values
> 1 2 3
> 834 574 592
>
> The selection probabilities are not proportional to the specified
> probabilities.
>
> In contrast, in S-PLUS:
>
>> values <- sapply(1:1000, function(i) sample(1:3, size=2, prob = c(.5, .25, .25)))
>> table(values)
>>
> 1 2 3
> 1000 501 499
>
>
But is that the right thing? If you can use prob=c(.6, .2, .2) and get 1200 - 400 - 400, then I'm not going to play poker with you....
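
To make the objection concrete, here is a minimal sketch. It assumes (my reading of the S-PLUS counts above, not something confirmed in this thread) that each card's inclusion probability is taken to be size * prob:

```r
# Assumption: inclusion probability of each card = size * prob.
size <- 2
p <- c(0.6, 0.2, 0.2)

incl <- size * p                 # implied inclusion probabilities
expected_counts <- 1000 * incl   # implied counts over 1000 samples of size 2

print(incl)
print(expected_counts)

# A card cannot appear more than once per sample without replacement,
# so an inclusion probability above 1 is impossible.
print(any(incl > 1))
```

Under that assumption, card 1 would have to turn up 1200 times in 1000 hands, which cannot happen when sampling without replacement.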

The interpretation in R is that you are dealing with "fat cards", i.e. card 1 is twice as thick as the other two, so there is a 50% chance of getting the _first_ card as a 1 and, additionally, a (.25+.25)*2/3 = .3333 chance of getting the 2nd card as a 1, for a total of .8333. And since the two cards are different, the expected number of occurrences of card 1 in 1000 samples is 833. What is the interpretation in S-PLUS?
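
The arithmetic can be checked with a short sketch, assuming R's sequential draw-and-renormalise behaviour for sample() without replacement. The variable names are only illustrative:

```r
# Weights from the thread: card 1 is twice as "thick" as cards 2 and 3.
p <- c(0.5, 0.25, 0.25)

# P(card 1 is drawn first)
p_first <- p[1]

# P(card 1 is drawn second): some other card j goes first, then card 1 is
# drawn from the remaining cards with weights renormalised by 1 - p[j].
p_second <- sum(p[-1] * p[1] / (1 - p[-1]))

# Card 1 can appear at most once per sample, so the two events are disjoint.
p_either <- p_first + p_second
print(p_either)

# Simulation check: fraction of size-2 samples that contain card 1.
set.seed(1)
draws <- replicate(10000, sample(1:3, size = 2, prob = p))
sim_freq <- mean(colSums(draws == 1) > 0)
print(sim_freq)
```

The exact value is 0.5 + 1/3 = 5/6, and the simulated fraction should land close to it.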

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard_at_biostat.ku.dk)                  FAX: (+45) 35327907

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Wed 06 Feb 2008 - 19:46:54 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 06 Feb 2008 - 23:30:13 GMT.
