From: Charles Annis, P.E. <Charles.Annis_at_statisticalengineering.com>

Date: Wed, 28 May 2008 10:48:42 -0400


R-help_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.


I think I see the rub: You would like to see the distribution of a sample be identical to the distribution from which it was sampled. But if the sample is random, that can happen only in the long run, not on every sample. That is why statistics computed from a normal sample, such as the mean standardized by the sample standard deviation, are *not* themselves normal - they're "t." When the sample size is large enough the differences between a random sample's density and its parent density become vanishingly small. That accounts for the differences you observe in repeated random samples from the binomial: repeated sampling produces slightly different numbers of successes. How could it be otherwise?
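One way to see this sampling variation directly is to repeat the draw many times and look at how the counts spread out. The following is only a sketch: the seed, the number of replicates (n_rep), and the variable names are illustrative assumptions and not part of the original exchange.

```r
set.seed(1)          # arbitrary seed, only so the run is repeatable
N     <- 500
p     <- 0.15
n_rep <- 10000       # assumed number of repeated samples

# Each replicate is the number of 1s among N independent Bernoulli(p) draws
counts <- replicate(n_rep, sum(rbinom(N, 1, prob = p)))

mean(counts)         # should be close to N * p = 75
sd(counts)           # should be close to sqrt(N * p * (1 - p))
mean(counts == 75)   # fraction of samples that hit exactly 75
```

The average count settles near 75, but individual samples scatter around it, which is the behaviour described above.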

Charles Annis, P.E.

Charles.Annis_at_StatisticalEngineering.com
phone: 561-352-9699

eFax: 614-455-3265

http://www.StatisticalEngineering.com

From: Philip Twumasi-Ankrah [mailto:nana_kwadwo_derkyi_at_yahoo.com] Sent: Wednesday, May 28, 2008 10:36 AM

To: Charles.Annis_at_StatisticalEngineering.com Subject: RE: [R] "rbinom" : Does randomness preclude precision?

Charles,

When you simulate data from a distribution, what you are in effect doing is
generating a sequence of values that would correspond to that distribution.
So you can generate 1000 values from a normal distribution and expect that
when you check on the distribution of your sample (what you do with your
qqnorm or Q-Q plot), it should be a close fit with the theoretical
distribution with the assigned parameter values. It will be difficult to
explain why simulated data may be different from the distribution it
was generated from. I think you cannot blame it on randomness.

I hope you understand what I am trying to determine.

"Charles Annis, P.E." <Charles.Annis_at_StatisticalEngineering.com> wrote: What do you mean by "... *eventual* nature of the distribution?" If you simulated 100 samples, would you expect to see 1.5 successes? Or 1? Or 2? How many, in your thinking, is "eventual?"

Charles Annis, P.E.

Charles.Annis_at_StatisticalEngineering.com
phone: 561-352-9699

eFax: 614-455-3265

http://www.StatisticalEngineering.com

-----Original Message-----

From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org] On Behalf Of Philip Twumasi-Ankrah

Sent: Wednesday, May 28, 2008 9:52 AM

To: ted.harding_at_manchester.ac.uk

Cc: r-help_at_r-project.org

Subject: Re: [R] "rbinom" : Does randomness preclude precision?

Ted's reply is a bit comforting and, as indicated in my post, I am resorting to using "sample". But as an academic issue, does randomness preclude precision?

Randomness should be in the sequence of zeros and ones and how they are simulated at each iteration of the process but not in the eventual nature of the distribution.

Ted.Harding_at_manchester.ac.uk wrote:

On 28-May-08 12:53:26, Philip Twumasi-Ankrah wrote:

> I am trying to simulate a series of ones and zeros (1 or 0) and I am
> using "rbinom" but realizing that the number of successes expected is
> not accurate. Any advice out there.
>
> This is the example:
>
> N<-500
> status<-rbinom(N, 1, prob = 0.15)
> count<-sum(status)
>
> 15 percent of 500 should be 75 but what I obtain from the "count"
> variable is 77 that gives the probability of success to be 0.154. Not
> very good.

The difference (77 - 75 =2) is well within the likely sampling variation when 500 values are sampled independently with P(1)=0.15:

The standard deviation of the resulting number of 1s is sqrt(500*0.15*0.85) = 7.98, so the difference of 2 is only 1/4 of a standard deviation, hence very likely to be equalled or exceeded.

Your chance of getting exactly 75 by this method is quite small:

dbinom(75,500,0.15)

[1] 0.04990852

and your chance of being 2 or more off your target is

1 - sum(dbinom((74:76),500,0.15))

[1] 0.8510483
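As a cross-check on the figures above, the same quantities can be computed from the cumulative distribution function. This is a sketch only; the variable names (sd_count, p_exact, p_off2) are made up for illustration.

```r
N <- 500
p <- 0.15

# Standard deviation of the number of 1s in N independent draws
sd_count <- sqrt(N * p * (1 - p))
sd_count

# P(exactly 75 ones)
p_exact <- dbinom(75, N, p)
p_exact

# P(count <= 73 or count >= 77), i.e. 2 or more away from 75
p_off2 <- pbinom(73, N, p) + pbinom(76, N, p, lower.tail = FALSE)
p_off2

# Should agree with the dbinom-based sum used above
all.equal(p_off2, 1 - sum(dbinom(74:76, N, p)))
```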

> Is there another way beyond using "sample" and "rep" together?

It looks as though you are seeking to obtain exactly 75 1s, randomly situated, the rest being 0s, so in effect you do need to do something on the lines of "sample" and "rep". Hence, something like

status <- rep(0,500)

status[sample((1:500),75,replace=FALSE)] <- 1
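The two lines above can be put together and checked as follows. This is a minimal sketch: the seed, the name k, and the second variant (status2, which shuffles a ready-made vector of 75 ones and 425 zeros) are illustrative additions, not something from the original message.

```r
set.seed(2)    # arbitrary seed, only so the run is repeatable
N <- 500
k <- 75        # exact number of 1s wanted

# Approach from the message: start with all 0s, then set k randomly
# chosen positions to 1
status <- rep(0, N)
status[sample(1:N, k, replace = FALSE)] <- 1

# Assumed equivalent variant: build k ones and N - k zeros, then shuffle
status2 <- sample(rep(c(1, 0), times = c(k, N - k)))

sum(status)    # 75
sum(status2)   # 75
```

Note that fixing the count this way means the 500 values are no longer independent draws with P(1)=0.15; only the positions of the 1s are random.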

Hoping this helps,

Ted.

E-Mail: (Ted Harding)

Fax-to-email: +44 (0)870 094 0861

Date: 28-May-08 Time: 14:19:24

------------------------------ XFMail ------------------------------

A Smile costs Nothing

But Rewards Everything

Happiness is not perfected until it is shared -Jane Porter


Received on Wed 28 May 2008 - 18:26:24 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.2.0, at Wed 28 May 2008 - 18:30:41 GMT.
