Re: [R] Doubt about Student t distribution simulation

From: John Fox <jfox_at_mcmaster.ca>
Date: Sat 05 Aug 2006 - 06:18:14 EST


Dear Jose,

The problem is that you're using the population standard deviation (sigma) rather than the sample SD of each sample [i.e., t[i] = (mean(amo.i) - mu) / (sd(amo.i) / sqrt(n)) ], so your values should be normally distributed, as they appear to be.

A couple of smaller points: (1) Even after this correction, you're sampling from a discrete population (albeit with replacement) and so the values won't be exactly t-distributed. You could draw the samples directly from N(mu, sigma) instead. (2) It would be preferable to make a quantile-comparison plot against the t-distribution, since you'd get a better picture of what's going on in the tails.

I hope this helps,
 John



John Fox
Department of Sociology
McMaster University
Hamilton, Ontario

Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

> -----Original Message-----
> From: r-help-bounces@stat.math.ethz.ch
> [mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of Jose
> Claudio Faria
> Sent: Friday, August 04, 2006 3:09 PM
> To: R-help@stat.math.ethz.ch
> Subject: [R] Doubt about Student t distribution simulation
>
> Dear R list,
>
> I would like to illustrate the origin of the Student t
> distribution using R.
>
> So, if (sample.mean - pop.mean) / standard.error(sample.mean)
> has t distribution with (sample.size - 1) degree free, what
> is wrong with the simulation below? I think that the
> theoretical curve should agree with the relative frequencies
> of the t values calculated:
>
> #== begin options=====
> # parameters
> mu = 10
> sigma = 5
>
> # size of sample
> n = 3
>
> # repetitions
> nsim = 10000
>
> # histogram parameter
> nchist = 150
> #== end options=======
>
> t = numeric()
> pop = rnorm(10000, mean = mu, sd = sigma)
>
> for (i in 1:nsim) {
> amo.i = sample(pop, n, replace = TRUE)
> t[i] = (mean(amo.i) - mu) / (sigma / sqrt(n)) }
>
> win.graph(w = 5, h = 7)
> split.screen(c(2,1))
> screen(1)
> hist(t,
> main = "histogram",
> breaks = nchist,
> col = "lightgray",
> xlab = '', ylab = "Fi",
> font.lab = 2, font = 2)
>
> screen(2)
> hist(t,
> probability = T,
> main = 'f.d.p and histogram',
> breaks = nchist,
> col = 'lightgray',
> xlab = 't', ylab = 'f(t)',
> font.lab = 2, font = 2)
>
> x = t
> curve(dt(x, df = n-1), add = T, col = "red", lwd = 2)
>
> Many thanks for any help,
> ___
> Jose Claudio Faria
> Brasil/Bahia/Ilheus/UESC/DCET
> Estatística Experimental/Prof. Adjunto
> mails: joseclaudio.faria@terra.com.br
> jc_faria@uesc.br
> jc_faria@uol.com.br
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Sat Aug 05 06:21:20 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sat 05 Aug 2006 - 20:17:52 EST.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.