# Re: [R] Using R to illustrate the Central Limit Theorem

From: Kevin E. Thorpe <kevin.thorpe_at_utoronto.ca>
Date: Fri 13 May 2005 - 09:03:20 EST

This thread was very timely for me since I will be teaching an introductory biostats course in the fall, including a session on CLT. I have shamelessly taken Dr. Venables' slick piece of code and wrapped it in a function so that I can vary the maximum sample size at will. I then created functions based on the first to sample from the exponential and the chi-squaredistributions. Lastly, I created a function to sample from a pdf with a parabolic shape (confirming in the process just how rusty my calculus is :-) ). My code is below for all to do with as they please.

Now, at the risk of asking a really obvious question ...

In my final function, I used the inverse probability integral transformation to sample from my parabolic distribution. I create a local function rparab shown here:

rparab <- function(nn) {
u <- 2*runif(nn) - 1
ifelse(u<0,-(abs(u)^(1/3)),u^(1/3))
}

Thanks

clt.unif <- function(n=20) {
N <- 10000
graphics.off()
par(mfrow = c(1,2), pty = "s")
for(k in 1:n) {
m <- (rowMeans(matrix(runif(N*k), N, k)) - 0.5)*sqrt(12*k) hist(m, breaks = "FD", xlim = c(-4,4),
main = paste("Uniform(0,1), n = ",k,sep=""), prob = TRUE, ylim = c(0,0.5), col = "lemonchiffon") pu <- par("usr")[1:2]
x <- seq(pu, pu, len = 500)
lines(x, dnorm(x), col = "red")
qqnorm(m, ylim = c(-4,4), xlim = c(-4,4), pch = ".", col = "blue") abline(0, 1, col = "red")
Sys.sleep(1)
}
}

clt.exp <- function(n=20,theta=1) {
N <- 10000
graphics.off()
par(mfrow = c(1,2), pty = "s")
for(k in 1:n) {
m <- (rowMeans(matrix(rexp(N*k,1/theta), N, k)) - theta)/sqrt(theta^2/k) hist(m, breaks = "FD", xlim = c(-4,4),
main = paste("exp(",theta,"), n = ",k,sep=""), prob = TRUE, ylim = c(0,0.5), col = "lemonchiffon") pu <- par("usr")[1:2]
x <- seq(pu, pu, len = 500)
lines(x, dnorm(x), col = "red")
qqnorm(m, ylim = c(-4,4), xlim = c(-4,4), pch = ".", col = "blue") abline(0, 1, col = "red")
Sys.sleep(1)
}
}

clt.chisq <- function(n=20,nu=1) {
N <- 10000
graphics.off()
par(mfrow = c(1,2), pty = "s")
for(k in 1:n) {
m <- (rowMeans(matrix(rchisq(N*k,nu), N, k)) - nu)/sqrt(2*nu/k) hist(m, breaks = "FD", xlim = c(-4,4),
main = paste("Chi-Square(",nu,"), n = ",k,sep=""), prob = TRUE, ylim = c(0,0.5), col = "lemonchiffon") pu <- par("usr")[1:2]
x <- seq(pu, pu, len = 500)
lines(x, dnorm(x), col = "red")
qqnorm(m, ylim = c(-4,4), xlim = c(-4,4), pch = ".", col = "blue") abline(0, 1, col = "red")
Sys.sleep(1)
}
}

clt.parab <- function(n=20) {
rparab <- function(nn) {
u <- 2*runif(nn) - 1
ifelse(u<0,-(abs(u)^(1/3)),u^(1/3))
}
N <- 10000
graphics.off()
par(mfrow = c(1,2), pty = "s")
for(k in 1:n) {
m <- (rowMeans(matrix(rparab(N*k), N, k)))/sqrt(3/(5*k)) hist(m, breaks = "FD", xlim = c(-4,4),
main = paste("n = ",k,sep=""),
prob = TRUE, ylim = c(0,0.5), col = "lemonchiffon") pu <- par("usr")[1:2]
x <- seq(pu, pu, len = 500)
lines(x, dnorm(x), col = "red")
qqnorm(m, ylim = c(-4,4), xlim = c(-4,4), pch = ".", col = "blue") abline(0, 1, col = "red")
Sys.sleep(1)
}
}

Bill.Venables@csiro.au wrote:

>Here's a bit of a refinement on Ted's first suggestion.
>
>
> N <- 10000
> graphics.off()
> par(mfrow = c(1,2), pty = "s")
> for(k in 1:20) {
> m <- (rowMeans(matrix(runif(M*k), N, k)) - 0.5)*sqrt(12*k)
> hist(m, breaks = "FD", xlim = c(-4,4), main = k,
> prob = TRUE, ylim = c(0,0.5), col = "lemonchiffon")
> pu <- par("usr")[1:2]
> x <- seq(pu, pu, len = 500)
> lines(x, dnorm(x), col = "red")
> qqnorm(m, ylim = c(-4,4), xlim = c(-4,4), pch = ".", col = "blue")
> abline(0, 1, col = "red")
> Sys.sleep(1)
> }
>
>

```--
Kevin E. Thorpe
Biostatistician/Trialist, Knowledge Translation Program
Assistant Professor, Department of Public Health Sciences
Faculty of Medicine, University of Toronto
email: kevin.thorpe@utoronto.ca  Tel: 416.946.8081  Fax: 416.971.2462

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help