# Re: [R] Bootstrap direction

From: Ben Bolker <bolker_at_zoo.ufl.edu>
Date: Thu 02 Jun 2005 - 06:29:00 EST

Dan Janes <djanes <at> oeb.harvard.edu> writes:

>
> Hi all,
> I am trying to bootstrap a small data set into 1000 "pseudodatasets" and
> then run an ANOVA on each one. Can anyone provide guidance on how I could
> do this?
>
> Thank you.
>
> -Dan Janes
>
> ************************************************
> Dan Janes, Ph.D.
> Harvard University/OEB
> 26 Oxford St.
> Cambridge, MA 02138
> Office: 617-496-2375
> Fax: 617-495-5667
> Email: djanes <at> oeb.harvard.edu
>
> ______________________________________________
> R-help <at> stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
>
>

There are several bootstrap libraries (boot, bootstrap, simpleboot) available on CRAN, but I think the following will also do what you want:

## make up some data
X <- data.frame(z=rnorm(20),f=factor(rep(1:5,each=4)))

## linear fit with true values
lm0 <- lm(z~f,data=X)
Fvals <- numeric(1000)
for (i in 1:1000) {

X.boot <- X[sample(1:nrow(X),replace=TRUE),]    Fvals[i] <- anova(lm(z~f,data=X.boot))\$"F value"[1] }
hist(Fvals)

the only part that's at all tricky is figuring out how to extract the F value (which is what I'm guessing you want) from the anova() of each bootstrapped data set.

One complicating issue is that this doesn't stratify (so it doesn't preserve the experimental design -- in this case that there are 4 samples per factor level).

If you replace the stuff in the for loop with this:

z.boot <- unlist(lapply(split(X\$z,X\$f),sample,replace=TRUE))    Fvals[i] <- anova(lm(z.boot~X\$f))\$"F value"[1]

I think it will stratify, at the cost of a bit more black magic. If you're going to do more of this you should probably look into the packages (and possibly the associated books).   Of course, you should check all of this carefully before you believe it.

cheers, Ben

R-help@stat.math.ethz.ch mailing list