Re: [R] Stratified Bootstrap question

From: Tim Hesterberg <timh_at_insightful.com>
Date: Sat 02 Apr 2005 - 09:17:05 EST

Qian wrote:
>I talked with my advisor yesterday about how to do bootstrapping for my
>scenario: random clinic + random subject within clinic. She suggested that
>only clinic are independent units, so I can only resample clinic. But I
>think that since subjects are also independent within clinic, shall I
>resample subjects within clinic, which means I have two-stage resampling?
>Which one do you think makes sense?

This is a tough issue; I don't have a complete answer. I'd appreciate input from other r-help readers.

If you randomly select clinics, then randomly select patients within the clinics:
  (1) by bootstrapping just clinics, you capture both sources of
  variation -- the between-subject variation is incorporated in the
  results for each clinic.

  (2) by bootstrapping clinics, then subjects within clinics, you
  end up double-counting the between-subject variation.

That argues for resampling just clinics.
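To make the two schemes concrete, here is a minimal sketch in plain Python (standard library only). Everything in it is an assumption for illustration: the simulated data, the function names, the choice of the pooled mean as the statistic, and the number of replicates. It is not taken from S+Resample or any R package.

```python
import random
from statistics import mean, stdev

def resample_clinics(clinics, rng):
    """Scheme (1): draw whole clinics with replacement; subjects stay as observed."""
    k = len(clinics)
    return [clinics[rng.randrange(k)] for _ in range(k)]

def resample_two_stage(clinics, rng):
    """Scheme (2): draw clinics with replacement, then redraw subjects within each."""
    drawn = resample_clinics(clinics, rng)
    return [[c[rng.randrange(len(c))] for _ in range(len(c))] for c in drawn]

def grand_mean(clinics):
    """Illustrative statistic: mean over all subjects, pooled across clinics."""
    return mean([x for c in clinics for x in c])

def bootstrap_se(clinics, scheme, n_boot=4000, seed=1):
    """Standard deviation of the statistic over n_boot resamples from `scheme`."""
    rng = random.Random(seed)
    return stdev([grand_mean(scheme(clinics, rng)) for _ in range(n_boot)])

# Hypothetical data: 6 clinics, 5 subjects each, clinic effect sd 1, subject sd 3.
data_rng = random.Random(0)
clinics = []
for _ in range(6):
    clinic_effect = data_rng.gauss(0.0, 1.0)
    clinics.append([data_rng.gauss(clinic_effect, 3.0) for _ in range(5)])

se_clinics = bootstrap_se(clinics, resample_clinics)
se_two_stage = bootstrap_se(clinics, resample_two_stage)
print("clinics only:", se_clinics)
print("two-stage:   ", se_two_stage)
```

The spread of the observed clinic means already reflects subject-to-subject noise, so redrawing subjects in the second stage adds that noise a second time. In this setup the two-stage standard error should come out larger than the clinics-only one.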

By analogy, if you have multiple subjects, and multiple measurements per subject, you should just resample subjects.

However, I'm not comfortable with this if you have a small number of clinics, relatively large numbers of patients in each clinic, and reason to think that the between-clinic variation is small. In that case it seems better to resample both clinics and patients.

I'm leery about resampling just clinics if there are a small number of clinics. Bootstrapping isn't particularly effective for small samples -- it is subject to skewness in small samples, and it underestimates variances (its advantages over classical methods really show up with medium-size samples). There are remedies for the small variance; see

	Hesterberg, Tim C. (2004), "Unbiasing the Bootstrap-Bootknife Sampling
	vs. Smoothing", Proceedings of the Section on Statistics and the
	Environment, American Statistical Association, 2924-2930
	www.insightful.com/Hesterberg/articles/JSM04-bootknife.pdf
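
As a small worked illustration of the variance point: for a sample mean, the ordinary bootstrap variance uses a divisor of n where the usual unbiased estimate uses n-1, so it is too small by a factor of (n-1)/n. With only a handful of clinics that factor matters. The sketch below checks the factor on made-up numbers. It also draws one sample using my reading of the bootknife idea in the paper above (omit one observation at random, then draw n with replacement from the remaining n-1). Treat that sampler as an assumption; the paper is the authority. All names and data here are hypothetical.

```python
import random
from statistics import pvariance, variance

def plugin_var_of_mean(x):
    """Variance of the mean under ordinary bootstrap resampling (divisor n)."""
    return pvariance(x) / len(x)

def unbiased_var_of_mean(x):
    """Usual estimate s^2 / n, with s^2 computed using divisor n - 1."""
    return variance(x) / len(x)

def bootknife_sample(x, rng):
    """Assumed bootknife draw: drop one observation, then resample n from the rest."""
    n = len(x)
    omit = rng.randrange(n)
    kept = x[:omit] + x[omit + 1:]
    return [kept[rng.randrange(n - 1)] for _ in range(n)]

# Hypothetical clinic-level summaries; n = 6 is deliberately small.
x = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
n = len(x)

ratio = plugin_var_of_mean(x) / unbiased_var_of_mean(x)
print("ratio:", ratio, "(n-1)/n:", (n - 1) / n)

sample = bootknife_sample(x, random.Random(42))
print("one bootknife sample:", sample)
```

With n = 6 the ordinary bootstrap variance of the mean is about 17% too small; with n = 50 the shortfall is only 2%, which is one reason the problem fades for medium-size samples.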

Tim Hesterberg


| Tim Hesterberg       Research Scientist              |
| timh@insightful.com  Insightful Corp.                |
| (206)802-2319        1700 Westlake Ave. N, Suite 500 |
| (206)283-8691 (fax)  Seattle, WA 98109-3044, U.S.A.  |
|                      www.insightful.com/Hesterberg   |
========================================================
Download the S+Resample library from www.insightful.com/downloads/libraries

Received on Sat Apr 02 09:29:41 2005