Re: R-beta: CI for median in funtion boxplot

Thomas Lumley (thomas@biostat.washington.edu)
Sat, 4 Apr 1998 08:42:41 -0800 (PST)


Date: Sat, 4 Apr 1998 08:42:41 -0800 (PST)
From: Thomas Lumley <thomas@biostat.washington.edu>
To: Peter Dalgaard BSA <p.dalgaard@biostat.ku.dk>
Subject: Re: R-beta: CI for median in funtion boxplot
In-Reply-To: <x2lntlsvip.fsf@blueberry.kubism.ku.dk>

On 4 Apr 1998, Peter Dalgaard BSA wrote:

> Rick White <rick@stat.ubc.ca> writes:
> 
> > 
> > I noticed that boxplot computes a 95% CI for the median by using
> > median +/- 1.58*IQR/sqrt(n)
> > 
> > Where does the 1.58 constant come from?
> > 
> 
> Search me... However, wouldn't it be better in any case to do an exact
> 95% CI based on the binomial distribution? Of course, you need at
> least 6 observations to do that.
> 

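I think 1.58 is based on a Normal approximation.  If the data are Normal
then you can compute the asymptotic standard error of the median, which is
sqrt(pi/2)*sigma/sqrt(n) or about 1.25*sigma/sqrt(n), and use +/- 1.96 of
these to get a 95% CI.  For Normal data the IQR is about 1.35*sigma, so
1.58*IQR/sqrt(n) is a robust estimate of 2.13*sigma/sqrt(n).  That is about
1.7 standard errors: close to, though a little short of, the 1.96 needed
for a nominal 95% interval.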
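
The constants in that argument can be checked numerically.  The following
is only a sketch in Python (standard library only), assuming exactly Normal
data and the asymptotic standard error sqrt(pi/2)*sigma/sqrt(n) for the
median; the variable names are mine, not anything taken from boxplot.

```python
import math
from statistics import NormalDist

# Assumption: data are exactly Normal with standard deviation sigma,
# so every quantity below is expressed in units of sigma/sqrt(n).
std_normal = NormalDist()

# Interquartile range of a Normal distribution, in units of sigma.
iqr_over_sigma = std_normal.inv_cdf(0.75) - std_normal.inv_cdf(0.25)

# Asymptotic standard error of the sample median, in units of sigma/sqrt(n).
se_median_factor = math.sqrt(math.pi / 2)

# Half-width of the interval median +/- 1.58*IQR/sqrt(n), same units.
half_width_factor = 1.58 * iqr_over_sigma

# How many asymptotic standard errors that half-width corresponds to.
n_standard_errors = half_width_factor / se_median_factor

print(f"IQR / sigma            = {iqr_over_sigma:.3f}")
print(f"SE(median) factor      = {se_median_factor:.3f}")
print(f"1.58 * IQR / sigma     = {half_width_factor:.3f}")
print(f"half-width in SE units = {n_standard_errors:.2f}")
```

Under those assumptions the half-width works out to roughly 1.7 standard
errors of the median.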

This will work for contaminated Normal distributions, but it won't be very
good for genuinely long-tailed or asymmetric distributions.  In any case,
the distribution of the median converges to Normal rather slowly, so the
CI might not be very good anyway except in large samples.  The exact
binomial CI would be much better.
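
For what it is worth, one common way to build such an exact interval uses
order statistics: the number of observations below the true median is
Binomial(n, 1/2), so [x_(k), x_(n+1-k)] covers the median with probability
1 - 2*P(B <= k-1).  The sketch below, in Python, picks the largest k that
keeps coverage at or above the requested level.  It assumes a continuous
distribution (no ties at the median), and the function name
exact_median_ci is purely illustrative, not an existing R or Python
routine.

```python
from math import comb


def exact_median_ci(data, conf=0.95):
    """Order-statistic CI for the median, assuming a continuous distribution.

    Returns (lower, upper, actual_coverage), or None when the sample is
    too small for any order-statistic interval to reach `conf`.
    """
    x = sorted(data)
    n = len(x)
    alpha = 1.0 - conf

    k = 0        # number of order statistics trimmed from each end, plus one
    tail = 0.0   # P(B <= k - 1) for B ~ Binomial(n, 1/2)
    while True:
        next_tail = tail + comb(n, k) / 2 ** n   # P(B <= k)
        if 2 * next_tail > alpha:
            break
        tail = next_tail
        k += 1

    if k == 0:
        # Even [min, max] has coverage below `conf`.
        return None
    # x[k - 1] is the k-th smallest value, x[n - k] the k-th largest.
    return x[k - 1], x[n - k], 1.0 - 2.0 * tail


print(exact_median_ci([3.1, 4.7, 5.0, 5.2, 6.8]))        # n = 5: too small
print(exact_median_ci([3.1, 4.7, 5.0, 5.2, 6.8, 9.4]))   # n = 6: [min, max]
```

With n = 5 the widest possible interval [min, max] only has coverage
1 - 2/32 = 0.9375, so nothing is returned; with n = 6 it has coverage
1 - 2/64 = 0.96875, which is why at least 6 observations are needed for a
95% interval.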

Thomas Lumley
-----------------------
Biostatistics	
Uni of Washington	
Box 357232		
Seattle WA 98195-7232	
------------------------


