Re: [R] Impaired boxplot functionality - mean instead of median

From: P Ehlers <>
Date: Fri 02 Dec 2005 - 03:57:38 EST

I'd like to add two comments to Martin's sensible response.

  1. I've seen several intro-stats textbooks that define a boxplot to have whiskers to the extreme data values and then define Tukey's boxplot as a "modified" boxplot. I wish authors wouldn't do that.
  2. I've also seen boxplots used for sample sizes as small as -- are you ready for it? -- n = 2!! (Admittedly, only in plots comparing several groups.) The help page for stripchart() points out that stripcharts "are a good alternative to boxplots when sample sizes are small". My own rule-of-thumb: n > 20 for single boxplots, n > 12 for multiple boxplots.
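
To make the rule of thumb in point 2 concrete, here is a rough sketch in R. The helper name choose_display() is made up for illustration (it is not part of base R or any package), and the thresholds are just my personal cutoffs from above, not anything boxplot() or stripchart() enforce.

```r
# Hypothetical helper: pick a display type from the smallest group size.
# Thresholds follow the rule of thumb above: n > 20 for a single boxplot,
# n > 12 per group when several boxplots are compared.
choose_display <- function(x, g = NULL) {
  ok <- !is.na(x)
  if (is.null(g)) {
    n_min <- sum(ok)
    threshold <- 20
  } else {
    # factor() drops unused levels, so empty groups are ignored
    n_min <- min(table(factor(g[ok])))
    threshold <- 12
  }
  if (n_min > threshold) "boxplot" else "stripchart"
}

# Made-up example data: three groups of five observations each
set.seed(1)
small <- data.frame(y = rnorm(15), grp = rep(c("a", "b", "c"), each = 5))

if (choose_display(small$y, small$grp) == "boxplot") {
  boxplot(y ~ grp, data = small)
} else {
  # jittered points show every observation instead of a five-number summary
  stripchart(y ~ grp, data = small, method = "jitter", vertical = TRUE)
}
```

With only five points per group this falls through to stripchart(), which shows the raw data rather than quartiles estimated from almost nothing.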

Peter Ehlers

Martin Maechler wrote:

> Boxplots were invented by John W. Tukey and I think should be
> counted among the top "small but smart" achievements from the
> 20th century. Very wisely he did *not* use mean and standard deviations.
> Even though it's possible to draw boxplots that are not boxplots
> (and people only recently explained how to do this with R on this
> mailing list), I'm arguing very strongly against this.
> If I see a boxplot - I'd want it to be a boxplot and not have
> the silly (please excuse) 10%--------90% whiskers which
> declare 20% of the points as outliers {in the boxplot sense}.
> If you want the mean +/- sd plot, do *not* misuse boxplots
> for them, please!
> Martin Maechler, ETH Zurich
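
For anyone who does want the mean +/- sd display Martin mentions, one possible way to draw it with base graphics, leaving boxplot() alone, is sketched below. The helper group_mean_sd() and the data frame d are invented for illustration; the variable names a and b simply mirror the formula in the original question.

```r
# Hypothetical helper: per-group mean and standard deviation
group_mean_sd <- function(x, g) {
  g <- factor(g)
  data.frame(group = levels(g),
             mean  = as.numeric(tapply(x, g, mean)),
             sd    = as.numeric(tapply(x, g, sd)))
}

# Made-up data: three groups of 20 with different means
set.seed(42)
d <- data.frame(a = rnorm(60, mean = rep(c(0, 1, 2), each = 20)),
                b = rep(c("g1", "g2", "g3"), each = 20))

s <- group_mean_sd(d$a, d$b)
pos <- seq_len(nrow(s))

# Points at the group means, error bars at mean - sd and mean + sd
plot(pos, s$mean, pch = 19, xaxt = "n",
     xlim = c(0.5, nrow(s) + 0.5),
     ylim = range(s$mean - s$sd, s$mean + s$sd),
     xlab = "b", ylab = "mean of a +/- 1 sd")
axis(1, at = pos, labels = s$group)
arrows(pos, s$mean - s$sd, pos, s$mean + s$sd,
       angle = 90, code = 3, length = 0.05)
```

Because this is drawn as points with error bars, nobody will mistake it for a Tukey boxplot.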

>>>>>>"Evgeniy" == Evgeniy Kachalin <>
>>>>>>    on Thu, 01 Dec 2005 19:04:47 +0300 writes:

> Evgeniy> Hello to all users and wizards.
> Evgeniy> I am regularly using the 'boxplot' function or its analogue - 'bwplot' from
> Evgeniy> the 'lattice' library.
> [there's the lattice *package* !]
> Evgeniy> But they are, as far as I understand, totally
> Evgeniy> flawed in functionality: they lack the ability to select what is
> Evgeniy> drawn 'in the middle' - median or mean; what the box means - standard
> Evgeniy> error, 90% or something else; what the whiskers mean - 100%, 99% or
> Evgeniy> something else.
> Evgeniy> Is there any way to achieve this? Or is there any other good data
> Evgeniy> visualization function for comparing the means of various data groups?
> Evgeniy> Ideally I would like to have a somewhat more customisable function for doing
> Evgeniy> that. For example, boxplot(a ~ b, data = d, mid = 'mean').
> Evgeniy> --
> Evgeniy> Evgeniy, ICQ 38317310.
> ______________________________________________
> mailing list
> PLEASE do read the posting guide!

Received on Fri Dec 02 05:05:14 2005

This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:41:25 EST