Re: [R] Impaired boxplot functionality - mean instead of median

From: Martin Maechler <maechler_at_stat.math.ethz.ch>
Date: Fri 02 Dec 2005 - 18:36:02 EST

  {diverted back to R-help}

There are several R packages that provide plots of "mean +/- SD" (or "mean +/- 2*SD", which covers approximately 95% of the observations in the case of normally distributed data), or so-called "error bars".

E.g. function plotCI() in package 'gplots' and errbar() in package 'Hmisc' or 'sfsmisc'.

I'm firmly convinced that boxplots shouldn't be (mis!)used for drawing those (and the above functions do not use them).
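
For readers who prefer to avoid an extra package, a "mean +/- SD" plot can also be sketched in base graphics. This is only a minimal sketch under stated assumptions: the data are simulated, and the names `variant` and `trait` are hypothetical stand-ins for a three-level factor and a quantitative variable.

```r
# Hypothetical example data: a quantitative trait for three gene variants.
set.seed(1)
variant <- factor(rep(1:3, each = 20))
trait   <- rnorm(60, mean = c(10, 12, 15)[as.integer(variant)], sd = 2)

# Per-group mean and standard deviation.
m <- tapply(trait, variant, mean)
s <- tapply(trait, variant, sd)
x <- seq_along(m)

# Plot the group means, then add mean +/- SD bars with arrows().
plot(x, m, xlim = c(0.5, 3.5), ylim = range(m - s, m + s),
     xaxt = "n", pch = 19,
     xlab = "gene variant", ylab = "trait (mean +/- SD)")
axis(1, at = x, labels = names(m))
# angle = 90 with code = 3 draws flat caps at both ends of each bar.
arrows(x, m - s, x, m + s, angle = 90, code = 3, length = 0.05)
```

The package functions named above wrap the same idea with more options; the sketch merely shows that no boxplot machinery is involved.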

Regards,
Martin

>>>>> "Evgeniy" == Evgeniy Kachalin <ka4alin@yandex.ru>
>>>>>     on Thu, 01 Dec 2005 19:39:18 +0300 writes:

    Evgeniy> Martin Maechler writes:
>> Boxplots were invented by John W. Tukey and I think should be
>> counted among the top "small but smart" achievements from the
>> 20th century. Very wisely he did *not* use mean and standard deviations.
>>
>> Even though it's possible to draw boxplots that are not boxplots
>> (and people only recently explained how to do this with R on this
>> mailing list), I'm arguing very strongly against this.
>>
>> If I see a boxplot - I'd want it to be a boxplot and not have
>> the silly (please excuse) 10%--------90% whiskers which
>> declare 20% of the points as outliers {in the boxplot sense}.
>>
>> If you want the mean +/- sd plot, do *not* misuse boxplots
>> for them, please!
>>

    Evgeniy> So I analyze genetics data. I have some factor
    Evgeniy> (gene variant, c(1,2,3)) and the quantitative
    Evgeniy> variable corresponding to that factor. How do I
    Evgeniy> visualize this situation? Compare mean of samples
    Evgeniy> corresponding to factor values?

    Evgeniy> Should boxplot support 'mean-in-the-middle', it
    Evgeniy> would fit my needs ideally. How do I plot a mean +/-
    Evgeniy> SD plot?
    Evgeniy> Also, there is a way to rewrite boxplot.stats and
    Evgeniy> replace "fivenum" there with a self-made
    Evgeniy> function. Then I would need to write a self-made
    Evgeniy> boxplot.formula (or boxplot.default?) function. And
    Evgeniy> all this stuff would not be configurable. I'm still
    Evgeniy> a novice in R, so I need a simple way to pre-visualize
    Evgeniy> my data and estimate an approximate result.

Yes, there are ways, but no: I pretty strongly oppose the idea of misusing the boxplot graphic to depict very different quantities.
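
To see how far apart the two kinds of summaries can be, one can compare what boxplot.stats() returns with the mean and SD of the same sample. The following is a minimal sketch on simulated data; the vector `x` and its single gross outlier are assumptions made purely for illustration.

```r
# Hypothetical sample: 50 standard normal values plus one gross outlier.
set.seed(1)
x <- c(rnorm(50), 8)

# What a boxplot draws (default coef = 1.5):
bs <- boxplot.stats(x)
bs$stats   # lower whisker end, lower hinge, median, upper hinge, upper whisker end
bs$out     # points beyond the whiskers, drawn individually

# The hinges and median come from Tukey's five-number summary.
fivenum(x) # minimum, lower hinge, median, upper hinge, maximum

# Mean and SD are both pulled upward by the single outlier.
c(mean = mean(x), sd = sd(x))
```

The box and whiskers are built from order statistics and barely react to the outlier, whereas the mean and SD are not robust in that sense, which is one reason a "boxplot" drawn from them would mislead a reader who expects the usual definition.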



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Fri Dec 02 18:41:42 2005
