Re: [Rd] too-large notches in boxplot (PR #7690)

From: Martin Maechler <maechler_at_stat.math.ethz.ch>
Date: Tue 07 Nov 2006 - 10:35:54 GMT

>>>>> "Ben" == Ben Bolker <bolker@zoo.ufl.edu>
>>>>> on Mon, 23 Jan 2006 14:37:18 -0500 writes:

    Ben> PR #7690 points out that if the confidence intervals (+/-1.58 
    Ben> IQR/sqrt(n)) in a boxplot with notch=TRUE are larger than the
    Ben> hinges -- which is most likely to happen for small n and asymmetric
    Ben> distributions -- the resulting plot is ugly, e.g.:

     set.seed(1001)
     npts <- 5
     X <- rnorm(2*npts,rep(3:4,each=npts),sd=1)
     f <- factor(rep(1:2,each=npts))
     boxplot(X~f)
     boxplot(X~f,notch=TRUE)
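
For reference, a minimal sketch of where these notch limits come from, assuming the limits are the `conf` component returned by `boxplot.stats()` (median +/- 1.58 * IQR / sqrt(n), with "IQR" taken as the distance between the hinges). The helper name `notch_limits` is made up for this illustration. Since the median lies between the hinges, one of the hinges is at most IQR/2 away from it, so whenever 1.58/sqrt(n) > 0.5 (that is, n <= 9) and the IQR is positive, at least one notch end must fall outside the box.

```r
# Hypothetical helper (not part of R): report hinges, notch limits, and
# whether either notch end falls outside the box for one sample.
notch_limits <- function(x) {
  s <- boxplot.stats(x)            # five-number summary, n, conf, outliers
  hinges <- s$stats[c(2, 4)]       # lower and upper hinge
  list(hinges  = hinges,
       notch   = s$conf,           # median +/- 1.58 * IQR / sqrt(n)
       outside = s$conf[1] < hinges[1] || s$conf[2] > hinges[2])
}

# Ben's example data: two groups of 5 points each
set.seed(1001)
npts <- 5
X <- rnorm(2 * npts, rep(3:4, each = npts), sd = 1)
f <- factor(rep(1:2, each = npts))

res <- lapply(split(X, f), notch_limits)
str(res)
```

With only 5 points per group, both groups are expected to report `outside = TRUE`.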

    Ben> I can imagine debate about what should be done in this case --
    Ben> you could just say "don't do that", since the notches are based
    Ben> on an asymptotic argument ... the diff below just truncates
    Ben> the notches to the hinges, but produces a warning saying that the
    Ben> notches have been truncated.

    Ben> ?? what should the behavior be ??
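
The truncation Ben describes can be sketched on its own, outside of bplt(). This is only an illustration of the clamping step from his diff, not the code that ended up in R; the function name `clamp_notch` is an assumption made for this example.

```r
# Hypothetical helper: clamp one box's notch limits to its hinges.
# stats: five-number summary (as in boxplot.stats()$stats)
# conf:  c(lower, upper) notch limits
clamp_notch <- function(stats, conf) {
  # notches are "ok" when they lie entirely within the hinges
  ok <- conf[1] >= stats[2] && conf[2] <= stats[4]
  if (!ok) {
    conf[1] <- max(conf[1], stats[2])   # pull lower notch up to lower hinge
    conf[2] <- min(conf[2], stats[4])   # pull upper notch down to upper hinge
  }
  list(conf = conf, ok = ok)
}

s <- boxplot.stats(c(1, 2, 3, 4, 5))
clamped <- clamp_notch(s$stats, s$conf)
str(clamped)
```

The `ok` flag plays the role of `conf.ok` in the diff: a caller could collect it per box and warn once if any notch was truncated.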

This has come up again more recently (than January!), and IIRC I argued that the plotting behavior should not be changed, for reasons of backward compatibility, "you get what you deserve", etc.

OTOH, users should at least notice that something "unusual" happens,
and I have used part of Ben's proposed patch to simply issue a warning when the notches go beyond the hinges, i.e. outside the "box" of the boxplot.

 new>> Warning message:
 new>> some notches went outside hinges ('box'): maybe set notch=FALSE

I hope that this helps all those who were puzzled by examples like the one above.
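
For those who would rather avoid the warning than react to it, one possible approach (a sketch only, under the assumption that the `stats` and `conf` components returned by `boxplot(..., plot = FALSE)` are what the notches are drawn from) is to check the notches per group first and fall back to `notch = FALSE` when any of them would leave the box. The variable names `fits` and `use_notch` are made up for this example.

```r
# Ben's example data
set.seed(1001)
npts <- 5
X <- rnorm(2 * npts, rep(3:4, each = npts), sd = 1)
f <- factor(rep(1:2, each = npts))

# Compute the boxplot statistics without drawing anything
z <- boxplot(split(X, f), plot = FALSE)

# One logical per group: does the notch stay between the hinges?
fits <- z$conf[1, ] >= z$stats[2, ] & z$conf[2, ] <= z$stats[4, ]
use_notch <- all(fits)

# Draw with notches only if every notch fits inside its box
if (interactive()) boxplot(X ~ f, notch = use_notch)
```

For this data set `use_notch` comes out FALSE, so the plot is drawn without notches.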

Martin Maechler, ETH Zurich

with thanks to Ben for his perseverance (:-)

    Ben> the diff is against the 11 Jan version of R 2.3.0

    Ben> *** newboxplot.R        2006-01-23 14:32:12.000000000 -0500
    Ben> --- oldboxplot.R        2006-01-23 14:29:29.000000000 -0500
    Ben> ***************
    Ben> *** 84,98 ****
    Ben> bplt <- function(x, wid, stats, out, conf, notch, xlog, i)
    Ben> {
    Ben> ## Draw single box plot
    Ben> -       conf.ok <- TRUE
    Ben> -       if(!any(is.na(stats))) {
    Ben> -           ## check for overlap of notches and hinges
    Ben> -           if (notch && (stats[2]>conf[1] || stats[4]<conf[2])) {
    Ben> -              conf.ok <- FALSE
    Ben> -              conf[1] <- max(conf[1],stats[2])
    Ben> -              conf[2] <- min(conf[2],stats[4])
    Ben> -            }

    Ben> ## stats = +/- Inf: polygon & segments should handle

    Ben> ## Compute 'x + w' -- "correctly" in log-coord. case:
    Ben> --- 84,91 ----
    Ben> bplt <- function(x, wid, stats, out, conf, notch, xlog, i)
    Ben> {
    Ben> ## Draw single box plot

    Ben> +       if(!any(is.na(stats))) {
    Ben> ## stats = +/- Inf: polygon & segments should handle
    Ben> ## Compute 'x + w' -- "correctly" in log-coord. case:
    Ben> ***************
    Ben> *** 148,154 ****
    Ben> domain = NA)
    Ben> }
    Ben> }
    Ben> -       return(conf.ok)
    Ben> } ## bplt

    Ben> if(!is.list(z) || 0 == (n <- length(z$n)))
    Ben> --- 141,146 ----
    Ben> ***************
    Ben> *** 239,252 ****

    Ben> xysegments <- segments
    Ben> }
    Ben> -     conf.ok <- numeric(n)
    Ben> for(i in 1:n)
    Ben> !       conf.ok[i] <- bplt(at[i], wid=width[i],
    Ben> stats= z$stats[,i],
    Ben> out  = z$out[z$group==i],
    Ben> conf = z$conf[,i],
    Ben> notch= notch, xlog = xlog, i = i)
    Ben> !     if (any(!conf.ok)) warning("some confidence limits > hinges: 
    Ben> notches truncated")
    Ben> axes <- is.null(pars$axes)
    Ben> if(!axes) { axes <- pars$axes; pars$axes <- NULL }
    Ben> if(axes) {
    Ben> --- 231,243 ----
    Ben> xysegments <- segments
    Ben> }

    Ben> for(i in 1:n)
    Ben> !       bplt(at[i], wid=width[i],
    Ben> stats= z$stats[,i],
    Ben> out  = z$out[z$group==i],
    Ben> conf = z$conf[,i],
    Ben> notch= notch, xlog = xlog, i = i)
    Ben> !
    Ben> axes <- is.null(pars$axes)

    Ben> if(!axes) { axes <- pars$axes; pars$axes <- NULL }
    Ben> if(axes) {
    Ben> -- 
    Ben> 620B Bartram Hall                            bolker@zoo.ufl.edu
    Ben> Zoology Department, University of Florida    http://www.zoo.ufl.edu/bolker
    Ben> Box 118525                                   (ph)  352-392-5697
    Ben> Gainesville, FL 32611-8525                   (fax) 352-392-3704

    Ben> ______________________________________________
    Ben> R-devel@r-project.org mailing list
    Ben> https://stat.ethz.ch/mailman/listinfo/r-devel

R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Tue Nov 07 21:42:03 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Tue 07 Nov 2006 - 11:30:36 GMT.
