Re: [R] vioplot kernel smooth via density kernel smooth

From: Johannes Graumann <johannes_graumann_at_web.de>
Date: Tue, 29 Apr 2008 15:35:22 +0200

<posted & mailed>

Johannes Graumann wrote:

> Vioplots have great appeal to me, as they manage to squeeze so much
> information into so little space ...
> Now some evaluation has made me suspicious about the implementation in the
> package vioplot and I would like to hear what you say about the appended
> results and the code going with it:
>
> library(vioplot)
> mydata <- read.table("data") # file "data" attached
> plot(density(mydata[[1]])) # attached as densityplot.png
> vioplot(mydata[[1]]) # attached as vioplot.png
>
> Default settings for "density" seem to clearly indicate that the
> distribution is bimodal, while "vioplot" total misses that.
> Can "vioplot" be made to react more "density"-like or is there an
> alternative, more "density"-like implementation?
>
> Thanks for your imput, Joh

As a quick hack I patched the "vioplot" function from the package of the same name as appended to use the default "density" function from base. The result when doing the same as described above ("vioplot(mydata[[1]],sm_density=FALSE)")is attached - much more what I would expect for this data set than what the original provides.

Opinions? Ridicule?

Joh

vioplot <- function (x, ..., range = 1.5, h = NULL, ylim = NULL, names = NULL,
    horizontal = FALSE, col = "magenta", border = "black", lty = 1,     lwd = 1, rectCol = "black", colMed = "white", pchMed = 19,     at, add = FALSE, wex = 1, drawRect = TRUE, sm_density=TRUE) {

    datas <- list(x, ...)
    n <- length(datas)
    if (missing(at))

        at <- 1:n
    upper <- vector(mode = "numeric", length = n)     lower <- vector(mode = "numeric", length = n)     q1 <- vector(mode = "numeric", length = n)     q3 <- vector(mode = "numeric", length = n)     med <- vector(mode = "numeric", length = n)     base <- vector(mode = "list", length = n)     height <- vector(mode = "list", length = n)     baserange <- c(Inf, -Inf)
    args <- list(display = "none")
    if (!(is.null(h)))

        args <- c(args, h = h)
    for (i in 1:n) {

        data <- datas[[i]]
        data.min <- min(data)
        data.max <- max(data)
        q1[i] <- quantile(data, 0.25)
        q3[i] <- quantile(data, 0.75)
        med[i] <- median(data)
        iqd <- q3[i] - q1[i]
        upper[i] <- min(q3[i] + range * iqd, data.max)
        lower[i] <- max(q1[i] - range * iqd, data.min)
        est.xlim <- c(min(lower[i], data.min), max(upper[i], 
            data.max))
        if(sm.density==TRUE){
          smout <- do.call("sm.density", c(list(data, xlim = est.xlim), 
              args))
        } else {
          smout <- do.call("density",list(data))
          smout[["estimate"]] <- smout[["y"]]
          smout[["eval.points"]] <- smout[["x"]]
        }
        hscale <- 0.4/max(smout$estimate) * wex
        base[[i]] <- smout$eval.points
        height[[i]] <- smout$estimate * hscale
        t <- range(base[[i]])
        baserange[1] <- min(baserange[1], t[1])
        baserange[2] <- max(baserange[2], t[2])
    }
    if (!add) {
        xlim <- if (n == 1) 
            at + c(-0.5, 0.5)
        else range(at) + min(diff(at))/2 * c(-1, 1)
        if (is.null(ylim)) {
            ylim <- baserange
        }

    }
    if (is.null(names)) {

        label <- 1:n
    }
    else {

        label <- names
    }
    boxwidth <- 0.05 * wex
    if (!add)

        plot.new()
    if (!horizontal) {

        if (!add) {
            plot.window(xlim = xlim, ylim = ylim)
            axis(2)
            axis(1, at = at, label = label)
        }
        box()
        for (i in 1:n) {
            polygon(c(at[i] - height[[i]], rev(at[i] + height[[i]])), 
                c(base[[i]], rev(base[[i]])), col = col, border = border, 
                lty = lty, lwd = lwd)
            if (drawRect) {
                lines(at[c(i, i)], c(lower[i], upper[i]), lwd = lwd, 
                  lty = lty)
                rect(at[i] - boxwidth/2, q1[i], at[i] + boxwidth/2, 
                  q3[i], col = rectCol)
                points(at[i], med[i], pch = pchMed, col = colMed)
            }
        }

    }
    else {
        if (!add) {
            plot.window(xlim = ylim, ylim = xlim)
            axis(1)
            axis(2, at = at, label = label)
        }
        box()
        for (i in 1:n) {
            polygon(c(base[[i]], rev(base[[i]])), c(at[i] - height[[i]], 
                rev(at[i] + height[[i]])), col = col, border = border, 
                lty = lty, lwd = lwd)
            if (drawRect) {
                lines(c(lower[i], upper[i]), at[c(i, i)], lwd = lwd, 
                  lty = lty)
                rect(q1[i], at[i] - boxwidth/2, q3[i], at[i] + 
                  boxwidth/2, col = rectCol)
                points(med[i], at[i], pch = pchMed, col = colMed)
            }
        }

    }
    invisible(list(upper = upper, lower = lower, median = med,

        q1 = q1, q3 = q3))
}



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Received on Tue 29 Apr 2008 - 14:08:55 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 29 Apr 2008 - 14:30:34 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.

list of date sections of archive