From: ONKELINX, Thierry <Thierry.ONKELINX_at_inbo.be>

Date: Wed, 27 Feb 2008 15:25:55 +0100

ir. Thierry Onkelinx

Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest

Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance

Gaverstraat 4

9500 Geraardsbergen

Belgium

tel. + 32 54/436 185

Thierry.Onkelinx_at_inbo.be

www.inbo.be


Chris,

1.

ggplot(mydata, aes(y = VALUE, x = SERIES)) + geom_boxplot() + facet_grid(. ~ ID)
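A self-contained sketch of the call above, for anyone who wants to run it as is. One assumption is swapped in here: the dummy data are built with expand.grid() rather than with the rep() calls from the earlier message, because with rep(series,30) and rep(ids,60) each SERIES level only ever occurs together with a single ID, so each panel would show just two boxes. The seed and the number of replicates (10) are arbitrary choices for illustration.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed, only for reproducibility
series <- c("C2", "C4", "C8", "C10", "C15", "C20")
ids <- c("ID1", "ID2", "ID3")

# Fully crossed design: 10 replicates of every SERIES x ID combination
# (REP is a hypothetical helper column, not part of the original data)
mydata <- expand.grid(SERIES = series, ID = ids, REP = 1:10)
mydata$VALUE <- rnorm(nrow(mydata))

# One boxplot per SERIES on the x axis, one panel per ID
p <- ggplot(mydata, aes(y = VALUE, x = SERIES)) +
  geom_boxplot() +
  facet_grid(. ~ ID)
print(p)
```

Putting SERIES on the x axis and ID in the facets keeps a categorical variable on x inside every panel, which is what geom_boxplot() expects.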

2.

Now I think I understand what you want. I'm afraid that won't be easy
because you're trying to mix continuous variables with categorical ones
on the same scale. A density plot has two continuous scales: VALUE and
its density. The boxplot has one continuous scale (VALUE) and the other
is categorical. Maybe Hadley knows a solution for your problem.
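One possible workaround, sketched under assumptions: instead of overlaying a boxplot geom, compute the boxplot's summary statistics yourself and mark them on the density plot. This is a substitute technique, not a true box-and-density overlay. The quartiles live on the VALUE axis, so drawing them as vertical lines avoids mixing a categorical scale into the plot. The data, the seed, and the dashed line style are illustrative choices only.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed, only for reproducibility
mydata <- data.frame(VALUE = rnorm(180))

# Quartiles are on the same (continuous) axis as VALUE
q <- unname(quantile(mydata$VALUE, probs = c(0.25, 0.5, 0.75)))

# Density curve with the quartiles drawn as dashed vertical lines
p <- ggplot(mydata, aes(x = VALUE)) +
  geom_density() +
  geom_vline(xintercept = q, linetype = "dashed")
print(p)
```

The middle line marks the median and the outer two bound the interquartile range, so the shape of the density and the main boxplot features can be read from one panel.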

Thierry


Do not put your faith in what statistics say until you have carefully considered what they do not say. ~William W. Watt A statistical analysis, properly conducted, is a delicate dissection of uncertainties, a surgery of suppositions. ~M.J.Moroney

-----Original message-----

From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org]
On behalf of Chris Friedl

Sent: Wednesday 27 February 2008 15:08
To: r-help_at_r-project.org

Subject: Re: [R] ggplot2 boxplot confusion

Thanks Thierry.

But this leads to a couple more questions if you don't mind.

1. I tried to extend your example to a grid with the facet_grid command, with the aim of getting a boxplot of VALUE according to the two factors SERIES and ID. However, whatever syntax I use gives me an error. For example:

ggplot(mydata, aes(y = VALUE, x = factor(1))) + geom_boxplot() + scale_x_discrete("") +facet_grid(SERIES ~ ID)

Error: position_dodge requires the following missing aesthetics: x

I tried x=c(SERIES, ID) etc etc but they failed.

Yet I know I can get a grid of density plots with ggplot as follows:

ggplot(mydata, aes(x = VALUE, y = ..density..)) + geom_density() + facet_grid(ID ~ SERIES)

Yet it doesn't work if I say geom_boxplot.

I hope you can help me understand where I've gone wrong.

2. On your point about overlaying box and density plots, I'm not sure I
understand. I thought a boxplot is just a particular view of a density
function, showing median, interquartile range etc. The "vertical" scale
is the same as the density function's "horizontal" scale, isn't it? For
example, in the dummy dataset above:

summary(mydata$VALUE)

    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-2.54400 -0.64690  0.07417  0.08289  0.77830  2.75900

and

ggplot(mydata, aes(x = VALUE, y = ..density..)) + geom_density()

shows a density plot that shows features on the x-axis that are visually
close to the summary features.

My intent was to plot density because the box plot doesn't reveal shape
details such as multiple modes, and to augment with a narrow boxplot to
show some density features such as the position of the median, IQR etc.

Or perhaps I've completely misunderstood your point (highly likely I think).

Thanks again for your help. Much appreciated.

ONKELINX, Thierry wrote:

> Chris,
>
> 1.
> This code will give you the boxplot that you want.
>
> library(ggplot2)
> series <- c('C2','C4','C8','C10','C15','C20')
> ids <- c('ID1','ID2','ID3')
> mydata <-
> data.frame(SERIES=rep(series,30),ID=rep(ids,60),VALUE=rnorm(180))
> ggplot(mydata, aes(y = VALUE, x = factor(1))) + geom_boxplot() +
> scale_x_discrete("")
>
> But the real power of ggplot2 is when you want a boxplot for each
> category:
>
> ggplot(mydata, aes(y = VALUE, x = SERIES)) + geom_boxplot()
>
>
> 2.
> Overlaying boxplots and density plots seems a bad idea to me as both
> plots are likely to have a different scale.
>
> HTH,
>
> Thierry

--
View this message in context: http://www.nabble.com/ggplot2-boxplot-confusion-tp15706116p15713934.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Wed 27 Feb 2008 - 14:28:24 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.2.0, at Wed 27 Feb 2008 - 23:30:19 GMT.
