Re: [R] problem with hist()

From: hadley wickham <h.wickham_at_gmail.com>
Date: Fri, 15 Jun 2007 09:38:25 +0200

On 6/15/07, Mario Dejung <forum_at_dejung.net> wrote:
> > On 6/14/07, Mario Dejung <forum@dejung.net> wrote:
> >> Hey everybody,
> >> I try to make a graph with two different plots.
> >>
> >>
> >> First I make a boxplot of my data. It is a collection of correlation
> >> values of different pictures. For example:
> >>
> >> 0.23445 pica
> >> 0.34456 pica
> >> 0.45663 pica
> >> 0.98822 picb
> >> 0.12223 picc
> >> 0.34443 picc
> >> etc.
> >>
> >> Ok, I make this boxplot and I get the boxes for every picture. After
> >> this I want to know how many correlations exist per picture.
> >> So I make a new vector y <- as.numeric(data$picture)
> >>
> >> So I get for my example something like this:
> >>
> >> y
> >> [1] 1 1 1 1 1 1 1 1 1 1
> >> [11] 1 1 1 1 1 1 1 1 2 2
> >> ...
> >> [16881] 59 59 59 60 60 60 60 60 60 60
> >>
> >> After this I make something like this
> >>
> >> boxplot(cor ~ pic)
> >> par(new = TRUE)
> >> hist(y, nclass = 60)
> >>
> >> But there is my problem. I have 60 pictures, so I get 60 different
> >> boxplots, and I want the hist behind the boxes. But it makes only 59
> >> histbars.
> >>
> >> What can I do? I tried also
> >> hist(y, 1:60) # same effect
> >> and
> >> hist(y, 1:61)
> >> this gives me 60 places, but only 59 bars. The last bar is 0.
> >>
> >> I hope anyone can help me.
> >
> > What does the y axis represent? It will be counts for the histogram,
> > and correlations for the boxplots. These aren't comparable, so you're
> > probably better off making two separate graphics.
> >
> > Hadley
> >
> The boxplots show only the median, min, max, etc. of the different
> pictures, but I want to know how many entries are in this plot. Now I
> have done this with the hist function, and when I use different colors, you
> can see that for the first picture there are about 130 entries, but for the
> 8th picture there are only 40 entries...
> Doesn't this make sense?

I think your plot would be clearer if you used two graphics - one showing the spread, and one showing the number of points (you might also want to look at notched boxplots). In the graphic you attached, the bars of the barchart (not a histogram! - that's for continuous data) distract the eye from the boxplots. You might also want to try ordering the x axis by mean or number of observations, as this will make it easier to see trends in the data.
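
As a rough sketch of the two-graphic idea, something along these lines might work. The data are simulated (the picture names, group sizes and the variable name cor_val are invented for illustration, not taken from the original data), and the pictures are ordered by number of observations here, though ordering by mean would work the same way:

```r
# Simulated stand-in for the real data: the values and group sizes are
# invented purely for illustration.
set.seed(1)
pic <- factor(sample(c("pica", "picb", "picc", "picd"), 200, replace = TRUE,
                     prob = c(0.4, 0.3, 0.2, 0.1)))
cor_val <- runif(200)

# Re-order the pictures by number of observations (largest first),
# so that trends are easier to see on the x axis.
n <- table(pic)
pic <- factor(pic, levels = names(sort(n, decreasing = TRUE)))
n <- table(pic)

# Two separate panels: spread on top, counts below.
op <- par(mfrow = c(2, 1))
boxplot(cor_val ~ pic, notch = TRUE, ylab = "correlation")
barplot(n, ylab = "number of correlations")
par(op)
```

Each panel then has its own y axis, so counts and correlations are never read off the same scale.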

The confusion with the barchart arises because there are really two quite different types of barchart. One type is basically the same as a dotchart, but you draw bars instead of dots - this is the default in R. The other type is the categorical analog of the histogram, and this is the default in ggplot2 (http://had.co.nz/ggplot2/geom_bar.html), although the next version will automatically work out which version you want.
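
To make the distinction concrete, here is a small base R sketch of both types, using invented data in place of as.numeric(data$picture). It also shows, as far as I can tell, why hist(y, 1:60) draws only 59 bars: 60 break points define 59 intervals, and with the default settings the values 1 and 2 both land in the first interval [1, 2].

```r
# Invented integer codes standing in for as.numeric(data$picture);
# every code from 1 to 60 appears at least once.
set.seed(1)
y <- c(sample(1:60, 1000, replace = TRUE), 1:60)

# Type 1: bar heights are given directly (a dotchart drawn with bars).
heights <- c(a = 3, b = 7, c = 5)
barplot(heights)

# Type 2: the categorical analog of the histogram. Count first, then draw.
counts <- table(factor(y, levels = 1:60))
barplot(counts)

# hist() with breaks 1:60 has only 59 intervals, and the first one
# pools the codes 1 and 2.
h <- hist(y, breaks = 1:60, plot = FALSE)
cnt <- as.vector(counts)
length(h$counts)
h$counts[1] == cnt[1] + cnt[2]

# Breaks centred on the integers give 60 bins that match the table.
h2 <- hist(y, breaks = seq(0.5, 60.5, by = 1), plot = FALSE)
all(h2$counts == cnt)
```

For category codes like these, barplot(table(...)) is probably the more natural choice, since it needs no breaks at all.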

Hadley



R-help_at_stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Fri 15 Jun 2007 - 07:41:32 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 15 Jun 2007 - 08:32:05 GMT.
