Re: [R] Questions about histograms

From: <Bill.Venables_at_csiro.au>
Date: Mon, 11 Feb 2008 11:38:41 +1000

Andre,

Regarding your first question, it is by no means clear there is anything to fix; in fact I'm sure there is nothing to fix. With prob = TRUE the bar heights are densities, not probabilities, so the fact that the height of a bar is greater than one is irrelevant: the width of the bar is much less than one, and so is the product of height by width. Area is height x width, not just height....
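A quick way to convince yourself is to add up the bar areas. The following is only a sketch under assumptions: the data are simulated (your data[,1] is not available to me), and base hist() with breaks = "Scott" stands in for hist.scott() so that the example needs no extra packages.

```r
# Simulated stand-in for data[,1]; the original data are not available.
set.seed(1)
x <- rexp(1000, rate = 5)       # most values well below 1, so bins are narrow

# Compute the histogram without drawing it.
h <- hist(x, breaks = "Scott", plot = FALSE)

widths <- diff(h$breaks)        # bin widths
areas  <- h$density * widths    # area of each bar = height x width

max(h$density)                  # tallest bar; exceeds 1 here because bins are narrow
sum(areas)                      # total area; equals 1 up to rounding
```

The same check should work on the object returned by any call to hist(), since it only uses the 'breaks' and 'density' components.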

Regarding the second question - logarithmic breaks. I'm not aware of anything currently available to do this, but the tools are there for you to do it yourself. The 'breaks' argument to hist allows you to specify your breaks explicitly (among other things), so it's just a matter of setting up the logarithmic (or, more precisely, 'geometric progression') bins yourself and passing them on to hist.
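One possible way to set up such bins is sketched below. It assumes strictly positive data (logs are taken), uses simulated values in place of your own, and the names n_bins and brks are just illustrative choices, not anything built in.

```r
# Simulated, strictly positive, right-skewed stand-in for the real data.
set.seed(1)
x <- rlnorm(1000, meanlog = 0, sdlog = 1.5)

n_bins <- 20                    # arbitrary choice for illustration

# Geometric progression: equally spaced on the log scale, then back-transformed.
brks <- exp(seq(log(min(x)), log(max(x)), length.out = n_bins + 1))

# Pin the end points so rounding in exp(log(.)) cannot leave a value outside.
brks[1] <- min(x)
brks[n_bins + 1] <- max(x)

# Compute the histogram on the explicit breaks without drawing it.
h <- hist(x, breaks = brks, plot = FALSE)

# Each bin is a constant multiple of the width of the one before it.
ratios <- diff(brks)[-1] / diff(brks)[-n_bins]
```

Calling plot(h) then draws it. Because the breaks are not equidistant, hist uses the density scale by default, so the total area is still 1 even though the bars have different widths.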

Bill Venables
CSIRO Laboratories
PO Box 120, Cleveland, 4163
AUSTRALIA

Office Phone (email preferred): +61 7 3826 7251
Fax (if absolutely necessary):  +61 7 3826 7304
Mobile:                         +61 4 8819 4402
Home Phone:                     +61 7 3286 7700
mailto:Bill.Venables_at_csiro.au
http://www.cmis.csiro.au/bill.venables/

-----Original Message-----
From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org] On Behalf Of Andre Nathan
Sent: Monday, 11 February 2008 11:14 AM
To: r-help_at_r-project.org
Subject: [R] Questions about histograms

Hello

I'm doing some experiments with the various histogram functions and I have two questions about the "prob" option and binning.

First, here's a simple plot of my data using the default hist() function:

> hist(data[,1], prob = TRUE, xlim = c(0, 35))

  http://go.sneakymustard.com/tmp/hist.jpg

My first question is regarding the resulting plot from hist.scott() and hist.FD(), from the MASS package. I'm setting prob to TRUE in these functions, but as can be seen in the images below, the value for the first bar of the histogram is well above 1.0. Shouldn't the total area be 1.0 in the case of prob = TRUE?

> hist.scott(data[,1], prob = TRUE, xlim=c(0, 35))

  http://go.sneakymustard.com/tmp/scott.jpg

> hist.FD(data[,1], prob = TRUE, xlim=c(0, 35))

  http://go.sneakymustard.com/tmp/FD.jpg

Is there anything I can do to "fix" these plots?

My second question is related to binning. Is there a function or package that allows one to use logarithmic binning in R, that is, create bins such that the length of a bin is a multiple of the length of the one before it?

Pointers to the appropriate docs are welcome; I've been searching for this and couldn't find any info.

Best regards,
Andre



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

Received on Mon 11 Feb 2008 - 01:52:36 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 11 Feb 2008 - 02:30:14 GMT.
