[Rd] hist.default()$density

From: Martin Becker <martin.becker_at_mx.uni-saarland.de>
Date: Tue, 30 Mar 2010 17:17:36 +0200

Dear developers,

the current implementation of hist.default() calculates 'density' (and 'intensities') as
  dens <- counts/(n*h)
where h has been calculated before as
  h <- diff(fuzzybreaks)
which yields slightly perturbed ('fuzzy') values for the density, e.g.

> tmp <- hist(1:10,breaks=c(-2.5,2.5,7.5,12.5),plot=FALSE)
> print(tmp$density,digits=15)

[1] 0.0399999920000016 0.1000000000000000 0.0600000000000000

Since hist.default()$breaks returns the original breaks rather than the fuzzy breaks used to calculate dens, the total area of the bins often differs from 1 by far more than floating-point rounding error, e.g.

> print(sum(tmp$density*diff(tmp$breaks)),digits=15)
[1] 0.999999960000008
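
For illustration, the numbers above can be reproduced by hand. This is only a sketch: it assumes the fuzz is 1e-7 times the median bin width, subtracted from the lowest break and added to all others (the right = TRUE, include.lowest = TRUE case). That factor and the helper names are my reading of the code, chosen because they match the output shown, and should be checked against the actual source.

```r
x      <- 1:10
breaks <- c(-2.5, 2.5, 7.5, 12.5)
n      <- length(x)

# Assumed fuzz: 1e-7 * median bin width (not taken verbatim from hist.default)
diddle      <- 1e-7 * median(diff(breaks))
fuzz        <- c(-diddle, rep.int(diddle, length(breaks) - 1))
fuzzybreaks <- breaks + fuzz

# Bin counts; for this data the fuzz does not move any point across a break
counts <- as.vector(table(cut(x, breaks, include.lowest = TRUE)))

# Density computed from the fuzzy bin widths, as described above
h_fuzzy    <- diff(fuzzybreaks)
dens_fuzzy <- counts / (n * h_fuzzy)
print(dens_fuzzy, digits = 15)

# Area computed with the *returned* (non-fuzzy) breaks: not exactly 1
area <- sum(dens_fuzzy * diff(breaks))
print(area, digits = 15)
```

Only the first bin is affected here, because the fuzz widens it by 2 * diddle while the other bins are shifted without changing width.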

Is this intended, or should the calculation of dens instead read
  dens <- counts/(n*diff(breaks))
(or should hist.default()$breaks return the fuzzy breaks)?
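
As a rough sketch of the two alternatives: either one makes density and breaks consistent, so the bin areas sum to 1 up to rounding. The counts are those of the hist(1:10, ...) example, and the fuzz of 5e-7 is an assumed value (1e-7 times the bin width of 5), not something read off the returned object.

```r
counts <- c(2, 5, 3)                  # bin counts from the example
n      <- sum(counts)
breaks <- c(-2.5, 2.5, 7.5, 12.5)

# Assumed fuzzy breaks: lowest break moved down, the others moved up
fuzzybreaks <- breaks + 5e-7 * c(-1, 1, 1, 1)

# Alternative 1: compute the density from the non-fuzzy breaks
dens_exact <- counts / (n * diff(breaks))
area_exact <- sum(dens_exact * diff(breaks))

# Alternative 2: keep the fuzzy density, but pair it with the fuzzy breaks
dens_fuzzy <- counts / (n * diff(fuzzybreaks))
area_fuzzy <- sum(dens_fuzzy * diff(fuzzybreaks))

print(dens_exact, digits = 15)
print(c(area_exact, area_fuzzy), digits = 15)
```

The first alternative has the advantage that the returned breaks stay exactly as the user supplied them.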

Best wishes


Dr. Martin Becker
Statistics and Econometrics
Saarland University
Campus C3 1, Room 206
66123 Saarbruecken

R-devel_at_r-project.org mailing list
Received on Wed 31 Mar 2010 - 08:55:23 GMT
