Re: [R] histogram frequency weighing

From: jim holtman <jholtman_at_gmail.com>
Date: Sun 17 Sep 2006 - 22:05:15 GMT

I think this should do it:

> lenh <- hist(iris$Sepal.Length, br=seq(4, 8, 0.05))$counts
> lenh # original data
 [1] 0 0 0 0 0 1 0 3 0 1 0 4 0 2 0 5 0 6 0 10 0 9  0 4 0 1 0 6 0 7 0 6 0
[34] 8 0 7 0 3 0 6 0 6 0 4 0 9 0 7 0 5 0 2 0 8 0  3 0 4 0 1 0 1 0 3 0 1
[67] 0 1 0 0 0 1 0 4 0 0 0 1 0 0
> l.rle <- rle(lenh)
> # determine where '0's are
> Zero <- which(l.rle$values == 0)
> # if the last run in the rle is 0s, drop it: nothing follows it to adjust
> if (tail(l.rle$values,1) == 0) Zero <- Zero[-length(Zero)]
> l.offsets <- cumsum(l.rle$lengths) # offsets (end of each run) into original vector
> # modify original input: the entry right after each run of 0s
> lenh[l.offsets[Zero] + 1] <- lenh[l.offsets[Zero] + 1] / (l.rle$lengths[Zero]+1)
> lenh # modified data

 [1] 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.1666667 0.0000000 1.5000000 0.0000000 0.5000000
[11] 0.0000000 2.0000000 0.0000000 1.0000000 0.0000000 2.5000000 0.0000000 3.0000000 0.0000000 5.0000000
[21] 0.0000000 4.5000000 0.0000000 2.0000000 0.0000000 0.5000000 0.0000000 3.0000000 0.0000000 3.5000000
[31] 0.0000000 3.0000000 0.0000000 4.0000000 0.0000000 3.5000000 0.0000000 1.5000000 0.0000000 3.0000000
[41] 0.0000000 3.0000000 0.0000000 2.0000000 0.0000000 4.5000000 0.0000000 3.5000000 0.0000000 2.5000000
[51] 0.0000000 1.0000000 0.0000000 4.0000000 0.0000000 1.5000000 0.0000000 2.0000000 0.0000000 0.5000000
[61] 0.0000000 0.5000000 0.0000000 1.5000000 0.0000000 0.5000000 0.0000000 0.5000000 0.0000000 0.0000000
[71] 0.0000000 0.2500000 0.0000000 2.0000000 0.0000000 0.0000000 0.0000000 0.2500000 0.0000000 0.0000000
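The same rle() idea can be wrapped in a small function. This is only a sketch: the name adjust_after_zeros is made up for illustration, and it assumes the input is a numeric vector without NAs. It targets the first element after each run of zeros, so a nonzero run longer than one bin is handled too.

```r
# Hypothetical helper (the name is an assumption, not from the thread):
# divide each value that immediately follows a run of zeros by
# (length of that zero run + 1).
adjust_after_zeros <- function(x) {
  r <- rle(x)
  ends <- cumsum(r$lengths)                 # last index of each run
  zero_runs <- which(r$values == 0)         # which runs are runs of zeros
  # a zero run at the very end has nothing after it to adjust
  zero_runs <- zero_runs[ends[zero_runs] < length(x)]
  target <- ends[zero_runs] + 1             # first element after each zero run
  x[target] <- x[target] / (r$lengths[zero_runs] + 1)
  x
}

# small made-up vector: five zeros then 1, one zero then 3, 3
x <- c(0, 0, 0, 0, 0, 1, 0, 3, 3, 0)
adjust_after_zeros(x)
```

Here element 6 becomes 1/6 and element 8 becomes 3/2, while the second 3 (element 9) is left alone because it does not directly follow an empty interval.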

On 9/17/06, Sebastian P. Luque <spluque@gmail.com> wrote:
> Fellow R-helpers,
>
> Suppose we create a histogram as follows (although it could be any vector
> with zeroes in it):
>
>
> R> lenh <- hist(iris$Sepal.Length, br=seq(4, 8, 0.05))
> R> lenh$counts
> [1] 0 0 0 0 0 1 0 3 0 1 0 4 0 2 0 5 0 6 0 10 0 9 0 4 0
> [26] 1 0 6 0 7 0 6 0 8 0 7 0 3 0 6 0 6 0 4 0 9 0 7 0 5
> [51] 0 2 0 8 0 3 0 4 0 1 0 1 0 3 0 1 0 1 0 0 0 1 0 4 0
> [76] 0 0 1 0 0
>
>
> and we wanted to apply a weighting scheme in which the frequencies that
> immediately follow empty class intervals (0), and only those, are
> adjusted by dividing them by the number of preceding empty intervals + 1.
> For example, the first frequency that would need to be adjusted in
> 'lenh$counts' is element 6 (1), which has 5 preceding empty intervals, so
> its adjusted count would be 1/6. Similarly, the second one would be
> element 8 (3), which has 1 preceding empty interval, so its adjusted count
> would be 3/2. Can somebody please provide a hint to implement such a
> weighting scheme?
>
> I thought about some very contrived ways to accomplish this, involving
> 'which' and 'diff', but I sense a function might already be available to
> do this efficiently. I couldn't find relevant info in the usual channels.
> Thanks in advance for any pointers.
>
>
> Cheers,
>
> --
> Seb
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
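The 'which' and 'diff' route mentioned in the question can also be kept short. A minimal sketch, assuming a numeric vector with no NA values; adjust_after_zeros2 is a hypothetical name chosen for illustration:

```r
# Hypothetical helper: same adjustment using which() and diff(), no rle().
adjust_after_zeros2 <- function(x) {
  nz <- which(x != 0)            # positions of the nonzero entries
  # number of zeros sitting directly before each nonzero entry;
  # the leading 0 in c(0, nz) counts zeros at the start of the vector
  gap <- diff(c(0, nz)) - 1
  # entries with no zeros before them are divided by 1, i.e. unchanged
  x[nz] <- x[nz] / (gap + 1)
  x
}

adjust_after_zeros2(c(0, 0, 0, 0, 0, 1, 0, 3, 3, 0))
```

Because the gap between consecutive nonzero positions is exactly the length of the empty run between them, no explicit run-length encoding is needed.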

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

Received on Mon Sep 18 08:09:18 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sun 17 Sep 2006 - 23:30:05 GMT.
