Re: [R] a question about box counting

From: Ray Brownrigg <ray_at_mcs.vuw.ac.nz>
Date: Tue 05 Apr 2005 - 07:27:56 EST


> From: Deepayan Sarkar <deepayan@stat.wisc.edu> Mon, 4 Apr 2005 13:52:48 -0500
>
> On Monday 04 April 2005 13:22, Rajarshi Guha wrote:
> > Hi,
> > I have a set of x,y data points and each data point lies between
> > (0,0) and (1,1). Of this set I have selected all those that lie in
> > the lower triangle (of the plot of these points).
> >
> > What I would like to do is to divide the region (0,0) to (1,1) into
> > cells of say, side = 0.01 and then count the number of cells that
> > contain a point.
> >
> > My first approach is to generate the coordinates of these cells and
> > then loop over the point list to see whether a point lies in a cell
> > or not.
> >
> > However this seems to be very inefficient especially since I will
> > have 1000's of points.
> >
> > Has anybody dealt with this type of problem and are there routines to
> > handle it?
>
> A combination of cut and table/xtabs should do it, e.g.:
>
>
> x <- runif(3000)
> y <- runif(3000)
>
> fx <- cut(x, breaks = seq(0, 1, length = 101))
> fy <- cut(y, breaks = seq(0, 1, length = 101))
>
> txy <- xtabs(~ fx + fy)
> :

Another significantly faster way (but not generating row/column names) is:
x <- runif(3000)
y <- runif(3000)
ints <- 100
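myfun <- function(x, y, ints) {
  # Integer cell index (0 .. ints-1) along each axis
  fx <- x %/% (1/ints)
  fy <- y %/% (1/ints)
  # Map each (fx, fy) pair to one cell number in 1 .. ints^2, then count
  txy <- hist(fx + ints*fy + 1, breaks=0:(ints*ints), plot=FALSE)$counts
  dim(txy) <- c(ints, ints)
  return(txy)
}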
myfun(x, y, ints)
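
Both approaches return a table of per-cell counts, whereas the original question asks for the number of cells that contain at least one point. A minimal sketch of that last step follows. It is self-contained and uses tabulate() in place of hist() purely for brevity (my substitution, not part of either method above); the seed and the names counts, occupied and occupied2 are illustrative assumptions. It also assumes every point lies in [0, 1), as runif() guarantees.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
x <- runif(3000)
y <- runif(3000)
ints <- 100

# Integer cell index (0 .. ints-1) along each axis, as in myfun
fx <- x %/% (1/ints)
fy <- y %/% (1/ints)

# tabulate() counts how often each integer 1 .. nbins occurs
counts <- tabulate(fx + ints*fy + 1, nbins = ints*ints)
dim(counts) <- c(ints, ints)

# Number of cells holding at least one point
occupied <- sum(counts > 0)

# Same number without building the full table
occupied2 <- length(unique(fx + ints*fy))
```

With the xtabs result, the equivalent would be sum(txy > 0).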

Hope this helps,
Ray Brownrigg



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Tue Apr 05 07:33:31 2005
