Re: [R] Estimate region of highest probabilty density

From: Spencer Graves <spencer.graves_at_pdf.com>
Date: Sat 17 Jun 2006 - 13:01:26 EST

          I have more sympathy than ideas for you, but I'll attempt to offer the latter as I haven't seen other replies to your post.

          First, if you've got a grid for "contour", you can do various kinds of numerical integration. If I just wanted a rough answer and I didn't have to sell it to anyone, I might compute as fine a grid as I could reasonably do, then cut the grid in two above and below my desired threshold, and do some crude trapezoidal integration in each, use the sum of the two to normalize and be done with it.

          However, if I needed something that would stand closer scrutiny, I'd read what the MASS book says about this. I'd also try "RSiteSearch", which led me to
"http://finzi.psych.upenn.edu/R/Rhelp02a/archive/26898.html", among other things.

          If you can figure out how to get the coefficients of the spline fit, then you can do the desired integration analytically.

          I know this doesn't solve your problem, but I hope it helps.

          Spencer Graves

Christoph wrote:
> Estimate region of highest probabilty density
>
> Dear R-community
>
> I have data consisting of x and y. To each pair (x,y) a z
value (weight) is assigned. With kde2d I can estimate the densities on a regular grid and based on this make a contour plot (not considering the z-values). According to an earlier post in the list I adjusted the kde2d to kde2d.weighted (see code below) to estimate the densities of x and y considering their weights z. There's also a piece of code with artificial data.
>
> Now my question: Does a function exist which calculates the
value of the level corresponding to a certain percentage of the volume (i.e. above this level e.g. 95% of the total volume lie, the (parameter) region of highest probability density)?
>
> And secondly: How is it in the case of n parameters, when I
don't want to plot it anymore but estimate quantiles for each parameter considering also the weight z (to each set of parameters c(v,w,x,y) a weight z is assigned)?
>
> Many thanks in advance, I am very grateful for any hint
> Chris
>
>
>
>
> # MASS::kde2d copied and modified
> # ===============================
>
> library(MASS)
>
> kde2d.weighted <- function (x, y, w, h, n = n, lims = c(range(x), range(y))) {
> nx <- length(x)
> if (length(y) != nx)
> stop("data vectors must be the same length")
> gx <- seq(lims[1], lims[2], length = n) # gridpoints x
> gy <- seq(lims[3], lims[4], length = n) # gridpoints y
> if (missing(h))
> h <- c(bandwidth.nrd(x), bandwidth.nrd(y));
> if (missing(w))
> w <- numeric(nx)+1;
> h <- h/4
> ax <- outer(gx, x, "-")/h[1] # distance of each point to each grid point in x-direction
> ay <- outer(gy, y, "-")/h[2] # distance of each point to each grid point in y-direction
> z <- (matrix(rep(w,n), nrow=n, ncol=nx, byrow=TRUE)*matrix(dnorm(ax), n, nx)) %*% t(matrix(dnorm(ay), n, nx))/(sum(w) * h[1] * h[2]) # z is the density
> return(list(x = gx, y = gy, z = z))
> }
>
>
> # Generate artificial data
> # ========================
>
> x <- runif(20000,-5,5)
> y <- runif(20000,-5,5)
> z <- dnorm(x, mean=0, sd=1)*dnorm(y, mean=0, sd=1)
>
> # plot data
> # =========
>
> library(Rcmdr)
> scatter3d(x=x,z=y,y=z,surface=FALSE,xlab="x",ylab="z",zlab="y",bg.col="black")
>
> temp0 <- kde2d(x=x, y=y, n = 100, lims=c(range(x),range(y)))
> contour(x=temp0$x, y=temp0$y, z=temp0$z, xlab="x", ylab="y", main="z")
>
> temp <- kde2d.weighted(x=x, y=y, w=z, n = 100, lims=c(range(x),range(y)))
> contour(x=temp$x, y=temp$y, z=temp$z, xlab="x", ylab="y", main="z", col="red", add=TRUE)
>
>
>
>
>
> °°°
> Christoph Ort
> Eawag - aquatic research
> Environmental Engineering
> Überlandstrasse 133
> CH-8600 Dübendorf
>
> phone: +41-44-823-5041
> fax: +41-44-823-5389
> cell: +41-79-218-9242
>
> mailto:christoph.ort@eawag.ch
>
> http://www.eawag.ch/~ortchris/
>
> http://www.eawag.ch
>
>
>
>
> °°°
> Christoph Ort
> Eawag - aquatic research
> Environmental Engineering
> Überlandstrasse 133
> CH-8600 Dübendorf
>
> phone: +41-44-823-5041
> fax: +41-44-823-5389
> cell: +41-79-218-9242
>
> mailto:christoph.ort@eawag.ch
>
> http://www.eawag.ch/~ortchris/
>
> http://www.eawag.ch
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Sat Jun 17 14:16:56 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sat 17 Jun 2006 - 18:11:01 EST.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.