[R] Estimate region of highest probabilty density

From: Ort Christoph <Christoph.Ort_at_eawag.ch>
Date: Wed 14 Jun 2006 - 21:24:11 EST


Estimate region of highest probabilty density

Dear R-community

I have data consisting of x and y. To each pair (x,y) a z value (weight) is assigned. With kde2d I can estimate the densities on a regular grid and based on this make a contour plot (not considering the z-values). According to an earlier post in the list I adjusted the kde2d to kde2d.weighted (see code below) to estimate the densities of x and y considering their weights z. There's also a piece of code with artificial data.

Now my question: Does a function exist which calculates the value of the level corresponding to a certain percentage of the volume (i.e. above this level e.g. 95% of the total volume lie, the (parameter) region of highest probability density)?

And secondly: How is it in the case of n parameters, when I don't want to plot it anymore but estimate quantiles for each parameter considering also the weight z (to each set of parameters c(v,w,x,y) a weight z is assigned)?

Many thanks in advance, I am very grateful for any hint Chris

# MASS::kde2d copied and modified
# ===============================

library(MASS)

kde2d.weighted <- function (x, y, w, h, n = n, lims = c(range(x), range(y))) {   nx <- length(x)
  if (length(y) != nx)

      stop("data vectors must be the same length")   gx <- seq(lims[1], lims[2], length = n) # gridpoints x   gy <- seq(lims[3], lims[4], length = n) # gridpoints y   if (missing(h))
    h <- c(bandwidth.nrd(x), bandwidth.nrd(y));   if (missing(w))
    w <- numeric(nx)+1;
  h <- h/4
  ax <- outer(gx, x, "-")/h[1] # distance of each point to each grid point in x-direction   ay <- outer(gy, y, "-")/h[2] # distance of each point to each grid point in y-direction   z <- (matrix(rep(w,n), nrow=n, ncol=nx, byrow=TRUE)*matrix(dnorm(ax), n, nx)) %*% t(matrix(dnorm(ay), n, nx))/(sum(w) * h[1] * h[2]) # z is the density   return(list(x = gx, y = gy, z = z))
}

# Generate artificial data
# ========================

x <- runif(20000,-5,5)
y <- runif(20000,-5,5)
z <- dnorm(x, mean=0, sd=1)*dnorm(y, mean=0, sd=1)

# plot data
# =========

library(Rcmdr)
scatter3d(x=x,z=y,y=z,surface=FALSE,xlab="x",ylab="z",zlab="y",bg.col="black")

temp0 <- kde2d(x=x, y=y, n = 100, lims=c(range(x),range(y))) contour(x=temp0$x, y=temp0$y, z=temp0$z, xlab="x", ylab="y", main="z")

temp <- kde2d.weighted(x=x, y=y, w=z, n = 100, lims=c(range(x),range(y))) contour(x=temp$x, y=temp$y, z=temp$z, xlab="x", ylab="y", main="z", col="red", add=TRUE)

°°°
Christoph Ort
Eawag - aquatic research
Environmental Engineering
Überlandstrasse 133
CH-8600 Dübendorf

phone: +41-44-823-5041          
fax:   +41-44-823-5389
cell:  +41-79-218-9242
 

mailto:christoph.ort@eawag.ch

http://www.eawag.ch/~ortchris/

http://www.eawag.ch

°°°
Christoph Ort
Eawag - aquatic research
Environmental Engineering
Überlandstrasse 133
CH-8600 Dübendorf

phone: +41-44-823-5041          
fax:   +41-44-823-5389
cell:  +41-79-218-9242
 

mailto:christoph.ort@eawag.ch

http://www.eawag.ch/~ortchris/

http://www.eawag.ch



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Wed Jun 14 21:29:06 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sat 17 Jun 2006 - 16:11:16 EST.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.