[R] MLE Estimation of Gamma Distribution Parameters for data with 'zeros'

From: Fox, Aaron <Afox_at_golder.com>
Date: Wed, 11 Jun 2008 11:22:48 -0700


Greetings, all

I am having difficulty getting the fitdistr() function to return without an error on my data. Specifically, what I'm trying to do is get a parameter estimation for fracture intensity data in a well / borehole. Lower bound is 0 (no fractures in the selected data interval), and upper bound is ~ 10 - 50, depending on what scale you are conducting the analysis on.

I read in the data from a text file, convert it to numerics, and then calculate initial estimates of the shape and scale parameters for the gamma distribution from moments. I then feed this back into the fitdistr() function.

R code (to this point):
########################################

data.raw=c(readLines("FSM_C_9m_ENE.inp"))
data.num <- as.numeric(data.raw)
data.num

library(MASS)
shape.mom = ((mean(data.num))/ (sd(data.num))^2 shape.mom
med.data = mean(data.num)
sd.data = sd(data.num)
med.data
sd.data
shape.mom = (med.data/sd.data)^2
shape.mom
scale.mom = (sd.data^2)/med.data
scale.mom

fitdistr(data.num,"gamma",list(shape=shape.mom, scale=scale.mom),lower=0)

fitdistr() returns the following error:

" Error in optim(x = c(0.402707037, 0.403483333, 0.404383704, 2.432626667, :
  L-BFGS-B needs finite values of 'fn'"

Next thing I tried was to manually specify the negative log-likelihood function and pass it straight to mle() (the method specified in Ricci's tutorial on fitting distributions with R). Basically, I got the same result as using fitdistr().

Finally I tried using some R code I found from someone with a similar problem back in 2003 from the archives of this mailing list:

R code
########################################
gamma.param1 <- shape.mom
gamma.param2 <- scale.mom
log.gamma.param1 <- log(gamma.param1)
log.gamma.param2 <- log(gamma.param2)

                                                       gammaLoglik <-
function(params,
negative=TRUE){

   lglk <- sum(dgamma(data, shape=exp(params[1]), scale=exp(params[2]), log=TRUE))

   if(negative)

      return(-lglk)
   else

      return(lglk)
}

optim.list <- optim(c(log.gamma.param1, log.gamma.param2), gammaLoglik)
gamma.param1 <- exp(optim.list$par[1])
gamma.param2 <- exp(optim.list$par[2])

#########################################

If I test this function using my sample data and the estimates of shape and scale derived from the method of moments, gammaLogLike returns as INF. I suspect the problem is that the zeros in the data are causing the optim solver problems when it attempts to minimize the negative log-likelihood function.

Can anyone suggest some advice on a work-around? I have seen suggestions online that a 'censoring' algorithm can allow one to use MLE methods to estimate the gamma distribution for data with zero values (Wilkes, 1990, Journal of Climate). I have not, however, found R code to implement this, and, frankly, am not smart enough to do it myself... :-)

Any suggestions? Has anyone else run up against this and written code to solve the problem?

Thanks in advance!

Aaron Fox
Senior Project Geologist, Golder Associates +1 425 882 5484 || +1 425 736 3958 (mobile) afox_at_golder.com || www.fracturedreservoirs.com



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Sat 14 Jun 2008 - 02:01:37 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sat 14 Jun 2008 - 04:30:41 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.

list of date sections of archive