# Re: [R] fitting a lognormal distribution using cumulative probabilities

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Fri, 22 Feb 2008 17:35:29 +0000 (GMT)

On Sat, 23 Feb 2008, ahimsa campos-arceiz wrote:

> Dear all,
>
> I'm trying to estimate the parameters of a lognormal distribution fitted
> from some data.
>
> The tricky thing is that my data represent the time at which I recorded
> certain events. However, in many cases I don't really know when the event
> happened. I' only know the time at which I recorded it as already happened.

So this is a rather extreme form of censoring.

> Therefore I want to fit the lognormal from the cumulative distribution
> function (cdf) rather than from the probability distribution function (pdf).
>
> My understanding is that methods based on Maximum Likelihood (e.g. fitdistr
> {MASS}) are based on the pdf. Nonlinear least-squares methods seem to be
> based on the cdf... however I was unable to use nls{stat} for lognormal.

Not so: ML fitting can be done for censored data. However, I don't think you have a valid description here: it seems you never recorded a time at which the event had not happened, and the most likely fit is a probability mass at zero (since this is a perfect explanation for your data).

To make any progress with censoring, you need to see both positive and negative events. If you told us that none of these events happened before t=15, it would be possible to fit the model (although you would need far more data to get a good fit).

Generally code to handle censoring is in survival analysis: e.g. survreg() in package survival. In the terminiology of the latter, all your observations are left-censored.

> I found a website that explains how to fit univariate distribution functions
> based on cumulative probabilities, including a lognormal example, in Matlab:
> http://www.mathworks.com/products/statistics/demos.html?file=/products/demos/shipping/stats/cdffitdemo.html
>
> and other programs like TableCurve 2D seem to do this too.

Maybe, but that is a different problem. If you have an ECDF, the jumps give you the data so you can just use fitdistr(). (And you will see comparing observed and fitted CDFs in MASS, the book.)

> There must be a straightforward method in R which I have overlooked. Any
> suggestion on how can I estimate these parameters in R or helpful references
> are very much appreciated.
>
> (not sure if it helps but) here is an example of my type of data:
>
> treat.1 <- c(21.67, 21.67, 43.38, 35.50, 32.08, 32.08, 21.67, 21.67, 41.33,
> 41.33, 41.33, 32.08, 21.67, 22.48, 23.25, 30.00, 26.00, 19.37, 26.00
> ,
> 32.08, 21.67, 26.00, 26.00, 43.38, 26.00, 21.67, 22.48, 35.50, 38.30,
>
> 32.08)
>
> treat.2 <- c(35.92, 12.08, 12.08, 30.00, 33.cy73, 35.92, 12.08, 30.00,
56.00,
> 30.00, 35.92, 33.73, 12.08, 26.00, 54.00, 12.08, 12.08, 35.92, 35.92
> ,
> 12.08, 33.73, 35.92, 63.20, 30.00, 26.00, 33.73, 23.50, 30.00, 35.92
> ,
> 30.00)
>
> Thank you very much!
>
> Ahimsa
>
>
> --
> ahimsa campos-arceiz
> www.camposarceiz.com
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> and provide commented, minimal, self-contained, reproducible code.
>

```--
Brian D. Ripley,                  ripley_at_stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help