Re: [R] Fitting a distribution to peaks in histogram

From: Ulrik Stervbo <ulriks_at_ruc.dk>
Date: Thu 20 Jul 2006 - 19:08:27 EST

On 7/19/06, hadley wickham <h.wickham@gmail.com> wrote:
> > Can you be a bit more exact? I am a biologist and relatively new to R
>
> In that case, I would _strongly_ advise that you get advice from a
> local statistician.

I am afraid that, by comparison, I am the local statistician. I am also the local R guru, and neither is saying much, so please bear with me.

Do you know of some functions (built in hopefully) that I can try?

I did try the density estimate from the mclust package, but got an out-of-memory error. I did look at the ash package, but I am afraid I failed to see how I could use it.

At the moment, I estimate the density using density() from the stats package, identify the peaks in the density estimate with Petr's function, and can thus extract a very good suggestion for the mean and intensity of each peak. Surely that must be useful for something? Based on the literature, I also have a very good suggestion for an upper and lower bound on the width of the distribution.
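
To make the current approach concrete, here is a minimal sketch in base R. The data are simulated (an assumption, not real measurements), and find_peaks() is a hypothetical stand-in written for this example; it is not Petr's function, which is not reproduced here. The 10% height cut-off is likewise an arbitrary choice for illustration.

```r
# Simulated DNA-content values standing in for real measurements (assumption):
# a large peak near 200 and a smaller peak near 400.
set.seed(1)
dna <- c(rnorm(700, mean = 200, sd = 12), rnorm(300, mean = 400, sd = 20))

# Kernel density estimate from the stats package (default bandwidth).
d <- density(dna)

# Hypothetical helper: indices where the estimate is higher than both neighbours.
find_peaks <- function(y) {
  s <- diff(sign(diff(y)))   # -2 where the slope changes from positive to negative
  which(s == -2) + 1
}

idx <- find_peaks(d$y)
# Drop tiny bumps in the tails: keep peaks at least 10% as tall as the tallest.
idx <- idx[d$y[idx] >= 0.1 * max(d$y)]

# Position and height of each retained peak.
peaks <- data.frame(position = d$x[idx], height = d$y[idx])
print(peaks)
```

The positions and heights in peaks could then serve as starting values for a model fitted to each peak.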

>
> > I am measuring the amount of DNA in cells, and I need to know the
> > percentage of cells in a part of the cell cycle; that is, the percentage
> > of cells in the first peak, in the second peak and so on. I want to
> > integrate the area between the two peaks, because that apparently is
> > how it's done (as far as I can tell from the literature)
>
> That doesn't sound quite right to me, because you also need to take
> into account the fact that some cells between peak 1 and 2 belong to
> peak 1, and some to peak 2. This is something that will come out
> immediately from a mixture based approach. If you know that peaks
> correspond to certain parts of the cell cycle, then this is important
> information that should be included in the analysis.
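
As far as I understand it, a mixture-based approach could be sketched as below, assuming two Gaussian components and simulated data. This is a hand-written EM loop in base R, not the method of any particular package, and the starting values are made up for illustration (in practice they might come from the peaks of a density estimate).

```r
# Simulated DNA-content values (assumption, not real data).
set.seed(1)
dna <- c(rnorm(700, mean = 200, sd = 12), rnorm(300, mean = 400, sd = 20))

# Illustrative starting values for means, standard deviations and proportions.
mu    <- c(180, 420)
sigma <- c(30, 30)
p     <- c(0.5, 0.5)

for (iter in 1:200) {
  # E-step: posterior probability that each cell belongs to each component.
  dens <- cbind(p[1] * dnorm(dna, mu[1], sigma[1]),
                p[2] * dnorm(dna, mu[2], sigma[2]))
  resp <- dens / rowSums(dens)

  # M-step: update proportions, means and standard deviations.
  n_k   <- colSums(resp)
  p     <- n_k / length(dna)
  mu    <- colSums(resp * dna) / n_k
  dev   <- outer(dna, mu, "-")          # n x 2 matrix of deviations from each mean
  sigma <- sqrt(colSums(resp * dev^2) / n_k)
}

# Estimated percentage of cells in each peak.
print(round(100 * p, 1))
```

The fitted proportions in p estimate the percentage of cells in each peak; a cell lying between the peaks is shared between them according to its posterior probabilities in resp, rather than being assigned by a hard cut.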

I realise that some cells between two peaks belong to the peaks, but thought that this was a general problem, usually sacrificed for speed. One of the most widely used programs for cell cycle analysis uses a variant of my strategy as far as I can tell: fitting Gaussian distributions to the two peaks and integrating the part between. The reason why I am not using this program is that I cannot afford it, and it does a very poor job when analysing cells with abnormal amounts of DNA.

Ulrik

-- 
Blog: http://ulrikstervbo.blogspot.com

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Thu Jul 20 19:30:18 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.