Re: [R] Fitting a distribution to peaks in histogram

From: Petr Pikal <petr.pikal_at_precheza.cz>
Date: Wed 19 Jul 2006 - 22:54:51 EST


Hi

There are some packages for mass spectra processing (spectrino, caMassClass). I have not used them, so I do not know how well they suit your needs.

However, you can compute the area (integrate) with these functions:

# integ takes its limits interactively from plot(x,y).
# First it replots the data between corners with replot(x,y)
# (replot is not defined in this message), then it computes the
# area between the x axis and the y values (osum) and the area
# between a straight "baseline" and the y values (cista), based
# on the two locator positions.

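integ <- function(x, y)
{
  replot(x, y)
  meze <- locator(2)             # click the two integration limits
  dm <- min(meze$x)              # lower limit, whichever was clicked first
  hm <- max(meze$x)              # upper limit
  abline(v = c(dm, hm), col = 2)
  vyber <- x <= hm & x >= dm     # points inside the limits
  f3 <- splinefun(x, y)
  osum <- integrate(f3, dm, hm)$value    # area down to the x axis
  # trapezoid under the line joining the first and last selected points
  o1 <- (y[x == min(x[vyber])] + y[x == max(x[vyber])]) *
        (max(x[vyber]) - min(x[vyber])) / 2
  cista <- osum - o1             # area above the baseline
  return(c(osum, cista))
}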

# Similar to integ, but you have to supply the lower and upper limits
# (dm, hm) manually if you do not want to "integrate" the whole
# area under the curve.

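integ1 <- function(x, y, dm = -Inf, hm = +Inf)
{
  # fall back to the full range of x when a limit is not supplied
  if (dm == -Inf) dm <- min(x)
  if (hm == +Inf) hm <- max(x)
  vyber <- x <= hm & x >= dm     # points inside the limits
  f3 <- splinefun(x, y)
  osum <- integrate(f3, dm, hm)$value    # area down to the x axis
  # trapezoid under the line joining the first and last selected points
  o1 <- (y[x == min(x[vyber])] + y[x == max(x[vyber])]) *
        (max(x[vyber]) - min(x[vyber])) / 2
  cista <- osum - o1             # area above the baseline
  return(c(osum, cista))
}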
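
A small non-interactive check of the integ1 idea may help show what osum and cista mean. The data below are made up (a Gaussian peak of area 5 on a sloping baseline), and integ1_sketch is only a compact restatement of integ1 so that the example runs on its own; neither is part of the original functions.

```r
# Compact restatement of integ1 (hypothetical name, for illustration only)
integ1_sketch <- function(x, y, dm = min(x), hm = max(x)) {
  vyber <- x >= dm & x <= hm                 # points inside the limits
  f3 <- splinefun(x, y)                      # spline through the data
  osum <- integrate(f3, dm, hm)$value        # area down to the x axis
  xs <- range(x[vyber])                      # first and last selected x
  # trapezoid under the straight line joining the two end points
  o1 <- (y[x == xs[1]] + y[x == xs[2]]) * diff(xs) / 2
  c(osum = osum, cista = osum - o1)
}

# Made-up data: Gaussian peak with area 5 on the baseline 1 + 0.1 * x
x <- (0:200) / 20
y <- 1 + 0.1 * x + 5 * dnorm(x, mean = 5, sd = 0.5)

res <- integ1_sketch(x, y, dm = 3, hm = 7)
print(res)
```

Here the baseline alone contributes an area of 6 between x = 3 and x = 7, so osum should come out near 11 and cista near the peak area of 5. cista falls slightly short because the peak tails are not exactly zero at the chosen limits.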

On 19 Jul 2006 at 11:58, Ulrik Stervbo wrote:

Date sent:      	Wed, 19 Jul 2006 11:58:38 +0200
From:           	"Ulrik Stervbo" <ulriks@ruc.dk>
To:             	r-help@stat.math.ethz.ch
Subject:        	[R] Fitting a distribution to peaks in histogram

> Hello list!
>
> I would like to fit a distribution to each of the peaks in a
> histogram, such as this:
> http://photos1.blogger.com/blogger/7029/2724/1600/DU145-Bax3-Bcl-xL.png .
>
> The peaks are identified using Petr Pikal's peaks function (
> http://finzi.psych.upenn.edu/R/Rhelp02a/archive/33097.html), but after
> that I am quite stuck.
>
> Any idea as to how I can:
> - Fit a distribution to each peak
> - Integrate the area between each two peaks, using the means and
>   widths of the distributions fitted to the two peaks. I will be
>   using the integrate function
>
> The histogram is based on approximately 15000 events, which makes
> Mclust and pam (which both deliver the information I need) less
> useful.
>
> The whole point of this exercise is to find the percentage of cells in
> peak 1, 2, 3, and so on, and between peak 1-2, peak 2-3, peak 3-4 and
> so on. Having more than 6 peaks does not appear likely.
>
> I am quite new to R and apologise if the solution is fairly basic.
>
> Thank you in advance for any help and suggestions
>
> Sincerely,
> Ulrik
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html and provide commented,
> minimal, self-contained, reproducible code.
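
For the first step asked about above (fitting a distribution to one peak and getting the share of events in it), one possible approach is a least-squares Gaussian fit to the histogram counts with nls. This is only a sketch under assumptions: the peak is taken to be roughly Gaussian, the window limits are chosen by hand, the events are simulated, and fit_peak is a hypothetical helper name.

```r
set.seed(1)
# Simulated events: two populations standing in for two histogram peaks
events <- c(rnorm(6000, mean = 100, sd = 5), rnorm(4000, mean = 200, sd = 8))

h <- hist(events, breaks = 100, plot = FALSE)
bw <- diff(h$breaks)[1]                      # bin width (bins are equally spaced)

# Fit A * exp(-(x - mu)^2 / (2 * s^2)) to the counts inside [lo, hi]
fit_peak <- function(h, lo, hi) {
  d <- data.frame(x = h$mids, n = h$counts)
  d <- d[d$x >= lo & d$x <= hi, ]
  m0 <- weighted.mean(d$x, d$n)                    # crude start for the mean
  s0 <- sqrt(weighted.mean((d$x - m0)^2, d$n))     # crude start for the width
  fit <- nls(n ~ A * exp(-(x - mu)^2 / (2 * s^2)), data = d,
             start = list(A = max(d$n), mu = m0, s = s0))
  coef(fit)
}

p1 <- fit_peak(h, lo = 70, hi = 130)         # window around the first peak

# Area of the fitted curve in count units, converted to a share of all events
share1 <- p1[["A"]] * abs(p1[["s"]]) * sqrt(2 * pi) / bw / length(events)
print(p1)
print(share1)
```

The fitted mu and s could then serve as limits for integrate() between neighbouring peaks, for example from mu1 + 2 * s1 to mu2 - 2 * s2; the factor 2 is an arbitrary choice here.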

Petr Pikal
petr.pikal@precheza.cz



Received on Wed Jul 19 22:58:55 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Thu 20 Jul 2006 - 00:17:58 EST.
