Re: [R] mixture distribution analisys

From: Rubén Roa-Ureta <rroa_at_udec.cl>
Date: Thu, 10 Apr 2008 19:53:21 -0400

Antonio Olinto wrote:
> Hello,
>
> I'm analyzing some fish length-frequency data to access relative age and growth
> information and I would like to do something different from FiSAT / ELEFAN analysis.
>
> I found a package named MIX in http://www.math.mcmaster.ca/peter/mix/mix.html
> but it's compiled for R 2.0.0
>
> So I have two questions:
> Is there any problem to run this package in R 2.7.0? - probably yes …
> And, is there any other package in R that can be used to analyze and to separate
> mixture distributions?
>
> Thanks for any help. Best regards,
>
> Antonio Olinto
> Sao Paulo Fisheries Institute
> Brazil

This problem is not too difficult. Maybe you can write your own code. Say, you enter the number of fish you measured, n, and a two-column dataframe with the mid point of the length class in the first column (call it l) and the observed relative frequency of each length class in the second column (call it obs_prob). Then using the multinomial likelihood for the number of fish in each length class as the random variable (the approach in Peter Macdonald's Mix), minimize log_likelihood=-n*sum(elementwise product(obs_prob,log(pred_prob))) where

pred_prob=(p1*0.3989423/s1)*exp(-0.5*square((l-m1)/s1))
     +(p2*0.3989423/s2)*exp(-0.5*square((l-m2)/s2))
     +(p3*0.3989423/s3)*exp(-0.5*square((l-m3)/s3))
where s1, s2, s3, m1, m2, m3, p1 and p2 (p3=1-p1+p2) are the parameters. Do you know your number of components (cohorts) in the mixture? In the model above it is three. If you don't know it, try several models with different number of components and the AIC may let you know which one is the best working model. Don't use the likelihood ratio test as one of the p parameters is on the boundary of parameter space if the null is true.
Also, you should bound the parameters (m1<m2<m3, 0<p1<1, 0<p2<1, and establish the condition p3=1-p1+p2.
I wrote ADMB code to do this if you want it. You can translate it to R. Below you can find some example code to do the graphics. R.

cont <-
c(4,15,32,44,62,69,61,99,114,119,106,108,89,88,95,99,91,84,137,190,224,297,368,484,566,629,446,349,377,405,438,367,215,115,36,10,4,1,0,0,1) tot <- sum(cont)
rel.freq <- rep(cont,each=10)/tot/10
lect <- seq(9.1,50,0.1)

prop1 <- 0.0219271
prop2 <- 0.121317
prop3 <- 0.0328074
prop4 <- 0.315534
prop5 <- 0.203071
prop6 <- 0.3053439
sigma1 <- 1.50760
sigma2 <- 2.82352
sigma3 <- 1.40698
sigma4 <- 3.00063
sigma5 <- 1.41955
sigma6 <- 1.99940
media1 <- 13.4148
media2 <- 19.2206
media3 <- 24.5748
media4 <- 31.9498

media5 <- 34.6998
media6 <- 39.7016
prot1<-(1/10)*(prop1*(1/sqrt(2*pi))/sigma1)*exp(-0.5*((lect-(media1-0.5))/sigma1)^2)
prot2<-(1/10)*(prop2*(1/sqrt(2*pi))/sigma2)*exp(-0.5*((lect-(media2-0.5))/sigma2)^2)
prot3<-(1/10)*(prop3*(1/sqrt(2*pi))/sigma3)*exp(-0.5*((lect-(media3-0.5))/sigma3)^2)
prot4<-(1/10)*(prop4*(1/sqrt(2*pi))/sigma4)*exp(-0.5*((lect-(media4-0.5))/sigma4)^2)
prot5<-(1/10)*(prop5*(1/sqrt(2*pi))/sigma5)*exp(-0.5*((lect-(media5-0.5))/sigma5)^2)
prot6<-(1/10)*(prop6*(1/sqrt(2*pi))/sigma6)*exp(-0.5*((lect-(media6-0.5))/sigma6)^2)
prot.tot<-prot1+prot2+prot3+prot4+prot5+prot6 plot(lect,rel.freq,type="l",lwd=3,xlab="",ylab="",ylim=c(0,0.01))
lines(lect,prot1)
lines(lect,prot2)
lines(lect,prot3)
lines(lect,prot4)
lines(lect,prot5)
lines(lect,prot6)
lines(lect,prot.tot,lwd=2)

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Thu 10 Apr 2008 - 23:59:12 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 11 Apr 2008 - 00:30:27 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.

list of date sections of archive