# [R] Antwort: Re: an alternative to R for nonlinear stat models

From: <benedikt.gehr_at_ieu.uzh.ch>
Date: Wed, 16 Jun 2010 20:33:34 +0200

Hi

I think my first answer doesn't seem to have gone through due to the    attachement. So I copy pasted my last answer including the code below.

Hi Paul

Oh ok sorry, I send you below a copy of the R code I was using. It's very    possible that my programming is slow as well. I'm very happy to learn.

Sorry the code is a bit long, but its structured like this and cotains three    main functions and the optimization:

1. Input - (I didn't send the data)
2. Function to calculate the cell probabilities for the multinomial likelihood
3. Function to calculate the multinomial

4.Function to be optimized

5. then I set the starting values

6. optimization using mle2

hope that helps

cheers

Beni

###########################################################################
###########################################################################

#library(boot)
#library(bbmle)
#library(demogR)

##Simulated harvest numbers of a cohort harvested over 30 years (including
random noise)

```   pop<-N.true       # True population size
cohort<-C       # harvest matrix
```

R<-nrow(cohort)
L<-ncol(cohort)

h<-matrix(rep(Q,R),ncol=a,byrow=T) # estimated harvest rates    S<-matrix(rep(P[-1],R),ncol=a-1,byrow=T) # survival rates

##############################################
##Function to calculate the cell probabilities
##############################################

predprob <- function(S=S,h=h) {

```     R<-nrow(cohort)
L<-ncol(cohort)
p <- matrix(rep(0,(R)*(L)),ncol=L)  # matrix of cell probabilities
p[R,1]<-h[R,1]
p[1,L]<-h[1,L]
ay<-rep(0,L+R+1)     #   vector  of  last  cell  probabilities  ->
1-sum(rest));Gove(2002)
ay[L+R]<-1-h[1,L]
ay<-1-h[R,1]
index<-3       #   ay=Null,   ay=1-h[R,1],ay[L+R]=1-h[1,L],
```
ay[L+R+1]=Null

for (i in -(R-2):(L-2)){ # Calculating the cell prob after Gove (2002)

p[row(p) == col(p) - i] <-
odiag(h,i)*c(1,cumprod((1-odiag(h[1:(R-1),1:(L-1)],i))

*odiag(S[1:(R-1),],i)))

ay[index]<-1-sum(odiag(p[1:R,],i))

index<-index+1
}

p<-rbind(p,ay[1:L])
p<-cbind(p,ay[length(ay):(L+1)])
return(p)

}

predprob(S,h) #test the function

####################################
##Function to calculate the multinomial ->from Ben Bolker
####################################

## multinom WITHOUT ROUNDING & without error-checking
dmnom <- function(x,prob,log=FALSE) {

```     r <- lgamma(sum(x) + 1) + sum(x * log(prob) - lgamma(x + 1))
if (log) r else exp(r)
```

}

###########################
## function to be optimized
###########################

mfun <- function(logN,S=S,h=h,cohort=cohort) {

```     L<-ncol(cohort)
R<-nrow(cohort)
d<-matrix(rep(NA,(R+1)*(L+1)),ncol=L+1)  # last row and last col of the
matrix
d[1:R,1:L]<-log(cohort)    #  are  the  values to be optimized->Log
```
tranformed data!
```     index<-1
index1<-(R+L)

```

for (i in -(R-1):(L-1)){ # values for the top row    if (i<= -(R-L) ){
d[(R+1),(index+1)]<-log(exp(logN[index])-sum(odiag(cohort,i)))} else    if (i> -(R-L)& i<1 ){
d[index1,(L+1)]<-log(exp(logN[index])-sum(odiag(cohort,i)))} else    if (i>0){
d[index1,(L+1)]<-log(exp(logN[index])-sum(odiag(cohort,i)))}

index<-index+1
index1<-index1-1
}

pp<-predprob(S,h)

lhood<-0 # The Likelihood function -> to be maximized

for (i in -(R-1):(L-1)){
lhood<-lhood+(-dmnom(exp(odiag(d,i)),prob=odiag(pp,i),log=TRUE))    }
return(lhood)
}

logN<-log(rep(1000,(R+L-1))) # Test the function with test-values    mfun(logN,S,h,cohort)

############################################################################
##############
############################################################################
##############

##########
##Starting values for the optimization
##########

```   N<-rep(500,(R+L-1))     # optimization,->N-x1-x2-x3
svec = log(N)       # transform to avoid constraints (x>0)
```
names(svec) <- paste("logN",1:(R+L-1),sep="")    parnames(mfun) <- names(svec)

##########
##Lower bounds for the L-BFGS-B-method
##########

index<-1
lower<-rep(0,(R+L-1))

for (i in -(R-1):(L-1)){
lower[index]<-log(sum(odiag(cohort,i)))    index<-index+1
}

lower<-lower+0.001
names(lower)<-names(svec)
exp(lower) # check lower bounds
#########################
##Optimization with mle2
#########################
## use bounded optimization

m1 = mle2(mfun,start=svec,
method="L-BFGS-B",lower=lower,upper=10, # lower=must be larger than    "sum(odiag(x))"<-if smaller NANs are produced

```     vecpar=TRUE,
data=list(S=S,h=h,cohort=cohort))

```

coef(m1)
summary(m1)
Nest<-exp(coef(m1)) # estimated cohort sizes for row and colum 1

-----Paul Hiemstra <p.hiemstra_at_geo.uu.nl> schrieb: -----

```     An: benedikt.gehr_at_ieu.uzh.ch
Von: Paul Hiemstra <p.hiemstra_at_geo.uu.nl>
Datum: 16.06.2010 09:36
Kopie: r-help_at_r-project.org, davef_at_otter-rsch.com
Betreff: Re: [R] an alternative to R for nonlinear stat models
On 06/16/2010 07:35 AM, benedikt.gehr_at_ieu.uzh.ch wrote:
>     Hi
>     I implemented the age-structure model in Gove et al (2002) in R,
which is a
>     nonlinear statistical model. However running the model in R was very
slow.
>     So Dave Fournier suggested to use the AD Model Builder Software
package and
>     helped me implement the model there.
>     ADMB was incredibly fast in running the model:
>     While running the model in R took 5-10 minutes, depending on the
settings,
>     in ADMB it took 1-2 seconds!
>     I'm reporting this so that people who have performance issues with
nonlinear
>       statistical models in R will know that there is a good free
alternative for
>     more difficult problems.
>     There  is also a help platfrom equivalent to the one for R, and
people
>     running it are extremley helpful.
>     I hope this might help someone
>     cheers
>     Beni
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
Hi Beni,
Thanks for posting information that might be useful for people on the
list. The only thing is that without a little more detail on how exactly
you implemented things in R, we are left to guess if the performance
issues are a problem of R, or that your particular implementation was
the problem. There are was of implementing R code in two ways, where the
first takes minutes and the second 1-2 seconds. Furthermore, you are
giving us no option to defend R ;).
cheers,
Paul
--
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone:  +3130 274 3113 Mon-Tue
Phone:  +3130 253 5773 Wed-Fri
http://intamap.geo.uu.nl/~paul

```

References

1. https://stat.ethz.ch/mailman/listinfo/r-help
2. http://www.r-project.org/posting-guide.html
3. http://intamap.geo.uu.nl/~paul