# Re: [R] How to put factor variables in an nls formula ?

From: Douglas Bates <dmbates_at_gmail.com>
Date: Sat 20 Aug 2005 - 09:04:12 EST

On 8/19/05, Douglas Bates <dmbates@gmail.com> wrote:
> On 8/19/05, François Morneau <francois.morneau@cirad.fr> wrote:
> > Le 11:12 19/08/2005,Douglas Bates écrit:
> > >On 8/18/05, François Morneau <francois.morneau@cirad.fr> wrote:
> > > > Hello,
> > > >
> > > > I want to fit a Gompertz model for tree diameter growth that depends on a 4
> > > > levels edaphic factor ('Drain') and I don't manage to introduce the factor
> > > > variable in the formula.
> > > > Dinc is the annual diameter increment and D is the Diameter.
> > > >
> > > > >treestab
> > > > > Dinc D Drain
> > > > [1,] 0.03 26.10 2
> > > > [2,] 0.04 13.05 1
> > > > [3,] 0.00 24.83 1
> > > > [4,] 0.00 15.92 4
> > > > [5,] 0.00 12.25 4
> > > > [6,] 0.00 11.78 4
> > > > [7,] 0.00 16.87 4
> > > > [8,] 0.00 15.12 4
> > > > [9,] -0.01 13.53 4
> > > > [10,] 0.04 16.55 3
> > > > [11,] 0.025 16.07 3
> > > > [12,] 0.00 30.24 3
> > > > [13,] 0.06 15.28 2
> > > > etc…
> > > > >contrasts(Drain)<-contr.sum(4)
> > > > >mymodel<-nls(Dinc~r*(1+Drain)*D*log(Asym/D), data=treestab,
> > > start=list(r=0.05,
> > > > Asym=40))
> > > >
> > > > Error in numericDeriv(form[[3]], names(ind), env) :
> > > > Missing value or an infinity produced when evaluating the model
> > > > In addition: Warning messages:
> > > > 1: + not meaningful for factors in: Ops.factor(1, Drain)
> > > > 2: + not meaningful for factors in: Ops.factor(1, Drain)
> > >
> > >It is not clear to me what you are trying to do. Can you give us a
> > >bit more information such as how many parameter estimates you expect
> > >to obtain and what they would represent?
> >
> > I'm trying to estimate the effect of the edaphic factor 'Drain' on tree
> > diameter growth. The relationship between diameter 'D' and diameter
> > increment 'Dinc' follow a Gompertz model written : Dinc= r D log(Asym/D)
> > where 'r' is a parameter related to growth speed and 'Asym' is the diameter
> > max where growth stops.
> >
> > My hypothesis is that 'Asym' doesn't change according to edaphic condition
> > but 'r' does. So I'd like to estimate 'Asym' and four 'r'i parameters for
> > each level i of 'Drain' with 'r'i= r0 + effect of 'Drain'i
> >
> > I hope this is helpful...
>
> Yes, it is.

Sorry for the non-informative contribution - I hit the "Send" button when I meant to hit "Cancel".

This is less easy than I thought it would be. The Puromycin data provides a similar example in that there are two sets of observations of concentrations and rates, for treated and untreated cells. The experimenters assumption is that the rate should be related to the concentration via the Michaelis-Menten relationship that depends on two parameters, Asym and K, and that K should not change with treatment but Asym will.

To fit that we can use
> nls(rate ~ (Asym + delta*(state == 'treated'))*conc/(K + conc), Puromycin, start = c(Asym = 180, delta = 30, K = 0.05))
Nonlinear regression model
model: rate ~ (Asym + delta * (state == "treated")) * conc/(K + conc)    data: Puromycin

Asym delta K
166.60407167 42.02596094 0.05797178
residual sum-of-squares: 2240.891

To generalize this it is convenient to form the model matrix with the indicator columns of the state factor.

> mm <- model.matrix(~ 0 + state, Puromycin)

then use matrix multiplication within a call to drop().

> nls(rate ~ drop(mm %*% c(a1,a2)) * conc/(K + conc), Puromycin, start = c(a1 = 210, a2 = 165, K = 0.05), trace = TRUE)

```2822.528 :  210.00 165.00   0.05
2247.841 :  207.77558541 165.94256936   0.05647425
2240.997 :  208.49340568 166.51190757   0.05778192
2240.893 :  208.61339576 166.59293688   0.05794926
2240.891 :  208.62809965 166.60277853   0.05796917
2240.891 :  208.62983812 166.60394147   0.05797152
2240.891 :  208.6300430 166.6040785   0.0579718
```
Nonlinear regression model
model: rate ~ drop(mm %*% c(a1, a2)) * conc/(K + conc)    data: Puromycin

a1 a2 K
208.6300430 166.6040785 0.0579718
residual sum-of-squares: 2240.891

You can do the same thing defining the model matrix of indicators of Drain and a vector of length 4 for D. (By the way, I would avoid the name 'D' for a variable as that is the name of an in-built function in R.)

The reason that I said this is more difficult than I thought it would be is because I thought that the parameters in nls could be vectors but I haven't been able to get that to work.

An alternative is to use the gnls function from the nlme package which does allow a parameter to be specified according to the levels of a factor.

> >
> > > >
> > > > Do I need to use another function instead of nls to correctly include the
> > > > factor 'Drain' ?
> > > >
> > > > Thanks,
> > > >
> > > > François
> >
> > François Morneau
> > UMR Écologie des Forêts de Guyane
> > Kourou, French Guiana
> >
> >
>

R-help@stat.math.ethz.ch mailing list