[R] trouble with step() and stepAIC() selecting the best model

From: natalia norden <norden_at_cict.fr>
Date: Fri 05 May 2006 - 19:14:18 EST


Hello,
I have some trouble using step() and stepAIC() functions. I'm predicting recruitment against several factors for different plant species using a negative binomial glm.

Sometimes, summary(step(model)) or summary(stepAIC(model) does not select the best model (lowest AIC) but just stops before. For some species, step() works and stepAIC don't and in others, it's the opposite.

The model is

  model=glm.nb(sdg.cens1 ~ seed.cens1 + log(in.lit+1) + log(trans.dir+1)+ log(trans.dif+1) + slope +log(pH+1) + log(CN+1) + site,

         data=data)

For example, for species A using stepAIC() I get:

Call:
glm.nb(formula = tot.rec ~ slope + log(pH + 1) + log(CN + 1),

     data = data, init.theta = 0.0575099635396228, link = log)

Deviance Residuals:

        Min 1Q Median 3Q Max -7.591e-01 -3.688e-01 -1.828e-01 -8.494e-08 1.520e+00

Coefficients:

               Estimate Std. Error   z value Pr(>|z|)

(Intercept) 110.729 47.643 2.324 0.02012 *
slope2 -34.202 972079.452 -3.52e-05 0.99997 slope3 -1.423 1.371 -1.038 0.29928 log(pH + 1) -51.244 19.544 -2.622 0.00874 ** log(CN + 1) -9.132 6.906 -1.322 0.18602
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


(Dispersion parameter for Negative Binomial(0.0575) family taken to be 1)
Null deviance: 34.068 on 156 degrees of freedom Residual deviance: 19.602 on 152 degrees of freedom AIC: 90.357 But using step() I get: Step: AIC= 138.4 tot.rec ~ log(mean.lit.rec + 1) + log(trans.dir + 1) + log(trans.dif + 1) + slope + log(pH + 1) + log(CN + 1) + site Call: glm.nb(formula = tot.rec ~ log(mean.lit.rec + 1) + log(trans.dir + 1) + log(trans.dif + 1) + slope + log(pH + 1) + log(CN + 1) + site, data = data, init.theta = 74070.0501196965, link = log) Deviance Residuals: Min 1Q Median 3Q Max -2.3110 -0.5450 -0.3179 0.0000 3.8755 Coefficients: Estimate Std. Error z value Pr(>|z|)
(Intercept) 6.809e+01 1.905e+01 3.574 0.000351 ***
log(mean.lit.rec + 1) -1.238e+00 1.083e+00 -1.143 0.252987 log(trans.dir + 1) 2.069e-01 1.227e+00 0.169 0.866120 log(trans.dif + 1) -6.978e-01 1.578e+00 -0.442 0.658327 slope2 -3.730e+01 1.022e+07 -3.65e-06 0.999997 slope3 -9.195e-01 6.269e-01 -1.467 0.142488 log(pH + 1) -3.152e+01 6.918e+00 -4.556 5.22e-06 *** log(CN + 1) -5.053e+00 3.404e+00 -1.485 0.137643 site1 1.379e-01 4.618e-01 0.299 0.765278 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for Negative Binomial(91071.54) family taken to be 1)
Null deviance: 145.586 on 156 degrees of freedom Residual deviance: 99.166 on 148 degrees of freedom AIC: 140.40 However, for species B I get the opposite, it works with step() but not with stepAIC(). Using step() AIC =414.28 and using stepAIC(), AIC=424.15 and the results are totally different. I would appreciate any help!!
Thank you Natalia Norden ______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Fri May 05 19:32:24 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Fri 05 May 2006 - 22:10:00 EST.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.