[R] Logistic regression goodness of fit tests

From: Trevor Wiens <twiens_at_interbaun.com>
Date: Fri 11 Mar 2005 - 08:06:18 EST


I was unsure which goodness-of-fit tests exist in R for logistic regression. After searching the R-help archive I found that lrm models from the Design package, together with resid, can be used to calculate one as follows:

library(Design)
d <- datadist(mydataframe)
options(datadist = 'd')
fit <- lrm(response ~ predictor1 + predictor2..., data = mydataframe, x = TRUE, y = TRUE)
resid(fit, 'gof')

I set up a script that first uses glm to create models and then uses stepAIC to determine the optimal model. I used stepAIC instead of fastbw because I found the AIC values to be completely different and the final models didn't always match. The script then takes the reduced model formula and refits it using lrm as above. The problem is that for some models I run into an error to which I can find no reference whatsoever on the mailing list or on the web. It is as follows:
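
A minimal sketch of the workflow described above, using simulated data in place of the real data frame. The data frame sim and its columns (response, predictor1, predictor2, predictor3) are hypothetical stand-ins, not the actual variables. The final step assumes the rms package (the successor to Design) and is skipped if it is not installed.

```r
library(MASS)  # provides stepAIC

set.seed(42)
n <- 300

# Simulated stand-in for the real data frame; all names are hypothetical
sim <- data.frame(
  predictor1 = rnorm(n),
  predictor2 = rnorm(n),
  predictor3 = rnorm(n)
)
# Only predictor1 truly affects the response in this simulation
sim$response <- rbinom(n, 1, plogis(-0.5 + 1.2 * sim$predictor1))

# Step 1: fit the full logistic model with glm
full <- glm(response ~ predictor1 + predictor2 + predictor3,
            data = sim, family = binomial)

# Step 2: stepwise selection by AIC (backward by default when no scope is given)
reduced <- stepAIC(full, trace = FALSE)

# Step 3: extract the reduced model formula so it can be reused
reduced_formula <- formula(reduced)
print(reduced_formula)

# Step 4: refit with lrm and request the goodness-of-fit test,
# only if rms is available
if (requireNamespace("rms", quietly = TRUE)) {
  fit <- rms::lrm(reduced_formula, data = sim, x = TRUE, y = TRUE)
  print(resid(fit, "gof"))
}
```

Passing the formula object directly to lrm avoids retyping the selected terms, so the glm and lrm fits are guaranteed to use the same predictors.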

test.lrm <- lrm(cclo ~ elev + aspect + cti_var + planar + feat_div + loamy + sands + sandy + wet + slr_mean, data=datamatrix, x = T, y = T)
singular information matrix in lrm.fit (rank= 10 ). Offending variable(s): slr_mean
Error in j:(j + params[i] - 1) : NA/NaN argument

Now if I add the singularity criterion (tol) and set it below the default of 1E-7, to 1E-9 or to 1E-12 (the default in calibrate), it works. Why is that?

Not being a statistician but a biogeographer using regression as a tool, I don't really understand what is happening here.

Does changing the tol argument change how I should interpret goodness-of-fit results or other evaluations of the models created?
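
One possible contributor, offered as an assumption rather than a confirmed diagnosis: slr_mean has a large mean and a small spread (roughly 7681 to 7981 in the summary below), which can make a cross-product matrix badly conditioned next to the intercept column. The base-R sketch below uses a simulated predictor on that scale (x_raw, a hypothetical stand-in) and compares condition numbers before and after standardizing. It does not reproduce the internal check in lrm.fit.

```r
set.seed(1)
n <- 1000

# Hypothetical predictor on roughly the scale of slr_mean:
# large mean, comparatively small spread
x_raw <- rnorm(n, mean = 7871, sd = 40)

# The same predictor centered and scaled to unit variance
x_std <- as.numeric(scale(x_raw))

# Design matrices with an intercept column
X_raw <- cbind(1, x_raw)
X_std <- cbind(1, x_std)

# Condition numbers of the cross-product matrices X'X.
# A very large value means the matrix is numerically close to singular.
kappa_raw <- kappa(crossprod(X_raw), exact = TRUE)
kappa_std <- kappa(crossprod(X_std), exact = TRUE)

cat("condition number, raw predictor:         ", kappa_raw, "\n")
cat("condition number, standardized predictor:", kappa_std, "\n")
```

If scale is the issue, refitting with a centered or rescaled slr_mean would be an alternative to lowering tol; the fitted probabilities of a logistic model do not depend on a linear rescaling of a predictor.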

I've included a summary of the data below (in case it might be helpful) with all variables in the data frame as it was easier than selecting out the ones used in the model.

Thanks in advance.

T

-- 
Trevor Wiens 
twiens@interbaun.com

The significant problems that we face cannot be solved at the same 
level of thinking we were at when we created them. 
(Albert Einstein)

----------------------------
 summary(datamatrix)
     siteid         block         recordyear        cclo       
 564-125:   5   Min.   :1.000   Min.   :2000   Min.   :0.0000  
 564-130:   5   1st Qu.:2.000   1st Qu.:2001   1st Qu.:1.0000  
 564-135:   5   Median :3.000   Median :2002   Median :1.0000  
 564-140:   5   Mean   :3.042   Mean   :2002   Mean   :0.7509  
 564-145:   5   3rd Qu.:4.000   3rd Qu.:2003   3rd Qu.:1.0000  
 564-150:   5   Max.   :5.000   Max.   :2004   Max.   :1.0000  
 (Other):1098                                                  

      elev            slope            aspect          slr_mean   
 Min.   :0.0000   Min.   :0.1499   Min.   :0.0000   Min.   :7681  
 1st Qu.:0.0000   1st Qu.:0.5876   1st Qu.:0.0000   1st Qu.:7852  
 Median :1.0000   Median :0.9195   Median :0.0000   Median :7877  
 Mean   :0.6259   Mean   :1.2523   Mean   :0.2482   Mean   :7871  
 3rd Qu.:1.0000   3rd Qu.:1.6694   3rd Qu.:0.0000   3rd Qu.:7892  
 Max.   :1.0000   Max.   :5.3366   Max.   :1.0000   Max.   :7981  

       cti           cti_var           planar          feat_div    
 Min.   :7.157   Min.   :0.4497   Min.   :0.0000   Min.   :1.000  
 1st Qu.:7.651   1st Qu.:0.6187   1st Qu.:1.0000   1st Qu.:2.000  
 Median :7.720   Median :0.8495   Median :1.0000   Median :3.000  
 Mean   :7.763   Mean   :0.9542   Mean   :0.8254   Mean   :3.379  
 3rd Qu.:7.822   3rd Qu.:1.1918   3rd Qu.:1.0000   3rd Qu.:4.000  
 Max.   :8.769   Max.   :2.5615   Max.   :1.0000   Max.   :6.000  

   chop_san           loamy            sands            sandy       
 Min.   :0.00000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
 1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
 Median :0.00000   Median :0.0000   Median :0.0000   Median :0.0000  
 Mean   :0.05762   Mean   :0.3094   Mean   :0.3236   Mean   :0.1099  
 3rd Qu.:0.00000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:0.0000  
 Max.   :1.00000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
                                                                     
      wet          timesinceburn         ndvi             evi        
 Min.   :0.00000   Min.   :  1.00   Min.   :0.1140   Min.   :0.1041  
 1st Qu.:0.00000   1st Qu.:100.00   1st Qu.:0.2973   1st Qu.:0.1667  
 Median :0.00000   Median :100.00   Median :0.3342   Median :0.2027  
 Mean   :0.01950   Mean   : 87.84   Mean   :0.3629   Mean   :0.2184  
 3rd Qu.:0.00000   3rd Qu.:100.00   3rd Qu.:0.4463   3rd Qu.:0.2711  
 Max.   :1.00000   Max.   :100.00   Max.   :0.5932   Max.   :0.4788  
                                                                     
     msavi2              fc              gdd            precip      
 Min.   :0.09156   Min.   :0.1552   Min.   :380.6   Min.   : 50.04  
 1st Qu.:0.14936   1st Qu.:0.3246   1st Qu.:492.8   1st Qu.: 76.17  
 Median :0.18257   Median :0.4082   Median :500.8   Median : 85.50  
 Mean   :0.19653   Mean   :0.4398   Mean   :476.4   Mean   : 94.35  
 3rd Qu.:0.24626   3rd Qu.:0.5630   3rd Qu.:501.6   3rd Qu.: 95.16  
 Max.   :0.33258   Max.   :0.6996   Max.   :519.7   Max.   :163.86  
                                                                    
    precip_1        precip_2         slr_yr    
 Min.   :164.2   Min.   :164.2   Min.   :7417  
 1st Qu.:254.2   1st Qu.:254.2   1st Qu.:7704  
 Median :338.0   Median :357.1   Median :7775  
 Mean   :298.1   Mean   :301.5   Mean   :7828  
 3rd Qu.:357.1   3rd Qu.:360.5   3rd Qu.:8014  
 Max.   :414.2   Max.   :414.2   Max.   :8151

Received on Fri Mar 11 14:00:21 2005
