Re: [R] Logistic regression goodness of fit tests

From: Frank E Harrell Jr <f.harrell_at_vanderbilt.edu>
Date: Fri 11 Mar 2005 - 09:19:41 EST

Trevor Wiens wrote:
> I was unsure which goodness-of-fit tests exist in R for logistic regression. After searching the R-help archive, I found that lrm and resid from the Design package can be used to calculate one, as follows:
>
> d <- datadist(mydataframe)
> options(datadist = 'd')
> fit <- lrm(response ~ predictor1 + predictor2..., data = mydataframe, x = TRUE, y = TRUE)
> resid(fit, 'gof')
>
> I set up a script that first uses glm to create models and then stepAIC to determine the optimal model. I used stepAIC instead of fastbw because the AIC values were completely different and the final models didn't always match. The script then takes the reduced model formula and refits it using lrm as above. The problem is that for some models I run into an error for which I can find no reference on the mailing list or on the web. It is as follows:
>
> test.lrm <- lrm(cclo ~ elev + aspect + cti_var + planar + feat_div + loamy + sands + sandy + wet + slr_mean, data=datamatrix, x = T, y = T)
> singular information matrix in lrm.fit (rank= 10 ). Offending variable(s):
> slr_mean
> Error in j:(j + params[i] - 1) : NA/NaN argument
>
>
> Now if I add the singularity criterion (tol) and make it smaller than the default of 1E-7, for example 1E-9 or 1E-12 (the default in calibrate), it works. Why is that?
>
> Not being a statistician but a biogeographer using regression as a tool, I don't really understand what is happening here.
>
> Does changing the tol argument change how I should interpret goodness-of-fit results or other evaluations of the models created?
>
> I've included a summary of the data below (in case it might be helpful) with all variables in the data frame as it was easier than selecting out the ones used in the model.
>
> Thanks in advance.
>
> T
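
A note on the tolerance question above: one common reason a smaller singularity criterion lets a fit go through is that a predictor is on a very different scale from the others, so it looks numerically redundant at the default tolerance even though it is not collinear with anything. The sketch below illustrates that mechanism by analogy only, using base R's qr() rather than the routine inside lrm.fit. The simulated variable x_raw and its offset of 1e9 are assumptions made for illustration, not a diagnosis of slr_mean.

```r
# Analogy only: base R's qr(), not the matrix inversion used by lrm.fit.
set.seed(42)
n <- 100
z <- rnorm(n)

# Hypothetical predictor: small variation around a very large offset.
x_raw <- 1e9 + z

# Design matrix with an intercept column and the badly scaled predictor.
X_raw <- cbind(1, x_raw)

# Numerical rank at qr()'s default tolerance (1e-07) and at a stricter one.
rank_default <- qr(X_raw)$rank
rank_small_tol <- qr(X_raw, tol = 1e-12)$rank

# Centering the predictor removes the scale problem at the default tolerance.
X_centered <- cbind(1, x_raw - mean(x_raw))
rank_centered <- qr(X_centered)$rank

print(c(default = rank_default,
        small_tol = rank_small_tol,
        centered = rank_centered))
```

If the same mechanism is at work in a real model, centering or rescaling the offending variable may be a safer remedy than shrinking tol, since it leaves the singularity check at its default strictness.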

The goodness of fit test only works on prespecified models. It is not valid when stepwise variable selection is used (unless perhaps you use alpha=0.5).
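
A minimal sketch of the prespecified case, assuming the rms package (the successor to Design) is installed. The simulated data frame dat and the variables x1, x2 and y are hypothetical. The model formula is fixed before looking at the data, and the test is run on that single fit with no variable selection in between.

```r
library(rms)  # assumption: rms, the successor to the Design package

# Hypothetical simulated data; not the poster's data.
set.seed(1)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(-0.5 + 0.8 * dat$x1 - 0.4 * dat$x2))

d <- datadist(dat)
options(datadist = "d")

# Prespecified model: the formula is chosen in advance, not by stepwise selection.
# x = TRUE and y = TRUE store the design matrix and response, which resid() needs.
fit <- lrm(y ~ x1 + x2, data = dat, x = TRUE, y = TRUE)

# Global goodness-of-fit test on the prespecified fit.
gof <- resid(fit, "gof")
print(gof)
```

The returned vector includes the test statistic Z and its p-value P.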

-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
Received on Fri Mar 11 13:41:48 2005