Re: [R] problems with errors in randomization tests

From: CR Bleay, School Biological Sciences
Date: Sat 03 Jul 2004 - 02:53:11 EST

I have been having problems with a randomization test.

Essentially, the goal is to take an original dataset and create a new dataset with a pre-specified number of data points removed at random. I then fit a glm.nb model to the new dataset and store the coefficients, and the statistics from an anova table of the model, in a number of arrays. This process is repeated a number of times (say 1000) so that I can compute descriptive statistics and so look at the power of the original model as a function of sample size.

The section of code I am having problems with is:
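The overall procedure described above can be sketched roughly as follows. This is only an illustration on simulated data: the data frame `dat`, the formula `y ~ x`, the sample sizes and the names `removed`, `reduced` and `slopes` are hypothetical stand-ins, not the real dataset or model.

```r
library(MASS)  # glm.nb() lives in the MASS package

set.seed(1)

# Hypothetical stand-in for the original dataset: one predictor, one count response
n <- 200
dat <- data.frame(x = runif(n))
dat$y <- rnbinom(n, mu = exp(0.5 + 1.0 * dat$x), size = 2)

repetitions <- 20   # the text suggests about 1000; kept small here
no <- 50            # number of data points to remove each time

slopes <- rep(NA_real_, repetitions)   # storage for one coefficient per repetition

for (countn in seq_len(repetitions)) {
  removed <- sample(nrow(dat), no)        # rows to drop, chosen at random
  reduced <- dat[-removed, ]              # new dataset without those rows
  fit <- glm.nb(y ~ x, data = reduced)    # refit the model on the reduced data
  slopes[countn] <- coef(fit)["x"]        # keep the slope estimate
}

summary(slopes)   # descriptive statistics across repetitions
```

Changing `no` and rerunning gives the spread of the estimate at a different sample size.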

while (countn<repetitions) {

    shit<-unique(sample(x, no)) # randomly selects the data points to be removed
    density.random2<-density.random1[-shit,] # creates new dataset
), data=density.random2, na.action=na.omit, control = glm.control(maxit=100)) #performs model
    random.anova<-anova.glm(random.model, test="Chisq")

The problem is that every so often a dataset is created that generates the following error, which stops the function at the glm.nb call:

Error: NA/NaN/Inf in foreign function call (arg 1)
In addition: Warning message: Step size truncated due to divergence

I have a number of questions about this.

1/ How can I prevent the error from exiting the function? I have tried "try", and this does not resolve the issue; if I place it at the glm.nb call, it results in the following error:

Step size truncated due to divergence
Error in "[<-"(`*tmp*`, countn, value = random.coefficients[1]) :
        incompatible types

Is it possible to create an "if" step, i.e. if there is an error, ignore it and do not assign data to the arrays, otherwise continue?
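A minimal sketch of that kind of "if" step is given below, assuming the fit is wrapped in `try()` and its result is checked with `inherits(..., "try-error")` before anything is stored. The function `fit_model` is a hypothetical stand-in for the glm.nb call; it fails on purpose on every third repetition so that the skip branch is exercised. The names `coefs` and `failed` are also made up for the illustration.

```r
repetitions <- 10
coefs  <- rep(NA_real_, repetitions)  # NA marks a repetition whose fit failed
failed <- logical(repetitions)        # TRUE where the fit raised an error

# Hypothetical stand-in for the model fit (not the real glm.nb call):
# it deliberately errors on every third repetition.
fit_model <- function(i) {
  if (i %% 3 == 0) stop("NA/NaN/Inf in foreign function call (arg 1)")
  c(intercept = i / 10)
}

for (countn in seq_len(repetitions)) {
  res <- try(fit_model(countn), silent = TRUE)  # an error no longer stops the loop
  if (inherits(res, "try-error")) {
    failed[countn] <- TRUE   # note the failure and store nothing
    next
  }
  coefs[countn] <- res[1]    # only reached when the fit succeeded
}

sum(failed)   # how many repetitions were skipped
```

When the wrapped call fails, `try()` returns an object of class "try-error" rather than a fitted model, so the check has to come before any coefficient is extracted. Keeping the `failed` vector also makes it possible to report how often datasets were dropped.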

2/ Given that a dataset that generates this error is still a valid dataset, what should I do about the coefficients etc. that are generated? Ignoring those datasets would amount to selecting on my results.

3/ What is the actual cause of the error in the first place, with respect to the data and the model?

Any assistance would be very much appreciated.

I have searched through the archives and could not find a solution. I have to admit that I do not adequately understand error capture and handling in R, and have been unable to find any documentation that gives a good explanation of it.



Dr Colin Bleay
Dept. Biological Sciences,
University of Bristol,
Woodlands rd.,
BS8 1UG.

Tel: 44 (0)117 928 7470
Fax: 44 (0)117

Received on Sat Jul 03 02:56:20 2004
