From: Petr PIKAL <petr.pikal_at_precheza.cz>

Date: Mon, 23 Jun 2008 16:34:29 +0200

R-help_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Mon 23 Jun 2008 - 15:02:24 GMT

Date: Mon, 23 Jun 2008 16:34:29 +0200

Hi

Petr

petr.pikal_at_precheza.cz

724008364, 581252140, 581252257

r-help-bounces_at_r-project.org napsal dne 20.06.2008 20:55:08:

> Greetings,

*>
**> I have stumbled across some unexpected behavior (potential a bug) in,
*

what I

> suspect to be R's (2.6.2 on Ubuntu Linux) t.test function; then again

the

> problem may exist in my code. I have shutdown R and started it back up,

*> re-run the code and re-experienced the error. I have searched on Google
*

for

> the abnormal termination error message "(stderr < 10 *

.Machine$double.eps *

> max(abs(mx), abs(my))) stop("data are essentially constant")" but only

found

> one instance, http://tolstoy.newcastle.edu.au/R/e2/help/07/06/18179.html

,

> but the discussion there did not seem particularly helpful.

*>
**> I've included all of my code, amateurish though it may be. I have not
**> isolated the faulty part, and to me it all looks pretty simple, so I'm
*

not

> sure where I'm going wrong. For background, the goal of this code is to

run

> a simulation to explore the problem space of inflation of Type I error

when

> decisions to run or not to run more participants are made by preliminary

*> looks at the data (as in Wagenmakers, 2007). This code is meant to
*

examine

> the problem space given that there is no true difference between the

groups

> (as is the case when both a generated from random draws from the normal

*> distribution). I run an initial number of subjects in two groups (t1N)
*

then,

> if p is < .25 on the t-test I add t2N more subjects to each group. Then

I

> perform the t.test again. If the p was > .25 at time 1 I stop. Plainp is

*> simply storing the p-values from t2 (if it was performed) or from t1 (if
*

t2

> was not performed). In the code I provide t1 starts at 16 since this is

*> about when the problem becomes more frequent. Please note that it takes
**> quite a long while to fail, and depending on what the true cause is it
*

may

> not fail at all. On my system it is failing before t1N advances to 17.

*>
**> Any suggestions as to how to avoid the error and instructions as to the
**> cause of it would be appreciated. Thank you for your input and patience.
**>
**> logit <- function(p)
**>
**> {
**>
**> # compute and return logit of p;
**>
**> # if p=.5 then logit==0 else sign(logit)==(p>.5)
**>
**> return( log(p/(1-p)) )
**>
**> }
**>
**>
**> antilogit <- function(x)
**>
**> {
**>
**> # compute and return antilogit of x;
**>
**> # this returns a proportion p for which logit(p)==x;
**>
**> return( exp(x)/(1+exp(x)) )
**>
**> }
**>
**>
**> plainp <- c() #Clear the plainp value
**>
**> t1Nsim <- (100/5) * 1000 * 10 # random chance should provide 10000 cases
*

at

*> t1
**>
*

> contthreshold <- .25 #p value below which we run more subjects

*>
**> t1pvals <- rep(NA,t1Nsim) #clear the pvalues
**>
**> t2pvals <- rep(NA,t1Nsim) #clear the pvalues
**>
**> t1N <- 10 #for debugging
**>
**> t2N <- 5 #for debugging
**>
**>
**> for (t1N in 16:50) #Outer loop testing possible values for t1N
**>
**> for (t2N in 1:50) #Inner loop testing possible values for t2N
**>
**> {
**>
**> print(paste("Checking with ",t1N," initial samples and ",t2N," extra
**> samples",sep="")) #feedback
**>
**> for (lcv in 1:t1Nsim) #Run simulation t1Nsim times...
**>
**> {
**>
**> if (lcv %% 20000 == 0) {print(paste((lcv/t1Nsim)*100,"%",sep=""))}
*

#feedback

*>
*

> Cgroup <- rnorm(t1N) #Initial random draw for Group1

*>
**> Tgroup <- rnorm(t1N) #Initial random draw for Group 2
**>
**> currentp <- t.test(Cgroup,Tgroup)[["p.value"]] #Get t1 p value
**>
**> t1pvals[lcv] <- currentp #Store t1 p value
**>
**> #If p >= .05 or <= continue threshold then run more subjects
**>
**> if ((currentp <= contthreshold) & (currentp >= .05)) {
**>
**> Cgroup <- c(Cgroup,rnorm(t2N)) #Add t2N subjects to group 1
**>
**> Tgroup <- c(Tgroup,rnorm(t2N)) #Add t2N subjects to group 2
**>
**> currentp <- t.test(Cgroup,Tgroup)[["p.value"]] #Get t2 p value
**>
**> t2pvals[lcv] <- currentp #store t2 p value
**>
**> }
**>
**> }
**>
**> plainp <- ifelse(!is.na(t2pvals),{t2pvals},{t1pvals}) #Make sure we are
**> looking at the right ps
**>
**> table(t1pvals <= .05); round(summary(t1pvals),4) #debugging
**>
**> hist(t1pvals) #debugging
**>
**> table(plainp <= .05); round(summary(plainp),4) #debugging
**>
**> hist(plainp, probability=TRUE,main=paste(t1N,"then",t2N)) #Histogram of
**> interest
**>
**> abline(a=1.00,b=0) #Baseline probability
**>
**> dev.copy(jpeg,filename=paste("Sim with ",t1N, "start samples and ",t2N,"
**> extra samples.jpg",sep=""),height=600,width=800,bg="white") # Create the
**> image
**>
**> dev.off() #Save the image
**>
**> chi <- rbind(table(t1pvals <= .05),table(plainp <= .05)) #debugging
**>
**> chisq.test(chi) #debugging
**>
**> explore <- data.frame(t1=t1pvals,t2=t2pvals,picked=plainp) #debugging
**>
**> t.test(explore$picked,explore$t1) #debugging
**>
**> t.test(logit(explore$picked),logit(explore$t1)) #debugging
**>
**> }
**>
**> ---
**> Russell Pierce
**> Psychology Department
**> Graduate Student - Cognitive
**> (951) 827-2553
**> University of California, Riverside, 92521
**>
**> [[alternative HTML version deleted]]
**>
**> ______________________________________________
**> R-help_at_r-project.org mailing list
**> https://stat.ethz.ch/mailman/listinfo/r-help
**> PLEASE do read the posting guide
*

http://www.R-project.org/posting-guide.html

> and provide commented, minimal, self-contained, reproducible code.

R-help_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Mon 23 Jun 2008 - 15:02:24 GMT

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.2.0, at Mon 23 Jun 2008 - 16:30:53 GMT.

*
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help.
Please read the posting
guide before posting to the list.
*