# [R] binom.test() query

From: Ted Harding <ted.harding_at_nessie.mcc.ac.uk>
Date: Thu 05 Apr 2007 - 21:35:44 GMT

I now have a query!

The code for the two-sided case computes the p-value as follows:

```      if (p == 0) (x == 0)
else
if (p == 1) (x == n)
else {
relErr <- 1 + 1e-07
d <- dbinom(x, n, p)
m <- n * p
if (x == m) 1
else
## PVAL if x < m
if (x < m) {
i <- seq(from = ceiling(m), to = n)
y <- sum(dbinom(i, n, p) <= d * relErr)
## here:
pbinom(x,n,p) + pbinom(n-y,n,p,lower=FALSE)
}
## PVAL if x > m:
else {
i <- seq(from = 0, to = floor(m))
y <- sum(dbinom(i, n, p) <= d * relErr)
## here:
pbinom(y-1,n,p) + pbinom(x-1,n,p,lower=FALSE)
}
}
}

```

The case "x < m" will do for my query (since it is the same for the case "x > m").

If I understand the code correctly, when x (the observed binomial number) is less than m (its expected value on the null hypothesis being tested), then the p-value is the total probability in the lower tail, from 0 to x inclusive, plus a "bit".

The "bit" is the probability in the upper tail of those value from (n-y) to n, where y is the number of values of i greater than or equal to (if m is an integer) m such that

dbinom(i,n,p) <= d*(1 + 1e-07) = dbinom(x,n,p)*(1 + 1e-07).

or, in other words, ignoring the "1e-07", the values of i which have binomial probability (on the NH) less than or equal to the probability of x.

So, for x < m = n*p, the lower-tail contribution to the p-value is the probability that X <= x, while the upper-tail contribution is the probability that (X >= m)&(dbinom(X,n,p) <= dbinom(x,n,p)).

Now that is not so asymmetric as it looks, since for x < m = n*p the values of dbinom(i,n,p) decrease as i decreases for i < x, so in fact the left=hand tail is similarly the probability that (X <= m)&(dbinom(X,n,p) <= dbinom(x,n,p)). In effect, therefore, ignoring the "1e-07", we are simply calculating the probability of "dbinom(X,n,p) <= dbinom(x,n,p)" overall.

```   p<-0.3527785166
n<-20
x<-(4:10)
```

print(cbind(x,dbinom(x,n,p)),digits=10)

x
[1,] 4 0.07114577135
[2,] 5 0.12409328342
[3,] 6 0.16909761793
[4,] 7 0.18433877225
[5,] 8 0.16327483787
[6,] 9 0.11866078116
[7,] 10 0.07114577154

Now I run the standard binom.test():

x<-4 ; n<-20 ; p<-0.3527785166 ; binom.test(x,n,p)\$p.value
 0.2405347

Next, I modify the code for binom.test so that relErr <- 1 (instead of 1 + 1e-07), and I call this new function binom.test.0 and run it on the same data:

binom.test.0(x,n,p)\$p.value
 0.1693889

The difference is the value of dbinom(10,n,p) = 0.07114577...

So the "1e-07" has done its dirty work -- it has added 40% to my p-value! Admittedly it wasn't particularly significant to start with, at 0.17 (in round numbers); but it's much less so at 0.24, and if I wanted I'm sure I could construct a really "significant" example!

Anyway, I dare say the factor (1 + 1e-07) is there as a precaution against some possible numerical mishap (though I can't see what it might be), but the fact that it can make such a difference in special cases means that I think we need to know what it's doing and why -- just in case we don't need whatever it's supposed to be there for.

After all, if I were going to use the values of dbinom(x,n,p) as the ordering of the sample space for the test, I'd simply get PVAL as

sum(dbinom(which(dbinom((0:n),n,p)<=dbinom(x,n,p))-1,n,p))
 0.1693889

(as above!).

With thanks,
Ted.

E-Mail: (Ted Harding) <ted.harding@nessie.mcc.ac.uk> Fax-to-email: +44 (0)870 094 0861
```Date: 05-Apr-07                                       Time: 22:29:21
------------------------------ XFMail ------------------------------

______________________________________________
```
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Fri Apr 06 07:45:48 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Thu 05 Apr 2007 - 22:30:55 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.