**From:** Adaikalavan Ramasamy (*gisar@nus.edu.sg*)

**Date:** Tue 22 Apr 2003 - 23:04:08 EST

**Next message:** Thomas Lumley: "Re: [R] lexical scope"

**Previous message:** Frank E Harrell Jr: "Re: [R] Dates in read.spss"

**Maybe in reply to:** Bob Porter: "[R] fisher exact vs. simulated chi-square"

**Next in thread:** Thomas Lumley: "Re: [R] fisher exact vs. simulated chi-square"

Message-id: <024D6AEFCB92CB47BA1085751D184BB80105F247@MBXSRV03.stf.nus.edu.sg>

There is a reason why this is called Fisher's Exact test - it is EXACT. It enumerates all the possible outcomes using permutations. Then it calculates the p-value as the proportion of those outcomes that are at least as extreme as the one observed.

Look for Fisher's Exact test under Section 8a of http://faculty.vassar.edu/lowry/webtext.html.
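To make the enumeration concrete, here is a minimal sketch for a 2x2 table. The table itself is hypothetical (it is not from this thread), and the small tolerance factor is an assumption added to guard against floating-point ties. With all margins fixed, the top-left cell determines the whole table, so every possible table can be listed and its hypergeometric probability summed.

```r
# Hypothetical 2x2 table, chosen only so the sums are easy to check by hand.
tab <- matrix(c(3, 1, 1, 3), nrow = 2)

m <- sum(tab[1, ])   # first row total
n <- sum(tab[2, ])   # second row total
k <- sum(tab[, 1])   # first column total

# Every value the top-left cell can take while the margins stay fixed.
x <- max(0, k - n):min(k, m)

# Probability of each possible table under independence.
probs <- dhyper(x, m, n, k)

# Two-sided p-value: total probability of the tables that are no more
# probable than the observed one (tolerance factor is an assumption).
p_obs <- dhyper(tab[1, 1], m, n, k)
p_manual <- sum(probs[probs <= p_obs * (1 + 1e-7)])

# Built-in result for comparison.
p_builtin <- fisher.test(tab)$p.value
```

For larger tables the same idea applies, but the number of tables with the given margins grows quickly, which is where the computational cost comes from.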

Fisher's test is non-parametric and exact, but using permutations can be computationally intensive. For large counts, the parametric chi-square is OK. When a cell contains too low a count (what is the default limit?), R correctly complains that chi-square may not be appropriate. Hope this helps.

Regards, Adai.

-----Original Message-----

From: Dirk Janssen [mailto:dirkj@rz.uni-leipzig.de]

Sent: Tuesday, April 22, 2003 8:08 PM

To: r-help@stat.math.ethz.ch

Subject: [R] fisher exact vs. simulated chi-square

Dear All,

I have a problem understanding the difference between the outcome of a Fisher exact test and a chi-square test (with simulated p.value).

For some sample data (see below), Fisher reports p=.02337. The normal chi-square test complains about "approximation may be incorrect", because there is a column with cells with very small values. I therefore tried the chi-square with simulated p-values, but this still gives p=.04037. I also simulated the p-value myself, using r2dtable, getting the same result, p=0.04 (approx).

Why is this substantially higher than what the Fisher exact says? Do the two tests make different assumptions? I noticed that the discrepancy gets smaller when I increase the number of observations for column A3. Does this mean that the simulated chi-square is still sensitive to cells with small counts, even though it does not give me the warning?

Thanks in advance,

Dirk Janssen

------------------------------------------------------------------

> ta <- matrix(c(45,85,27,32,40,34,1,2,1),nc=3,
+              dimnames=list(c("A","B","C"),c("A1","A2","A3")))
> ta
  A1 A2 A3
A 45 32  1
B 85 40  2
C 27 34  1

> fisher.test(ta)

        Fisher's Exact Test for Count Data

data:  ta
p-value = 0.02337
alternative hypothesis: two.sided

> chisq.test(ta, simulate=T, B=100000)

        Pearson's Chi-squared test with simulated p-value (based on 1e+05 replicates)

data:  ta
X-squared = 9.6976, df = NA, p-value = 0.04037

> chisq.test(ta)

        Pearson's Chi-squared test

data:  ta
X-squared = 9.6976, df = 4, p-value = 0.04584

Warning message:
Chi-squared approximation may be incorrect in: chisq.test(ta)
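As a hedged aside on where this warning comes from: in current versions of R, `chisq.test` warns when any expected count under independence is below 5 (the version used in this thread may have differed). The sketch below recomputes the expected counts for the table above and shows which cells fall under that threshold; the variable names `expected` and `small` are chosen here for illustration.

```r
# Table from the message above.
ta <- matrix(c(45, 85, 27, 32, 40, 34, 1, 2, 1), ncol = 3,
             dimnames = list(c("A", "B", "C"), c("A1", "A2", "A3")))

# Expected counts under independence: row total * column total / grand total.
expected <- outer(rowSums(ta), colSums(ta)) / sum(ta)

# Cells whose expected count is below 5 (assumed warning threshold).
small <- expected < 5

print(round(expected, 2))
print(colnames(ta)[colSums(small) > 0])
```

Only the cells in column A3 fall below the threshold, which matches the observation that the sparse column is what makes the asymptotic approximation doubtful.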

# simulate values by hand, based on r2dtable example
> expected <- outer(rowSums(ta), colSums(ta), "*") / sum(ta)
> meanSqResid <- function(x) mean((x - expected) ^ 2 / expected)
> sum(sapply(r2dtable(100000, rowSums(ta), colSums(ta)), meanSqResid)
+     >= meanSqResid(ta)) / 100000
[1] 0.03939

# is similar to
> sum(sapply(r2dtable(100000, rowSums(ta), colSums(ta)),
+            function(x) { chisq.test(x)$statistic })
+     >= 9.6976) / 100000
[1] 0.04044
There were 50 or more warnings (use warnings() to see the first 50)
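One possible reading of the gap between the two p-values, offered as a sketch rather than a definitive answer: both tests condition on the same margins, but they rank tables differently. Fisher's test counts tables that are no more probable than the observed one, while the chi-square test counts tables whose Pearson statistic is at least as large. The code below reuses a single `r2dtable` sample for both rankings. The helper names `pearson` and `logprob`, the seed, the replicate count, and the tie tolerance are all assumptions made for illustration.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

ta <- matrix(c(45, 85, 27, 32, 40, 34, 1, 2, 1), ncol = 3,
             dimnames = list(c("A", "B", "C"), c("A1", "A2", "A3")))
row_tot <- rowSums(ta)
col_tot <- colSums(ta)
expected <- outer(row_tot, col_tot) / sum(ta)

# Pearson statistic: the ordering used by the (simulated) chi-square test.
pearson <- function(x) sum((x - expected)^2 / expected)

# Log-probability of a table given its margins, up to an additive constant:
# P(table) is proportional to 1 / prod(x_ij!).
logprob <- function(x) -sum(lfactorial(x))

# One sample of random tables with the observed margins.
sims <- r2dtable(20000, row_tot, col_tot)

# Chi-square ordering: tables with a statistic at least as large as observed.
p_chisq_sim <- mean(sapply(sims, pearson) >= pearson(ta))

# Fisher ordering: tables no more probable than the observed one.
p_fisher_sim <- mean(sapply(sims, logprob) <= logprob(ta) + 1e-7)

print(c(chisq = p_chisq_sim, fisher = p_fisher_sim))
```

Under these assumptions the first estimate should land near the simulated chi-square value reported above and the second near the `fisher.test` value, which would suggest the discrepancy comes from the choice of ordering rather than from the simulation.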

______________________________________________

R-help@stat.math.ethz.ch mailing list

https://www.stat.math.ethz.ch/mailman/listinfo/r-help


*This archive was generated by hypermail 2.1.3 : Tue 01 Jul 2003 - 09:11:43 EST*