From: Ted Harding <Ted.Harding_at_manchester.ac.uk>

Date: Wed, 27 Jan 2010 18:14:59 +0000 (GMT)

On 27-Jan-10 17:30:10, nhorton_at_smith.edu wrote:
> # is there a bug in the calculation of the odds ratio in fisher.test?
> # Nicholas Horton, nhorton_at_smith.edu Fri Jan 22 08:29:07 EST 2010
>
> x1 = c(rep(0, 244), rep(1, 209))
> x2 = c(rep(0, 177), rep(1, 67), rep(0, 169), rep(1, 40))
>
> or1 = sum(x1==1&x2==1)*sum(x1==0&x2==0)/
>   (sum(x1==1&x2==0)*sum(x1==0&x2==1))
>
> library(epitools)
> or2 = oddsratio.wald(x1, x2)$measure[2,1]
>
> or3 = fisher.test(x1, x2)$estimate
>
> # or1=or2 = 0.625276, but or3=0.6259267!
>
> I'm running R 2.10.1 under Mac OS X 10.6.2.
>
> Nick

Not so. Look closely at ?fisher.test:

Value:

[...]

estimate: an estimate of the odds ratio. Note that the _conditional_ Maximum Likelihood Estimate (MLE) rather than the unconditional MLE (the sample odds ratio) is used. Only present in the 2 by 2 case.

Your or1 (and presumably the epitools value also) is the sample OR.
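To make the distinction concrete, here is a minimal sketch that rebuilds the 2 by 2 table from the x1 and x2 vectors in the quoted message and computes the sample OR as the cross-product ratio. The names tab, sample_or and cmle are introduced here for illustration only and do not appear in the original post.

```r
# Data from the quoted message
x1 <- c(rep(0, 244), rep(1, 209))
x2 <- c(rep(0, 177), rep(1, 67), rep(0, 169), rep(1, 40))

# Cross-tabulate: rows are x1, columns are x2
tab <- table(x1, x2)

# Sample (unconditional) odds ratio: cross-product ratio of the cell counts
sample_or <- as.numeric(tab["0", "0"] * tab["1", "1"]) /
             as.numeric(tab["0", "1"] * tab["1", "0"])

# Conditional MLE as reported by fisher.test (illustrative name: cmle)
cmle <- unname(fisher.test(tab)$estimate)

sample_or   # (177*40)/(67*169), the 0.625276 reported as or1
cmle        # slightly larger; the thread reports 0.6259267
```

The two numbers answer different estimation problems, so a small discrepancy between them is expected rather than a sign of a bug.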

The conditional MLE is the value of rho (the OR) that maximises the probability of the table *conditional* on the margins.

In this case it differs slightly from the sample OR (by 0.1%). For smaller tables it will tend to differ even more, e.g.

M1 <- matrix(c(4,7,17,18),nrow=2)

M1

#      [,1] [,2]
# [1,]    4   17
# [2,]    7   18

(4*18)/(17*7)

# [1] 0.605042

fisher.test(M1)$estimate

# odds ratio

# 0.6116235 ## (1.1% larger than sample OR)

M2 <- matrix(c(1,2,4,5),nrow=2)

M2

#      [,1] [,2]
# [1,]    1    4
# [2,]    2    5

(1*5)/(4*2)

# [1] 0.625

fisher.test(M2)$estimate

# odds ratio

# 0.649423 ## (3.9% larger than sample OR)
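The dependence on table size can be checked with a short sketch, assuming we simply multiply M2 by a constant so the sample OR stays at 0.625 while the counts grow. The helper rel_gap and the scale factors 1, 10 and 100 are hypothetical choices made for this illustration.

```r
# Relative gap between the conditional MLE and the sample OR for a 2x2 table
rel_gap <- function(tab) {
  sample_or <- (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])
  cmle <- unname(fisher.test(tab)$estimate)
  cmle / sample_or - 1
}

M2 <- matrix(c(1, 2, 4, 5), nrow = 2)

# Same cell proportions, increasing counts
scales <- c(1, 10, 100)
gaps <- sapply(scales, function(k) rel_gap(k * M2))

# First entry is the 3.9% gap shown above; later entries should be much smaller
data.frame(scale = scales, relative_gap = gaps)
```

The later gaps are only approximate, since fisher.test finds its estimate by numerical root-finding with a finite tolerance.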

The probability of a table matrix(c(a,b,c,d),nrow=2) given the marginals (a+b),(a+c),(b+d) and hence also (c+d) is a function of the odds ratio only. Again see ?fisher.test:

"given all marginal totals fixed, the first element of the contingency table has a non-central hypergeometric distribution with non-centrality parameter given by the odds ratio (Fisher, 1935)."

The value of the odds ratio which maximises this (for given observed 'a') is not the sample OR.
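As a rough check of this, the conditional likelihood can be written out and maximised directly. The sketch below is not the code fisher.test uses internally; cond_lik and rho_hat are names invented for this illustration, and the search interval for log(rho) is an arbitrary assumption. It weights the central hypergeometric probabilities from dhyper by rho^x, normalises, and maximises over rho with optimize.

```r
# Conditional likelihood of the odds ratio rho, given the observed
# top-left cell and fixed margins (noncentral hypergeometric)
cond_lik <- function(rho, tab) {
  a <- tab[1, 1]
  m <- sum(tab[, 1])   # first column total
  n <- sum(tab[, 2])   # second column total
  k <- sum(tab[1, ])   # first row total
  support <- max(0, k - n):min(k, m)   # possible values of the top-left cell
  # central hypergeometric log-probabilities, tilted by rho^x
  logw <- dhyper(support, m, n, k, log = TRUE) + support * log(rho)
  w <- exp(logw - max(logw))           # subtract the max for numerical stability
  w[support == a] / sum(w)
}

M1 <- matrix(c(4, 7, 17, 18), nrow = 2)

# Maximise over log(rho); the interval (-5, 5) is an assumed search range
opt <- optimize(function(lr) cond_lik(exp(lr), M1),
                interval = c(-5, 5), maximum = TRUE, tol = 1e-8)
rho_hat <- exp(opt$maximum)

rho_hat                      # should be close to fisher.test(M1)$estimate
fisher.test(M1)$estimate
(4 * 18) / (17 * 7)          # sample OR, for comparison
```

Working on the log(rho) scale keeps the search symmetric around an odds ratio of 1; any agreement with fisher.test is only up to the tolerances of the two numerical routines.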

Hoping this helps,

Ted.

E-Mail: (Ted Harding) <Ted.Harding_at_manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 27-Jan-10 Time: 18:14:57
------------------------------ XFMail ------------------------------

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Wed 27 Jan 2010 - 18:19:28 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.2.0, at Wed 27 Jan 2010 - 23:00:17 GMT.
