Re: [R] replicating the odds ratio from a published study

From: Bob Green
Date: Sun 28 Jan 2007 - 21:13:46 GMT


Thanks. Yes, clearly the volume number for the Schanda paper I cited is wrong.

What is perplexing is that I used the method Peter suggested on two papers by Eronen (referenced below). For the first paper I can reproduce in R an odds ratio close to the published one, OR = 9.7 (CI = 7.4-12.6), whereas for the second paper (Eronen 2) my results differ noticeably from the published OR = 10.0 (8.1-12.5). One reason I wanted to work through the calculations was so that I could analyse data from other studies using the same method, for confirmation.

The additional issue is that Woodward, who is also the author of an epidemiological text, says in a review that Eronen used the wrong formula in a 1995 paper, and indicates that this comment also applies to the later studies. He states that "they use methods designed for use with binomial data when they really have Poisson data. Consequently, they quote odds ratios when they really have relative rates and their confidence intervals are inaccurate". Eronen 1 cites the formula that was used for the OR. Schanda sets out his odds ratio table in the same way as Eronen 1.
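
To make the odds ratio versus relative rate distinction concrete, here is a rough sketch using the counts from the eronen1 table in this message. It assumes the population figures can stand in as denominators for the rates (strictly, a Poisson rate needs person-time, which is not given here), and it uses the usual large-sample standard error for a log rate ratio. The variable names are made up for illustration, and whether this is the calculation Woodward had in mind is a guess.

```r
# Counts as used in the eronen1 table (sample = cases, population = denominators)
cases_scz   <- 58               # sample members with schizophrenia
cases_other <- 852              # sample members without schizophrenia
pop_scz     <- 13600            # population with schizophrenia
pop_other   <- 1947000 - 13600  # population without schizophrenia

# Relative rate: ratio of cases per head of population in each group
rate_ratio <- (cases_scz / pop_scz) / (cases_other / pop_other)

# Large-sample CI treating the two case counts as Poisson
se_log_rr <- sqrt(1 / cases_scz + 1 / cases_other)
ci_rr <- exp(log(rate_ratio) + qnorm(c(.025, .975)) * se_log_rr)

# Odds ratio from the same counts, for comparison
odds_ratio <- cases_scz * (pop_other - cases_other) /
  (cases_other * (pop_scz - cases_scz))

print(c(rate_ratio = rate_ratio, lower = ci_rr[1], upper = ci_rr[2],
        odds_ratio = odds_ratio))
```

Because the event is rare in both groups, the two estimates come out very close, so on these numbers the choice of measure changes little.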

For the present purpose, my primary question is: now that you have seen the Schanda paper, would you say Schanda calculated an odds ratio or a relative risk?

Also, when I tried the formula suggested by Peter (below) I obtained an error - do you know what M might be or the source of the error?

 > exp(log(41*2936210/920/20068)+qnorm(c(.025,.975))*sqrt(sum(1/M)))
Error in sum(1/M) : object "M" not found
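
One plausible reading of M, offered as an assumption because Peter's message never defines it: M is the 2x2 matrix that was passed to fisher.test. Under that reading sqrt(sum(1/M)) is the usual standard error of the log odds ratio, and the sketch below gives the same interval Peter printed (4.767384 8.918216). The names or_hat, se_log and ci are only illustrative.

```r
# Assumed definition of M: the 2x2 table from Peter's fisher.test call
M <- matrix(c(41, 920, 20109 - 41, 2957239 - 20109 - 920), 2)

# Sample (cross-product) odds ratio
or_hat <- M[1, 1] * M[2, 2] / (M[1, 2] * M[2, 1])

# Standard error of log(OR): square root of the sum of reciprocal cell counts
se_log <- sqrt(sum(1 / M))

# Classical asymptotic 95% confidence interval
ci <- exp(log(or_hat) + qnorm(c(.025, .975)) * se_log)
print(ci)
```

The error arises simply because no object called M exists in the workspace until it is assigned.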

 > eronen1 <- as.table(matrix(c(58,852,13600-58,1947000-13600-852), ncol = 2 , dimnames = list(group=c("scz", "nonscz"), who= c("sample", "population"))))
 > fisher.test(eronen1)

p-value < 2.2e-16
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
   7.309717 12.690087
sample estimates:
odds ratio


 > eronen2 <- as.table(matrix(c(86,1302,13530-86,1933000-13530-1302), ncol = 2 , dimnames = list(group=c("scz", "nonscz"), who= c("sample", "population"))))
 > fisher.test(eronen2)

p-value < 2.2e-16
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
   7.481272 11.734136
sample estimates:
odds ratio
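
For a side-by-side check against the published figures, the sample odds ratio and the classical asymptotic interval can be computed for both tables. This is only a sketch: wald_or is a made-up helper name, and the large-sample formula may not be the one the papers used.

```r
# Hypothetical helper: cross-product odds ratio with an asymptotic (Wald) CI
wald_or <- function(tab, level = 0.95) {
  est <- tab[1, 1] * tab[2, 2] / (tab[1, 2] * tab[2, 1])
  z   <- qnorm(1 - (1 - level) / 2)
  se  <- sqrt(sum(1 / tab))   # SE of log(OR)
  c(OR = est,
    lower = exp(log(est) - z * se),
    upper = exp(log(est) + z * se))
}

# Same cell counts as the eronen1 and eronen2 tables above
eronen1 <- matrix(c(58, 852, 13600 - 58, 1947000 - 13600 - 852), ncol = 2)
eronen2 <- matrix(c(86, 1302, 13530 - 86, 1933000 - 13530 - 1302), ncol = 2)

res <- rbind(eronen1 = wald_or(eronen1), eronen2 = wald_or(eronen2))
print(round(res, 2))
```

With these counts the first table gives an odds ratio of about 9.7, in line with the published value, while the second gives about 9.4 rather than the published 10.0.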



Eronen, M. et al. (1996 - 1). Mental disorders and homicidal behavior in Finland. Archives of General Psychiatry, 53, 497-501.

Eronen, M. et al. (1996 - 2). Schizophrenia & homicidal behavior. Schizophrenia Bulletin, 22, 83-89.

Woodward, Mental disorder & homicide. Epidemiologia e Psichiatria Sociale, 9, 171-189.

Any comments are welcomed,


At 01:57 PM 28/01/2007 +0000, Michael Dewey wrote:
>At 22:01 26/01/2007, Peter Dalgaard wrote:
>>Bob Green wrote:
>>>Peter & Michael,
>>>I now see my description may have confused the issue. I do want to
>>>compare odds ratios across studies - in the sense that I want to create
>>>a table with the respective odds ratio for each study. I do not need to
>>>statistically test two sets of odds ratios.
>>>What I want to do is ensure the method I use to compute an odds ratio is
>>>accurate, so I intended to check my method against published sources.
>>>The paper I selected, Schanda et al (2004), Homicide and major mental
>>>disorders, Acta Psychiatr Scand, 11:98-107, reports a total sample of
>>>1087. Odds ratios are reported separately for men and women. There were
>>>961 men, all of whom were convicted of homicide. Of these 961 men, 41
>>>were diagnosed with schizophrenia. The unadjusted odds ratio for
>>>this group of 41 is cited as 6.52 (4.70-9.00). They also report the
>>>general population aged over 15 with schizophrenia =20,109 and the total
>>>population =2,957,239.
>Looking at the paper (which is in volume 110 by the way) suggests that
>Peter's reading of the situation is correct and that is what the authors
>have done.
>>>Any further clarification is much appreciated,
>>A fisher.test on the following matrix seems about right:
>> > matrix(c(41,920,20109-41,2957239-20109-920),2)
>>      [,1]    [,2]
>> [1,]   41   20068
>> [2,]  920 2936210
>> > fisher.test(matrix(c(41,920,20109-41,2957239-20109-920),2))
>> Fisher's Exact Test for Count Data
>>data: matrix(c(41, 920, 20109 - 41, 2957239 - 20109 - 920), 2)
>>p-value < 2.2e-16
>>alternative hypothesis: true odds ratio is not equal to 1
>>95 percent confidence interval:
>>4.645663 8.918425
>>sample estimates:
>>odds ratio
>> 6.520379
>>The c.i. is not precisely the same as your source. This could be down to
>>a different approximation (R's is based on the noncentral hypergeometric
>>distribution), but the classical asymptotic formula gives
>> > exp(log(41*2936210/920/20068)+qnorm(c(.025,.975))*sqrt(sum(1/M)))
>>[1] 4.767384 8.918216
>>which is closer, but still a bit narrower.
>Michael Dewey
Received on Mon Jan 29 12:52:44 2007
