[Rd] limma, FDR, and p.adjust

From: Kimpel, Mark W <mkimpel_at_iupui.edu>
Date: Mon 20 Dec 2004 - 01:57:43 EST

I am posting this to both R and BioC communities because I believe there is a lot of confusion on this topic in both communities (having searched the mail archives of both) and I am hoping that someone will have information that can be shared with both communities.

I have seen countless questions on the BioC list regarding limma
(Bioconductor) and its calculation of FDR. Some of them involved
misunderstandings or confusions regarding across which tests the FDR "correction" is being applied. My question is more fundamental and involves how the FDR method is implemented at the level of "p.adjust"
(package: stats).

I have reread the paper by Benjamini and Hochberg (1995) and nowhere in their paper do they actually "adjust" p values; rather, they develop criteria by which an appropriate p value maximum is chosen such that FDR is expected to be below a certain threshold.

To try to get a better handle on this, I wrote the following simple script to generate a list of random p values, and view it before and after apply p.adjust (method=fdr).  

rn<-abs(rnorm(100, 0.5, 0.33))

p.adj<-p.adjust(rn, method="fdr")

As you can see after running the code, the p values are truly being adjusted, but for what FDR? If I set my p value at 0.05, does that mean my FDR is 5%? I have been told by someone that is the case but, normally, when discussing FDR, q values are reported or just one p value is reported--the threshold for a set FDR. The p.adjust documentation is unclear.

For the R developers, I can understand how one would want to include FDR procedures in p.adjust, but I wonder, given the numerous FDR algorithms now available, if it would be best to formulate an FDR.select function that would be option to p.adjust and itself incorporate more recent FDR procedures than the one proposed by Benjamini and Hochberg in 1995.
(Benjamini himself has a newer one). Some of these may currently be
available as add-on packages but they are not standardized regarding I&O and this makes it difficult for developers to incorporate them into packages such as limma.

So those are my questions and suggestions,


Mark W. Kimpel MD

R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel Received on Mon Dec 20 01:04:26 2004

This archive was generated by hypermail 2.1.8 : Fri 18 Mar 2005 - 09:02:18 EST