[R] PCA in Microarrays

From: Jorge Ivan Velez <jorgeivanvelez_at_gmail.com>
Date: Wed, 14 May 2008 10:27:59 -0400


Dear useRs:
I'm not sure if this is the right place to ask, but I'll give it a try. I've been reading about how to perform Principal Component Analysis (PCA) on microarray data (see [1]) and there's something I don't understand. It concerns performing PCA on data sets in which the number of variables is greater than the number of samples. For example, in the paper mentioned above, the number of variables (genes) is 8538 and the number of samples (tumors) is 104. My understanding is that, in PCA, the number of samples (n) must be greater than the number of variables (p), and that the goal is to find k components, with k < p, such that the variance retained in the new data set is maximized. Am I wrong? Could somebody please tell me how it is possible to perform PCA when the number of variables is greater than the number of samples, and how to do it in R? I'm really confused. In R I've tried "prcomp" and "princomp", but they didn't work.

I'm using Windows XP SP2 on an Intel Core 2 Duo 2.4 GHz, with R 2.7.0 Patched.

Thanks in advance,

Jorge Ivan Velez

[1] Ringnér, M. What is principal components analysis? Nature Biotechnology 26, 303-304 (2008).
http://www.nature.com/nbt/journal/v26/n3/full/nbt0308-303.html
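
For reference, a minimal sketch of PCA with more variables than samples (p > n) in R. The data below are simulated and the dimensions are illustrative only; they are not the data from [1]. The sketch assumes samples are in rows and genes in columns; if the matrix is stored with genes in rows, it would need to be transposed with t() first. prcomp() computes the components through a singular value decomposition of the centered data, so it accepts p > n, whereas princomp() stops with an error when there are fewer units than variables.

```r
# Simulated stand-in for an expression matrix (assumption: samples in rows,
# genes in columns). Dimensions are illustrative, not those of [1].
set.seed(1)
n <- 20    # number of samples
p <- 500   # number of variables (genes), deliberately larger than n
x <- matrix(rnorm(n * p), nrow = n, ncol = p)

# prcomp() uses the SVD of the centered data matrix, so p > n is accepted.
pca <- prcomp(x, center = TRUE, scale. = FALSE)

dim(pca$x)         # scores:   n rows, min(n, p) columns
dim(pca$rotation)  # loadings: p rows, min(n, p) columns

# Proportion of variance per component. After centering, at most n - 1
# components can carry variance, so the last one is numerically zero.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
round(head(var_explained, 5), 3)

# princomp() requires more units than variables and stops with an error here.
res <- try(princomp(x), silent = TRUE)
inherits(res, "try-error")
```

With n samples, the centered data have rank at most n - 1, so the number of informative components is bounded by the number of samples rather than by the number of genes.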




R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
Received on Wed 14 May 2008 - 18:19:57 GMT

