Re: [R] Plot Principal component analysis

From: Thibaut Jombart <jombart_at_biomserv.univ-lyon1.fr>
Date: Wed, 27 Feb 2008 15:42:16 +0100

Jim Lemon wrote:

>SNN wrote:
>
>
>>Hi,
>>
>>I have a matrix of 300,000*115 (SNPs*individuals). I ran the PCA on the
>>covariance matrix, which has a dimension of 115*115. The first 100
>>individuals are from group A and the remaining 15 are from group B. I need
>>to plot the data in two and three dimensions: in 2D with respect to PC1 and
>>PC2, and in 3D with respect to PC1, PC2 and PC3. I do not know how to plot
>>the first 100 points, corresponding to group A, in red (for example)
>>and the remaining 15 points in blue, i.e. I want each group in a
>>different color in the same plot. I would appreciate it if someone could help.
>>
>>
>>
>Hi Nancy,
>(if indeed you are a Nancy and that is not a webnym)
>Assuming your groups really are coded "A" and "B", and the group coding
>variable is called "group", you can define a color vector like this:
>
>colorvector<-ifelse(group=="A","red","blue")
>
>Jim
>
>______________________________________________
>R-help_at_r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
>
>
>
>
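For illustration, here is a minimal sketch of how the color vector suggested above could be passed to a plot. The data are simulated, and the names `x`, `pca` and `scores` are made up for this example. It also assumes the PCA is run with `prcomp` on an individuals-by-markers matrix, which is not necessarily how the original analysis was done. The 3D part only runs if the optional scatterplot3d package is installed.

```r
set.seed(1)                       # simulated data, for illustration only
n_a <- 100                        # individuals in group A
n_b <- 15                         # individuals in group B

# 115 individuals x 50 made-up markers; group B is shifted on the first 10
x <- matrix(rnorm((n_a + n_b) * 50), nrow = n_a + n_b)
rows_b <- (n_a + 1):(n_a + n_b)
x[rows_b, 1:10] <- x[rows_b, 1:10] + 2

# group coding variable and the matching color vector
group <- rep(c("A", "B"), times = c(n_a, n_b))
colorvector <- ifelse(group == "A", "red", "blue")

# PCA with individuals in rows; keep the scores on the first three axes
pca <- prcomp(x, center = TRUE, scale. = FALSE)
scores <- pca$x[, 1:3]

# 2D: PC1 versus PC2, one color per group
plot(scores[, 1], scores[, 2], col = colorvector, pch = 19,
     xlab = "PC1", ylab = "PC2")
legend("topright", legend = c("A", "B"), col = c("red", "blue"), pch = 19)

# 3D: PC1, PC2 and PC3, only if scatterplot3d is available
if (requireNamespace("scatterplot3d", quietly = TRUE)) {
  scatterplot3d::scatterplot3d(scores[, 1], scores[, 2], scores[, 3],
                               color = colorvector, pch = 19,
                               xlab = "PC1", ylab = "PC2", zlab = "PC3")
}
```

The only requirement is that `colorvector` is in the same order as the rows of the score matrix, so that point i gets the color of individual i.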
Hi Nancy,
In your case, you may also use inertia ellipses to represent your groups, in addition to different colors.
Here is an example using a microsatellite dataset from adegenet (the same approach applies to SNPs):
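#####
library(ade4)
library(adegenet)
data(microbov) # dataset

# extract the allele table, replacing missing values by the mean
# (tab() is the accessor used in adegenet >= 2.0, where na.replace() is deprecated)
X <- tab(microbov, NA.method = "mean")

# perform your pca, keep 3 axes
pca1 <- dudi.pca(X, scannf = FALSE, nf = 3, scale = FALSE)

# plot the result: default colors first, then one color per group
s.class(pca1$li, pop(microbov))
s.class(pca1$li, pop(microbov), col = sample(colors(), 15)) # here, replace "col" by the appropriate vector of colors (one per group).

#####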
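The resulting graphic represents each genotype by a point and adds an inertia ellipse of a different color for each group. The size of the ellipses is set by the cellipse argument of s.class (default 1.5); under a bivariate normal approximation, a value of about 2.5 is needed for an ellipse to enclose roughly 95 % of the points of its group. The more the ellipses overlap, the less your groups are differentiated on the factorial plane.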
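As a rough check on ellipse size, here is a sketch for the two-group case from the original question (100 individuals in A, 15 in B). The coordinates are simulated, and `li`, `inside` and `expected` are hypothetical names introduced for this example. It assumes that an ellipse of size `coef` corresponds to a Mahalanobis distance of `coef` from the group centre, and it uses the usual n-1 covariance, so the shares may differ slightly from what ade4 draws.

```r
set.seed(42)                        # simulated scores, for illustration only
group <- factor(rep(c("A", "B"), times = c(100, 15)))

# hypothetical PC1/PC2 coordinates: group B is shifted away from group A
li <- data.frame(PC1 = rnorm(115) + ifelse(group == "B", 3, 0),
                 PC2 = rnorm(115))

# share of points within Mahalanobis distance `coef` of their group centre
inside <- function(xy, coef) {
  d2 <- mahalanobis(xy, colMeans(xy), cov(xy))
  mean(d2 <= coef^2)
}
sapply(split(li, group), inside, coef = 1.5)
sapply(split(li, group), inside, coef = 2.5)

# share expected under a bivariate normal distribution
expected <- function(coef) 1 - exp(-coef^2 / 2)
round(expected(c(1.5, 2.5)), 3)     # 0.675 0.956

# the plot itself, if ade4 is installed: red for A, blue for B
if (requireNamespace("ade4", quietly = TRUE)) {
  ade4::s.class(li, group, col = c("red", "blue"), cellipse = 2.5)
}
```

With only 15 individuals in group B, the observed share for that group can be far from the expected value, so the ellipses are best read as a visual summary rather than a test.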

Cheers,

Thibaut.

-- 
######################################
Thibaut JOMBART
CNRS UMR 5558 - Laboratoire de Biométrie et Biologie Evolutive
Universite Lyon 1
43 bd du 11 novembre 1918
69622 Villeurbanne Cedex
Tél. : 04.72.43.29.35
Fax : 04.72.43.13.88
jombart_at_biomserv.univ-lyon1.fr
http://lbbe.univ-lyon1.fr/-Jombart-Thibaut-.html?lang=en
http://pbil.univ-lyon1.fr/software/adegenet/

Received on Wed 27 Feb 2008 - 14:55:12 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.