Re: [R] PCA analysis

From: Daniel Malter <daniel_at_umd.edu>
Date: Thu, 19 Jun 2008 16:33:15 -0400


Hi Monna, I cannot get it done with the princomp and biplot commands either (maybe somebody else can), but many roads lead to Rome. Here is one way to do it (below). Note, however, that label=rep(...) below assumes that your rows are in order, i.e. that you really want to plot the first fifty rows with one symbol, the next fifty with another, and so forth. If your rows are not ordered, you will either have to sort your dataset or create a variable that indicates the group each row belongs to. Unless you already have such a variable, creating it would most likely involve a loop or a nested ifelse() statement. You then pass your grouping variable to the "pch" argument (for different symbols), the "col" argument (for different colors), or both.

##create data

z<-sample(401:600)
y<-sample(701:900)
x<-sample(1:200)

data.frame(x,y)->df
cbind(df, z)->df

##pc analysis
pc=prcomp(df)

##inspect results
pc
summary(pc)
pc$rotation

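##compute pc values for each observation
##(uncentered projection; pc$x holds the centered scores,
##which differ only by a constant shift per component)
pc.data=t(t(pc$rotation)%*%t(df))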
##check
pc.data

##create point labels
label=rep(1:4, each=50)

##plot first PC
##versus second PC
##with label indicated
##by the variable label

plot(pc.data[,1],pc.data[,2],pch=label,col=label ,xlab="First principal component",ylab="Second principal component")
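
For rows that are not in order, the grouping variable described above could be built with cut() instead of a loop. The following is only a minimal sketch under assumptions: the ID vector "id" and the names "grp" and "grp2" are made up for illustration, and the data are simulated as in the example above.

```r
## Sketch: group shuffled samples by an ID variable (id, grp, grp2 are assumed names)
set.seed(1)                       # reproducible example data
df <- data.frame(x = sample(1:200), y = sample(701:900), z = sample(401:600))
id <- sample(1:200)               # hypothetical sample IDs, deliberately unordered

## cut() bins the IDs into four groups: 1-50, 51-100, 101-150, 151-200
grp <- cut(id, breaks = c(0, 50, 100, 150, 200), labels = FALSE)

## the same grouping written as a nested ifelse(), for comparison
grp2 <- ifelse(id <= 50, 1, ifelse(id <= 100, 2, ifelse(id <= 150, 3, 4)))
stopifnot(all(grp == grp2))

pc <- prcomp(df)
## pc$x holds the (centered) principal component scores
plot(pc$x[, 1], pc$x[, 2], pch = grp, col = grp,
     xlab = "First principal component", ylab = "Second principal component")
legend("topright", legend = c("1-50", "51-100", "101-150", "151-200"),
       pch = 1:4, col = 1:4)
```

Because grp is computed from id rather than from the row position, the symbols stay correct however the rows are sorted.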

Thank you for your reply.

pch=NA got rid of the numbers or names of the samples that I'm plotting. The problem of how to replace these with different symbols still remains. I know I can use points() to add symbols, but I can't get the right values plotted from the outcome of princomp(data). The class of the object is princomp, and I can't specify which columns should be plotted for the points.

Example (my real data frame consists of multiple (hundreds of) columns of data for ca. 200 samples):

z<-sample(401:600)
y<-sample(701:900)
x<-sample(1:200)
data.frame(x,y)->df
cbind(df, z)->df
princomp(df)->p
biplot(p, pch=NA)
row.names(df)<-1:200

Now I would like, for instance, all the samples that have row.names under 50 to be plotted with one symbol, the ones from 50-100 with another, and so on. Do I need a special function for specifying these different symbols when my samples are not in the correct order?
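
One possible way to approach the question above is sketched here, under assumptions: a princomp object stores the per-sample scores in its "scores" component, so individual columns can be picked from that matrix and passed to points(). The names "sc", "id" and "sym" are made up for this illustration.

```r
## Sketch: plot princomp scores with one symbol per block of 50 row names
set.seed(1)
z <- sample(401:600); y <- sample(701:900); x <- sample(1:200)
df <- data.frame(x, y, z)
row.names(df) <- 1:200

p  <- princomp(df)
sc <- p$scores                       # matrix: one row per sample, one column per component

## scores are in the same row order as df, so the row names of df can drive the symbols
id  <- as.numeric(row.names(df))
sym <- (id - 1) %/% 50 + 1           # 1-50 -> 1, 51-100 -> 2, 101-150 -> 3, 151-200 -> 4

plot(sc[, 1], sc[, 2], type = "n",   # empty frame, then add the symbols
     xlab = "Comp.1", ylab = "Comp.2")
points(sc[, 1], sc[, 2], pch = sym, col = sym)
```

Other components can be plotted by changing the column indices, e.g. sc[, 2] against sc[, 3].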

As you can tell, I am quite new to R. Thank you so much for taking the time to help me, I really appreciate it.

Regards, Monna

> From: daniel_at_umd.edu
> To: monnire_at_hotmail.com; r-help_at_r-project.org
> Subject: AW: [R] PCA analysis
> Date: Tue, 17 Jun 2008 19:40:41 -0400
>
> I am not entirely sure after reading your email, but I thought you wanted to
> do something like this:
>
> ###Start of example
>
> ###create random data for the example
> x=rnorm(100,100,10) ##create Xs
> e=rnorm(100,0,5) ##create Errors
> y=x+e ##create Ys
>
> ###plot
> ##plot Ys against Xs but suppress all symbols (i.e. plot invisibly)
> plot(y~x,pch=NA)
> ##use values of X (rounded to the nearest integer) as symbols for the X-Y plot
> text(y~x,labels=round(x),pch=NULL)
>
> ###End of example
>
> So you could just substitute your variable names for x and y in the plot()
> and text() commands. Let us know whether your problem is solved.
>
> Cheers,
> Daniel
>
> -------------------------
> cuncta stricte discussurus
> -------------------------
>
> -----Original Message-----
> From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org] On
> Behalf Of Monna Nygård
> Sent: Tuesday, June 17, 2008 5:04 AM
> To: r-help_at_r-project.org
> Subject: [R] PCA analysis
>
>
> Hi,
>
> I have a problem with making PCA plots that are readable.
> I would like to use different symbols instead of the sample numbers
> or names that get plotted (xlabs).
> How is this possible? With points(), I don't seem to get the right data
> plotted onto the PCA plot, as I do not quite understand where it is
> taken from. I don't know how to plot the correct columns of the prcomp outcome
> (p).
> I would really appreciate it if someone could help me; I have struggled with
> this for days now. How can I write a function that assigns different symbols
> to the points, depending on how big the number given as xlabs is?
>
> Making the plots.
>
> read.table(file = "S:\\SEDIM\\TRFLP\\B90-700.txt", sep="\t", header=T)->bout
> bout <- bout[-1]
> p <- prcomp(bout)
> biplot(p, choices = c(2,3), scale = 1, pc.biplot = FALSE, var.axes = F,
>   ylabs = NULL,
>   xlabs=c("119","175","135","330","51","422","67","409","470","70","67","89",
>   "135","215","330","409","470","51","80","119","175","222","301","422","280",
>   "171","256","243","404","37","157","28","187","70","42","283","261","85","147",
>   "204","235","411","514","77","204","87","366","306","351","371","38","534",
>   "199","407","42","167","480","195","22","35","80","433","43","109","214","363",
>   "292","61","115","178","273","521","72","126","253","288","501","83","113",
>   "250","359","498","19","130","389","324","24","58","124","388","319","164",
>   "101","153","383","345","219","179","161","375","298","450","555","439","54",
>   "54","490","465","411","18","85","503","455","394","179","187","416","447",
>   "219","461","164","366","474","167","236","507","319","509","467","507","450",
>   "359","507","192","453","101","456","512","517"),
>   cex=0.67, main="90-700bp")
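
The function asked for above (symbols chosen by how big the xlabs number is) might look roughly like the sketch below. Everything here is an assumption made for illustration: the helper name size_to_pch, its break points, and the simulated stand-ins for "bout" and the labels, since the original data file is not available. It also plots the raw prcomp scores for components 2 and 3, which are not scaled the way biplot(..., scale = 1) scales them.

```r
## Hypothetical helper: map numeric labels to plotting symbols by size class
size_to_pch <- function(labs, breaks = c(100, 200, 300, 400)) {
  v <- as.numeric(labs)              # the xlabs were given as character strings
  findInterval(v, breaks) + 1        # 1 for < 100, 2 for 100-199, ..., 5 for >= 400
}

set.seed(1)
bout  <- as.data.frame(matrix(rnorm(20 * 5), nrow = 20))  # stand-in for the real data
xlabs <- as.character(sample(18:555, 20))                 # stand-in sample labels

p   <- prcomp(bout)
sym <- size_to_pch(xlabs)

## p$x[, 2] and p$x[, 3] correspond to choices = c(2,3) in the biplot call
plot(p$x[, 2], p$x[, 3], pch = sym, col = sym, cex = 0.67,
     xlab = "PC2", ylab = "PC3", main = "90-700bp")
```

The break points can be changed through the breaks argument to match whatever size classes are meaningful for the data.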
>
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Thu 19 Jun 2008 - 21:57:07 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 19 Jun 2008 - 22:30:55 GMT.
