[R] setting pairwise comparisons of columns

From: Louis Plough <lplough_at_usc.edu>
Date: Wed, 13 Apr 2011 12:32:24 -0700


Hi,
I have a number of genes (columns) for which I want to examine pairwise associations of genotypes (each row is an individual). For example (see data below), I would like to compare M1 to M2, M2 to M3, and M1 to M3 (i.e. does ac from M1 tend to be found with bc from M2 more often than expected?). Downstream I will be performing chi-square tests for each pair.

I am looking for a way to set up all pairs of genes in a new data.frame or matrix so that I can then test each pair with a chi-square test in a loop. Order doesn't matter, so 3 genes give 3 comparisons and 4 genes give 6 comparisons.
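The counts quoted above are the number of unordered pairs, n(n-1)/2. As a quick check, a minimal sketch using base R's choose() and combn() (the variable name n_pairs is only illustrative):

```r
# Number of unordered pairs for 3 and 4 genes
n_pairs <- choose(c(3, 4), 2)
print(n_pairs)  # 3 6

# combn() enumerates the pairs themselves, one pair per column
print(combn(4, 2))
```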

Below is some sample data of the form I will be using.

> lets <- c("ab", "ac", "bc", "bd")
> epi <- data.frame(cbind("M1" = sample(lets, 10, replace = TRUE),
+                         "M2" = sample(lets, 10, replace = TRUE),
+                         "M3" = sample(lets, 10, replace = TRUE)))
> print(epi)

   M1 M2 M3
1  ac bc bd
2  ac ac bd
3  bd bd bd
4  ab ac bd
5  ac bc bd
6  bd bd bc
7  ab ac ab
8  bc bd ab
9  bd ab ac
10 bc bc bd

I tried a for loop to set each column against the others, but I get an "undefined columns selected" error:

for (i in 1:3) {
  k = i + 1
  j = k
  for (j in k:3) {
    epi3 = cbind("A" = epi[, i], "B" = epi[, j])
    print(epi3)
  }
}

    A  B
1  ac bc
2  ac ac
3  bd bd
4  ab ac
5  ac bc
6  bd bd
7  ab ac
8  bc bd
9  bd ab
10 bc bc

    A  B
1  ac bd
2  ac bd
3  bd bd
4  ab bd
5  ac bd
6  bd bc
7  ab ab
8  bc ab
9  bd ac
10 bc bd

    A  B
1  bc bd
2  ac bd
3  bd bd
4  ac bd
5  bc bd
6  bd bc
7  ac ab
8  bd ab
9  ab ac
10 bc bd
Error in `[.data.frame`(epi, , j) : undefined columns selected

The output is printed in the right format, but the loop ends with an error, and the actual data frame epi3 has only one column.
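The error is consistent with how `:` behaves in R. When i reaches 3, k becomes 4, and `4:3` counts down (4, 3) rather than giving an empty sequence, so the inner loop asks for a fourth column that does not exist. A minimal sketch of one possible fix follows. It assumes the sample data as printed above, stops the outer loop at the second-to-last column, and collects each pair in a list (the names `pairs` and `key` are hypothetical) instead of overwriting epi3 on every pass:

```r
# Sample data copied from the printed table above
epi <- data.frame(
  M1 = c("ac", "ac", "bd", "ab", "ac", "bd", "ab", "bc", "bd", "bc"),
  M2 = c("bc", "ac", "bd", "ac", "bc", "bd", "ac", "bd", "ab", "bc"),
  M3 = c("bd", "bd", "bd", "bd", "bd", "bc", "ab", "ab", "ac", "bd"),
  stringsAsFactors = FALSE
)

n <- ncol(epi)
pairs <- list()  # hypothetical container: one two-column data frame per pair

# Stop the outer loop at n - 1 so that j = i + 1 never exceeds n
for (i in seq_len(n - 1)) {
  for (j in (i + 1):n) {
    key <- paste(names(epi)[i], names(epi)[j], sep = "_")
    pairs[[key]] <- data.frame(A = epi[[i]], B = epi[[j]],
                               stringsAsFactors = FALSE)
  }
}

print(names(pairs))
```

Each element of `pairs` can then be passed to a later test, for example `table(pairs[["M1_M2"]])`.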

I'm sure this is a simple fix... any ideas? Could I use combn instead?
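On the combn question: base R's combn() can enumerate the column pairs directly and apply a function to each one. The following is only a sketch under stated assumptions. It uses the sample data as printed above, and the result columns gene1, gene2 and p.value are names chosen for illustration. With only 10 individuals the expected cell counts are small, so chisq.test() warns that the approximation may be poor; the warning is suppressed here purely to keep the example quiet.

```r
# Sample data copied from the printed table above
epi <- data.frame(
  M1 = c("ac", "ac", "bd", "ab", "ac", "bd", "ab", "bc", "bd", "bc"),
  M2 = c("bc", "ac", "bd", "ac", "bc", "bd", "ac", "bd", "ab", "bc"),
  M3 = c("bd", "bd", "bd", "bd", "bd", "bc", "ab", "ab", "ac", "bd"),
  stringsAsFactors = FALSE
)

# combn() calls FUN once per unordered pair of column names
results <- combn(names(epi), 2, simplify = FALSE, FUN = function(p) {
  tab <- table(epi[[p[1]]], epi[[p[2]]])     # genotype contingency table
  test <- suppressWarnings(chisq.test(tab))  # small counts trigger a warning
  data.frame(gene1 = p[1], gene2 = p[2], p.value = test$p.value,
             stringsAsFactors = FALSE)
})

# One row per gene pair
results <- do.call(rbind, results)
print(results)
```

For real data with sparse tables, `chisq.test(tab, simulate.p.value = TRUE)` or `fisher.test(tab)` may be worth considering in place of the default approximation.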

Louis




R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Thu 14 Apr 2011 - 06:22:06 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 14 Apr 2011 - 06:30:29 GMT.

