Re: [R] fast way to compare two matrices of combinations

From: Patrick Burns <pburns_at_pburns.seanet.com>
Date: Thu, 13 Mar 2008 16:37:16 +0000

One thing that will probably speed things up enormously is to not grow objects (all.triplets, etc.) inside the loops. Instead, create them at roughly the right size up front and, if they fill up, do something like double their size.
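
A minimal sketch of that idea, under stated assumptions: toy.list is made-up data standing in for sig.tf.pairs, and capacity and used are illustrative names, not objects from the original code. The matrix is allocated once, filled by index, and doubled only when it runs out of room.

```r
# Made-up stand-in for sig.tf.pairs (hypothetical data)
toy.list <- list(c("g1", "g2", "g3", "g4"), c("g2", "g3", "g5"), c("g7", "g8"))
M <- 3L

capacity <- 2L   # deliberately small here so that the doubling is exercised
all.triplets <- matrix(NA_character_, nrow = M, ncol = capacity)
used <- 0L       # number of columns filled so far

for (x in toy.list) {
  if (length(x) < M) next            # combn() fails when there are fewer than M ids
  new.cols <- combn(sort(x), M)      # sorted input gives sorted columns
  n.new <- ncol(new.cols)
  # Double the capacity until the new columns fit
  while (used + n.new > capacity) {
    all.triplets <- cbind(all.triplets,
                          matrix(NA_character_, nrow = M, ncol = capacity))
    capacity <- 2L * capacity
  }
  all.triplets[, used + seq_len(n.new)] <- new.cols
  used <- used + n.new
}

# Trim the unused columns once, at the end
all.triplets <- all.triplets[, seq_len(used), drop = FALSE]
```

With doubling, the object is reallocated only a handful of times rather than once per iteration, which is where the saving comes from.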

Patrick Burns
patrick_at_burns-stat.com
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")

Mark W Kimpel wrote:

>I have a list (length 750), each element containing a vector of unique
>strings (unique gene ids), with length up to ~40 (median 15). I want to
>compile a matrix of all possible triplets and their frequency within
>gene elements. Using combn and a lot of looping, I am accomplishing this
>but it is VERY slow.
>
>I've tried to figure out a way to vectorize this, using "match" and
>"%in%", but can't get my mind around it.
>
>Below is my code. sig.tf.pairs is the list. Suggestions?
>
>Mark
>
>
>############################################################
>M <- 3 # 3 for triplets, etc.
>##########################################################
># count all triplets
>all.triplets <- NULL
>all.count.vec <- NULL
>for (i in seq_along(sig.tf.pairs)){
>  if (length(sig.tf.pairs[[i]]) >= M){
>    triplets <- combn(sig.tf.pairs[[i]], M, simplify = TRUE)
>    # sort each column so a given triplet always has the same order
>    for (j in 1:ncol(triplets)){
>      o <- order(triplets[,j])
>      triplets[,j] <- triplets[o,j]
>    }
>    count.vec <- rep(1, ncol(triplets))
>    if (is.null(all.count.vec)){
>      all.count.vec <- count.vec
>      all.triplets <- triplets
>    } else {
>      redundant.vec <- NULL
>      for (k in 1:ncol(all.triplets)){
>        for (m in 1:ncol(triplets)){
>          # same triplet if all M ids are shared
>          if (length(intersect(triplets[,m], all.triplets[,k])) == M){
>            all.count.vec[k] <- all.count.vec[k] + 1
>            redundant.vec <- c(redundant.vec, m)
>          }
>        }
>      }
>      if (!is.null(redundant.vec)){
>        # drop = FALSE keeps a matrix even if only one column remains
>        triplets <- triplets[, -redundant.vec, drop = FALSE]
>        count.vec <- count.vec[-redundant.vec]
>      }
>      all.triplets <- cbind(all.triplets, triplets)
>      all.count.vec <- c(all.count.vec, count.vec)
>    }
>  }
>}
>###################################
>
>
>
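
For the counting task described in the quoted message, one possible way to avoid the nested comparison loops altogether is to turn each sorted triplet into a single string key and let table() do the tallying. This is only a sketch under assumptions: toy.list is made-up data standing in for sig.tf.pairs, triplet.keys is a hypothetical helper, and the separator "|" is assumed not to occur in any gene id.

```r
# Made-up stand-in for sig.tf.pairs (hypothetical data)
toy.list <- list(c("g3", "g1", "g2", "g4"), c("g2", "g3", "g1"), c("g9", "g1"))
M <- 3L

# Hypothetical helper: one key per triplet. Sorting the ids first means
# the same set of ids always produces the same key.
triplet.keys <- function(ids, m) {
  if (length(ids) < m) return(character(0))
  combn(sort(ids), m, FUN = paste, collapse = "|")
}

keys <- unlist(lapply(toy.list, triplet.keys, m = M))
counts <- table(keys)   # frequency of each distinct triplet across the list

# Recover a matrix with one triplet per column, in the same order as counts
triplet.mat <- t(do.call(rbind, strsplit(names(counts), "|", fixed = TRUE)))
```

Here counts plays the role of all.count.vec and triplet.mat the role of all.triplets. Whether this is fast enough depends on the total number of triplets, since every key is held in memory at once.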



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Thu 13 Mar 2008 - 16:40:21 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 13 Mar 2008 - 17:30:21 GMT.
