[R] choosing best 'match' for given factor

From: <Murali.Menon_at_avivainvestors.com>
Date: Thu, 31 Mar 2011 15:46:05 +0100


Folks,

I have a 'matching' matrix between variables A, X, L, O:

> a <- structure(c(1, 0.41, 0.58, 0.75, 0.41, 1, 0.6, 0.86, 0.58,
0.6, 1, 0.83, 0.75, 0.86, 0.83, 1), .Dim = c(4L, 4L), .Dimnames = list(
    c("A", "X", "L", "O"), c("A", "X", "L", "O")))

> a

     A    X    L    O
A 1.00 0.41 0.58 0.75
X 0.41 1.00 0.60 0.86
L 0.58 0.60 1.00 0.83
O 0.75 0.86 0.83 1.00

And I have a search vector of variables

> v <- c("X", "O")

I want to write a function bestMatch(searchvector, matchMat) that returns, for each variable in searchvector, the variable it matches best, subject to two restrictions: the match must be one of the variables to its left in the 'matching' matrix, and it must not itself be in searchvector.

So in the above example, although "X" has its highest match (0.86) with "O", I can't choose "O": it is to the right of "X", and it is also in the search vector v. That leaves "A", the only variable to the left of "X".

For "O", the highest match is "X" (0.86), but "X" is in the search vector, so I choose "L" (0.83), the best of the remaining variables to its left.

My function bestMatch(v, a) will then return c("A", "L")
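
To make the rule concrete, here is a minimal sketch that walks through the selection for "O" one step at a time. The names target, left, candidates and scores are only illustrative, and the matrix is rebuilt with matrix() so the snippet runs on its own.

```r
# Same 'matching' matrix as above (symmetric, so fill order does not matter)
a <- matrix(c(1,    0.41, 0.58, 0.75,
              0.41, 1,    0.60, 0.86,
              0.58, 0.60, 1,    0.83,
              0.75, 0.86, 0.83, 1),
            nrow = 4,
            dimnames = list(c("A", "X", "L", "O"), c("A", "X", "L", "O")))
v <- c("X", "O")

target <- "O"
vars <- colnames(a)
pos <- match(target, vars)                # position of "O" in the matrix: 4
left <- vars[seq_len(pos - 1)]            # variables to its left: "A" "X" "L"
candidates <- setdiff(left, v)            # drop search-vector members: "A" "L"
scores <- a[target, candidates]           # A = 0.75, L = 0.83
best <- candidates[which.max(scores)]     # "L"
```

Indexing candidates by which.max(scores), rather than taking names(scores), still works when only one candidate is left and R drops the names.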

My matrix a is quite large, and I have a long list of search vectors v, so I need an efficient method.

I wrote this:

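bestMatch <- function(searchvector, matchMat) {
    vars <- rownames(matchMat)
    # Variables in the search vector are never eligible as matches
    eligible <- !(vars %in% searchvector)
    sapply(searchvector, function(cc) {
        # Keep only eligible variables positioned to the left of cc;
        # seq_along() gives the row positions using base R only
        keep <- eligible & (seq_along(vars) < match(cc, vars))
        y <- matchMat[keep, cc, drop = FALSE]
        rownames(y)[which.max(y)]
    })
}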
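
Since the "left of" restriction is the same for every search vector, one possible variation is to apply it to the matrix once and reuse the result for each v. This is only a sketch, not benchmarked on large inputs: maskLeft and bestMatchMasked are hypothetical helper names, and the code assumes the rows and columns of the matrix are in the same order.

```r
# Blank out every entry whose row variable is not strictly left of the
# column variable. Done once per matrix.
maskLeft <- function(matchMat) {
    m <- matchMat
    m[!upper.tri(m)] <- -Inf
    m
}

# For one search vector: pull its columns from the masked matrix,
# blank out rows belonging to the search vector, and take the best row.
bestMatchMasked <- function(searchvector, masked) {
    sub <- masked[, searchvector, drop = FALSE]
    sub[rownames(masked) %in% searchvector, ] <- -Inf
    idx <- apply(sub, 2, which.max)
    best <- rownames(masked)[idx]
    # A column that is entirely -Inf has no eligible candidate: report NA
    best[!is.finite(sub[cbind(idx, seq_along(idx))])] <- NA
    names(best) <- searchvector
    best
}

a <- matrix(c(1,    0.41, 0.58, 0.75,
              0.41, 1,    0.60, 0.86,
              0.58, 0.60, 1,    0.83,
              0.75, 0.86, 0.83, 1),
            nrow = 4,
            dimnames = list(c("A", "X", "L", "O"), c("A", "X", "L", "O")))

masked <- maskLeft(a)                          # computed once
res <- bestMatchMasked(c("X", "O"), masked)    # X -> "A", O -> "L"
```

Unlike the sapply version, this returns NA for a variable with no eligible candidate (for example "A", which has nothing to its left), so the result is always a character vector.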

Any advice?

Thanks,

Murali



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Thu 31 Mar 2011 - 14:48:24 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 31 Mar 2011 - 17:40:26 GMT.
