Re: [R] One-to-one matching?

From: <Alec.Zwart_at_csiro.au>
Date: Tue, 24 Jun 2008 15:40:49 +1000


My thanks to Gabor Grothendieck, Charles C. Berry and Moshe Olshansky for their suggested solutions.

The upshot is that a nice one-line solution to my one-to-one exact matching problem is the Grothendieck-Berry collaboration:

   match(make.unique(matchSample), make.unique(lookupTable))

I've settled on this particular solution as it appears to be the fastest of the three possibilities given, although Moshe's solution comes a close second :-)
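
For anyone reading this in the archive, here is a short sketch of why the one-liner behaves as required. make.unique() leaves the first occurrence of a string alone and appends ".1", ".2", ... to later repeats, so the k-th copy of a value in the sample can only match the k-th copy in the table. The wrapper name oneToOneMatch is only an illustrative label (it is not a base R function), and the inputs are assumed to be character vectors, since make.unique() requires that.

```r
## Hypothetical wrapper around the one-liner (illustrative name only).
## Assumes x and table are character vectors.
oneToOneMatch <- function(x, table) {
  match(make.unique(x), make.unique(table))
}

## What make.unique() does to a repeated value:
make.unique(c("a", "a", "b", "d"))
## [1] "a"   "a.1" "b"   "d"

## Second "a" finds no partner, and no partial matching occurs:
oneToOneMatch(c("a", "a", "b", "d"), c("a", "b", "c", "d", "e", "aaaaaaaaf"))
## [1]  1 NA  2  4

## Duplicates present in both sample and table are paired off in order:
oneToOneMatch(c("a", "a", "c", "d"), c("a", "a", "c", "d", "e", "f"))
## [1] 1 2 3 4

## Sample elements missing from the table still give NA:
oneToOneMatch(c("a", "frog", "e", "d"), c("a", "a", "c", "d", "e", "f"))
## [1]  1 NA  5  4
```

One caveat, as far as I can tell: if the data can already contain strings that look like the generated names, a false match is possible. For example, oneToOneMatch(c("a", "a"), c("a", "a.1")) returns 1 2, because the second "a" is renamed to "a.1". Passing a separator that cannot occur in the data via the sep argument of make.unique() should avoid this.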

Many thanks...

Alec

On Sun, Jun 22, 2008 at 10:57 PM, <Alec.Zwart_at_csiro.au> wrote:
> Hi folks,
>
> Can anyone suggest an efficient way to do "matching without
> replacement", or "one-to-one matching"? pmatch() doesn't quite
> provide what I need...
>
> For example,
>
> lookupTable <- c("a","b","c","d","e","f")
> matchSample <- c("a","a","b","d")
> ##Normal match() behaviour:
> match(matchSample,lookupTable)
> [1] 1 1 2 4
>
> My problem here is that both "a"s in matchSample are matched to the
> same "a" in the lookup table. I need the elements of the lookup table
> to be excluded from the table as they are matched, so that no match
> can be found for the second "a".
>
> Function pmatch() comes close to what I need:
>
> pmatch(matchSample,lookupTable)
> [1] 1 NA 2 4
>
> Yep! However, pmatch() incorporates partial matching, which I
> definitely don't want:
>
> lookupTable <- c("a","b","c","d","e","aaaaaaaaf")
> matchSample <- c("a","a","b","d")
> pmatch(matchSample,lookupTable)
> [1] 1 6 2 4
> ## i.e. the second "a", matches "aaaaaaaaf" - I don't want this.
>
> Of course, when identical items ARE duplicated in both sample and
> lookup table, I need the matching to reflect this:
>
> lookupTable <- c("a","a","c","d","e","f")
> matchSample <- c("a","a","c","d")
> ##Normal match() behaviour
> match(matchSample,lookupTable)
> [1] 1 1 3 4
>
> No good - pmatch() is better:
>
> lookupTable <- c("a","a","c","d","e","f")
> matchSample <- c("a","a","c","d")
> pmatch(matchSample,lookupTable)
> [1] 1 2 3 4
>
> ...but we still have the partial matching issue...
>
> ##And of course, as per the usual behaviour of match(), sample
> elements missing from the lookup table should return NA:
>
> matchSample <- c("a","frog","e","d") ; print(matchSample)
> match(matchSample,lookupTable)
>
> Is there a nifty way to get what I'm after without resorting to a for
> loop? (my code's already got too blasted many of those...)
>
> Thanks,
>
> Alec Zwart
> CMIS CSIRO
> alec.zwart_at_csiro.au
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Received on Tue 24 Jun 2008 - 05:45:32 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 24 Jun 2008 - 09:30:56 GMT.
