Re: [R] Matched pairs with two data frames

From: David Winsemius <>
Date: Fri, 18 Apr 2008 13:29:16 +0000 (UTC)

Udo <> wrote in

> Daniel,
> thank you!
> I want to perform the simplest way of matching:
> a one-to-one exact match (by age and school):
> for every case in "treat" find ONE case (if there is one) in
> "control" . The cases in "control" that could be matched, should be
> tagged as not available or taken away (deleted) from the control
> pool (thus, the used ones are not replaced).
> #treatment group
> treat <- data.frame(age=c(1,1,2,2,2,4),
> school=c(10,10,20,20,20,11),
> out1=c(9.5,2.3,3.3,4.1,5.9,4.6))
> #control group
> control <- data.frame(age=c(1,1,1,1,3,2),
> school=c(10,10,10,10,33,20),
> out2=c(1.1,2,3.5,4.9,5.2,6.5))
> #one-to-one exact matching algorithm ????
> <- ?????
> In my example I matched the cases "by hand" to make things clear.
> Case 1 from "treat" was matched with case 1 from "control",
> 2 with 2 and 3 with 6. Case 4, 5 and 6 could not be matched,
> because there is no "partner" in "control" .
> Thus my matched example data frame has 3 cases.

Is it really the case that SPSS would give the output that you describe without any warnings about non-uniqueness? How could they live with themselves after such arbitrary behavior? This link is evidence that SPSS may not behave as you allege.

If you really want to persist in what cannot possibly be called "one-to-one exact matching", but rather "arbitrary convenience matching", then you need to construct a function that marches sequentially through "treat" and grabs the first match for each case, perhaps with something like:

> matched.first <- merge(treat[1,],control, by= c("age","school"))[1,]
> matched.first

  age school out1 out2
1   1     10  9.5  1.1

... except that the "1"s would be replaced with an index variable. You would then mark that control as "taken", perhaps using all of the variables as identifiers, and attempt the same match-and-mark step for each successive case among the controls for which taken == FALSE.
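
The loop described above might be sketched as follows. This is only one way to do it, not a tested general solution: the function name match_first_available, the logical vector "taken", and the extra column control_row are all hypothetical names introduced here, and the controls are compared directly on the key columns rather than through merge(). The data frames are the ones from the original question.

```r
# Data from the original question
treat <- data.frame(age    = c(1, 1, 2, 2, 2, 4),
                    school = c(10, 10, 20, 20, 20, 11),
                    out1   = c(9.5, 2.3, 3.3, 4.1, 5.9, 4.6))
control <- data.frame(age    = c(1, 1, 1, 1, 3, 2),
                      school = c(10, 10, 10, 10, 33, 20),
                      out2   = c(1.1, 2, 3.5, 4.9, 5.2, 6.5))

# Hypothetical helper: first-available matching without replacement.
# The result depends on the row order of both data frames.
match_first_available <- function(treat, control, by = c("age", "school")) {
  taken   <- rep(FALSE, nrow(control))       # controls already used
  partner <- rep(NA_integer_, nrow(treat))   # control row matched to each case

  for (i in seq_len(nrow(treat))) {
    # TRUE for controls that are still free and agree on every key variable
    ok <- !taken
    for (v in by) ok <- ok & control[[v]] == treat[[v]][i]

    j <- which(ok)[1]                        # first candidate; NA if none
    if (!is.na(j)) {
      partner[i] <- j
      taken[j]   <- TRUE                     # this control is now unavailable
    }
  }

  hit <- !is.na(partner)
  # Matched treated cases, the control row used, and the control's other columns
  data.frame(treat[hit, , drop = FALSE],
             control_row = partner[hit],
             control[partner[hit], setdiff(names(control), by), drop = FALSE],
             row.names = NULL)
}

matched <- match_first_available(treat, control)
matched
```

With the example data this pairs treated cases 1, 2 and 3 with control rows 1, 2 and 6 and leaves cases 4, 5 and 6 unmatched, which is the hand-matched result described in the question. Reordering either data frame can change which controls are picked, which is exactly the arbitrariness noted above.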

David Winsemius

Received on Fri 18 Apr 2008 - 13:32:07 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 21 Apr 2008 - 07:30:32 GMT.
