Re: [R] Matched pairs with two data frames

From: David Winsemius <dwinsemius_at_comcast.net>
Date: Fri, 18 Apr 2008 13:29:16 +0000 (UTC)

Udo <ukoenig_at_med.uni-marburg.de> wrote in news:1208462659.4807ad43cea9d_at_webmail.med.uni-marburg.de:

> Daniel,
> thank you!
>
> I want to perform the simplest kind of matching:
> a one-to-one exact match (by age and school):
> for every case in "treat" find ONE case (if there is one) in
> "control" . The cases in "control" that could be matched, should be
> tagged as not available or taken away (deleted) from the control
> pool (thus, the used ones are not replaced).
>
> #treatment group
> treat <- data.frame(age=c(1,1,2,2,2,4),
> school=c(10,10,20,20,20,11),
> out1=c(9.5,2.3,3.3,4.1,5.9,4.6))
>
> #control group
> control <- data.frame(age=c(1,1,1,1,3,2),
> school=c(10,10,10,10,33,20),
> out2=c(1.1,2,3.5,4.9,5.2,6.5))
>
> #one-to-one exact matching algorithm ????
>
> matched.data.frame <- ?????
>
> In my example I matched the cases "by hand" to make things clear.
> Case 1 from "treat" was matched with case 1 from "control",
> 2 with 2 and 3 with 6. Case 4, 5 and 6 could not be matched,
> because there is no "partner" in "control" .
> Thus my matched example data frame has 3 cases.

Is it really the case that SPSS would give the output that you describe without any warnings about non-uniqueness? How could they live with themselves after such arbitrary behavior? This link is evidence that SPSS may not behave as you allege.
<http://kb.iu.edu/data/afit.html>

If you really want to persist in what cannot possibly be called "one-to-one exact matching", but rather "arbitrary convenience matching", then you need to construct a function that marches sequentially through "treat" and grabs the first match for each case, perhaps with something like:

> matched.first <- merge(treat[1,],control, by= c("age","school"))[1,]
> matched.first

  age school out1 out2
1   1     10  9.5  1.1

... except that the "1" in treat[1,] would be replaced with an index variable. You would then mark that control as "taken" (perhaps using all of its variables as identifiers) and attempt the same match-and-mark step for each successive case, searching only among the controls where taken == FALSE.
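
The loop described above might be sketched as follows. This is only one way to do it, not a tested recipe: it uses a logical vector "taken" and plain row indexing rather than repeated calls to merge(), and the function name greedy_match() and its arguments are made up for illustration. Ties are broken by row order in "control", so the result is exactly as arbitrary as that ordering.

```r
# Example data from the original question
treat <- data.frame(age    = c(1, 1, 2, 2, 2, 4),
                    school = c(10, 10, 20, 20, 20, 11),
                    out1   = c(9.5, 2.3, 3.3, 4.1, 5.9, 4.6))
control <- data.frame(age    = c(1, 1, 1, 1, 3, 2),
                      school = c(10, 10, 10, 10, 33, 20),
                      out2   = c(1.1, 2, 3.5, 4.9, 5.2, 6.5))

# greedy_match() is a hypothetical helper, not a function from any package.
# Assumes the key columns named in 'by' contain no missing values.
greedy_match <- function(treat, control, by = c("age", "school")) {
  taken   <- rep(FALSE, nrow(control))        # controls already used
  partner <- rep(NA_integer_, nrow(treat))    # control row paired with each treat row
  for (i in seq_len(nrow(treat))) {
    # controls that are still free and equal treat[i, ] on every key column
    ok <- !taken
    for (k in by) ok <- ok & control[[k]] == treat[[k]][i]
    j <- which(ok)[1]                         # first free match, NA if there is none
    if (!is.na(j)) {
      partner[i] <- j
      taken[j]   <- TRUE                      # used controls are not replaced
    }
  }
  hit <- !is.na(partner)
  # keep only matched treat rows and append the non-key control columns
  cbind(treat[hit, , drop = FALSE],
        control[partner[hit], setdiff(names(control), by), drop = FALSE])
}

matched.data.frame <- greedy_match(treat, control)
matched.data.frame
```

With the example data this pairs treat cases 1, 2 and 3 with controls 1, 2 and 6, the same three pairs as the hand-matched example; treat cases 4, 5 and 6 are dropped because no unused control shares their age and school.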

-- 
David Winsemius

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Fri 18 Apr 2008 - 13:32:07 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 21 Apr 2008 - 07:30:32 GMT.
