Re: [R] merging/intersecting 2 data frames

From: Greg Snow <Greg.Snow_at_imail.org>
Date: Tue, 29 Jun 2010 15:48:09 -0600

Use the merge function. Look at the by.x and by.y arguments to match PATIENT_ID in a.df against ID in b.df, and at the all.x, all.y, and suffixes arguments. With the defaults (all.x = FALSE, all.y = FALSE), rows without a match are dropped, which is what you describe. You may need to delete some columns after the merge (or replace missing values in one column with those in the same location from another column; see the ifelse function). So it may take a couple of steps, but that is probably the most straightforward approach.
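
A minimal sketch of that approach, assuming small stand-in data frames. The a.df rows are copied from the printout quoted below; the b.df rows are made up for illustration so that two IDs match, and are not the poster's real data. The names merged and left are likewise only illustrative.

```r
# Stand-in for a.df: first four rows of the quoted printout.
a.df <- data.frame(
  DATE       = rep("4/16/2009", 4),
  GENDER     = c("F", "F", "M", "M"),
  PATIENT_ID = c(23686, 13840, 12895, 18375),
  AGE        = c(45, 35, 30, 33),
  SYNDROME   = c("RASH ON BODY", "CANT URINATE",
                 "BLURRED VISION", "UNABLE TO VOID"),
  stringsAsFactors = FALSE
)

# Stand-in for b.df: HYPOTHETICAL rows, invented so that two IDs match
# a.df$PATIENT_ID and one (99999) does not.
b.df <- data.frame(
  DATE_OF_DEATH = c("4/19/2009", "4/20/2009", "4/20/2009"),
  ID            = c(13840, 18375, 99999),
  stringsAsFactors = FALSE
)

# Inner join: keep only rows whose PATIENT_ID appears in b.df$ID.
# The key column keeps the name from the first data frame (PATIENT_ID).
merged <- merge(a.df, b.df, by.x = "PATIENT_ID", by.y = "ID")

# Left join, for comparison: keep every row of a.df and fill
# DATE_OF_DEATH with NA where there is no match.
left <- merge(a.df, b.df, by.x = "PATIENT_ID", by.y = "ID", all.x = TRUE)

print(merged)
print(left)
```

By default merge also sorts the result on the key column; pass sort = FALSE if the original row order matters more than key order.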

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow_at_imail.org
801.408.8111



> -----Original Message-----
> From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org] On Behalf Of Erin Hodgess
> Sent: Tuesday, June 29, 2010 1:22 PM
> To: R help
> Subject: [R] merging/intersecting 2 data frames
>
> Dear R People:
>
> I have two data frames, a.df and b.df as seen here:
>
> > a.df[1:10,]
>         DATE GENDER PATIENT_ID AGE             SYNDROME
> 1  4/16/2009      F      23686  45         RASH ON BODY
> 2  4/16/2009      F      13840  35         CANT URINATE
> 3  4/16/2009      M      12895  30       BLURRED VISION
> 4  4/16/2009      M      18375  33       UNABLE TO VOID
> 5  4/16/2009      M       2237  44         SOB WEAKNESS
> 6  4/16/2009      F      21484  41 TOOTH PAINTOOTH PAIN
> 7  4/16/2009      M      10783  37          RT ARM PAIN
> 8  4/16/2009      M      12610  65        L FOOT INJURY
> 9  4/16/2009      F       3495  29 URINARY DIFFICULTIES
> 10 4/16/2009      F        351  36           PT STS MVA
> > b.df[1:10,]
>    DATE_OF_DEATH    ID
> 1      4/19/2009 21676
> 2      4/19/2009 13717
> 3      4/19/2009 20498
> 4      4/19/2009 14281
> 5      4/19/2009 38848
> 6      4/20/2009   331
> 7      4/20/2009  4084
> 8      4/20/2009 19616
> 9      4/20/2009 17965
> 10     4/20/2009 11863
> >
>
> a.df will always be larger than b.df.
>
> I want to create a third data frame that is matched on PATIENT_ID from
> a.df and ID from b.df.
>
> If there is no match from a.df$PATIENT_ID to b.df$ID, then we omit the
> row from the new data.frame.
>
> If there is a match, we include the DATE_OF_DEATH column from b.df.
>
> I've tried all kinds of tricks, but nothing works exactly as I wish.
>
> Thanks in advance,
> Sincerely,
> Erin
>
>
> --
> Erin Hodgess
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: erinm.hodgess_at_gmail.com
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Received on Tue 29 Jun 2010 - 21:51:20 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.