Re: [R] Compare two data sets

From: jim holtman <jholtman_at_gmail.com>
Date: Tue, 25 Mar 2008 21:37:06 -0500

Here is one way to find the common rows. You can then use the keys that come back to reconstruct a new data frame:

> f1 <- read.table(textConnection("V1 V2
+ YBL064C YBR067C
+ YBL064C YBR204C
+ YBL064C YDR368W
+ YBL064C YJL067W
+ YBL064C YPR160W
+ YBR053C YGL089C
+ YBR053C YHR113W
+ YBR053C YNL328C"), header=TRUE)
>
> f2 <- read.table(textConnection("V1 V2
+ YBL064C YBR067C
+ YBL064C YBR204C
+ YBL064C YDR368W"), header=TRUE)
>
> f1$key <- paste(f1$V1, f1$V2)
> f2$key <- paste(f2$V1, f2$V2)
>
> # now find the ones in common
> intersect(f1$key, f2$key)
[1] "YBL064C YBR067C" "YBL064C YBR204C" "YBL064C YDR368W"
>
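
To get from the keys back to a data frame, something along these lines should work. This is only a sketch on the same example data: the names 'common' and 'common2' are made up for illustration, and read.table(text = ...) is used here in place of textConnection() purely for brevity. The merge() call is an alternative that skips the key column altogether, since by default it joins on all column names the two data frames share.

```r
# Same example data as in the transcript, read from inline text
f1 <- read.table(text = "V1 V2
YBL064C YBR067C
YBL064C YBR204C
YBL064C YDR368W
YBL064C YJL067W
YBL064C YPR160W
YBR053C YGL089C
YBR053C YHR113W
YBR053C YNL328C", header = TRUE, stringsAsFactors = FALSE)

f2 <- read.table(text = "V1 V2
YBL064C YBR067C
YBL064C YBR204C
YBL064C YDR368W", header = TRUE, stringsAsFactors = FALSE)

# Build one key per row by pasting the two columns together
f1$key <- paste(f1$V1, f1$V2)
f2$key <- paste(f2$V1, f2$V2)

# Keep the rows of f1 whose key also occurs in f2,
# then drop the helper 'key' column again
common <- f1[f1$key %in% f2$key, c("V1", "V2")]

# Alternative without a key column: merge() joins on all
# shared column names (here V1 and V2) by default
common2 <- merge(f1[, c("V1", "V2")], f2[, c("V1", "V2")])
```

Both 'common' and 'common2' should hold the three rows shared by the two data sets. If the values themselves could contain spaces, a separator that cannot occur in the data (passed via the 'sep' argument of paste) would be a safer choice for the key.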

On Tue, Mar 25, 2008 at 9:18 PM, Suhaila Zainudin <suhaila.zainudin_at_gmail.com> wrote:
> Hi,
>
> I have a similar query (how to compare 2 datasets), but my dataset is a bit
> different.
> I want to compare each row in dataset 1 with the rows in dataset 2 and get the
> rows that are common to both datasets.
>
> For example;
>
> I have a file (named mysample).
>
> V1 V2
> YBL064C YBR067C
> YBL064C YBR204C
> YBL064C YDR368W
> YBL064C YJL067W
> YBL064C YPR160W
> YBR053C YGL089C
> YBR053C YHR113W
> YBR053C YNL328C
>
> And I have another file (myref) as follows
>
> V1 V2
> YBL064C YBR067C
> YBL064C YBR204C
> YBL064C YDR368W
>
>
> When I try to intersect the two files, I get a NULL data frame.
>
> > intersect(myref,mysample)
> NULL data frame with 0 rows
>
> What I am hoping to get out of intersect for the above files is
>
> YBL064C YBR067C
> YBL064C YBR204C
> YBL064C YDR368W
>
> Are there any R functions that can achieve what I want to do?
> Or should I merge the data which is currently in 2 columns into single
> column and use intersect again?
>
> Thanks for any feedback!
>
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

Received on Wed 26 Mar 2008 - 05:38:40 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 26 Mar 2008 - 08:30:25 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.
