Re: [R] Merging files function

From: Gavin Simpson <gavin.simpson_at_ucl.ac.uk>
Date: Fri 02 Jun 2006 - 00:39:18 EST

On Thu, 2006-06-01 at 09:16 -0400, Chuck Cleland wrote:
> Ahamarshan jn wrote:
> > hi list,
> >
> > This question must be very basic but I am just 3 days
> > old to R, so I think i can ask. I am trying to find a
> > function to merge two
> > tables of data in two different files as one.
> >
> > Does merge function only fills in colums between two
> > table where data is missing or is there a way that
> > merge can be used to merge data between two matrixes
> > with common dimensions.
<snip />

Hi Ahamarshan,

I asked a similar question recently. Chuck's email provides a solution, to which I'll add a comment and a link to the discussions I had with Marc Schwartz and Sundar Dorai-Raj.

If your real-world use is more complicated than your example, you'll need a slightly different strategy. Suppose, for example, that your two data frames have different numbers of rows:

# alter Chuck's example to have one df with 5 rows the other with 4

df1 <- as.data.frame(matrix(rnorm(20), ncol=4))
df2 <- as.data.frame(matrix(rnorm(20), ncol=5))
names(df1) <- paste("v", 1:4, sep="")
names(df2) <- paste("x", 1:5, sep="")

row.names(df1) <- paste("h", 1:5, sep="")
row.names(df2) <- paste("h", 1:4, sep="")

Now if you merge these, merge() returns only the 4 rows whose names appear in both data frames.

merge(df1, df2, by="row.names")
## lost a row, now with all rows:
merge(df1, df2, by="row.names", all = TRUE)

So use all = TRUE if the two data frames do not share the same set of row names; rows missing from one of them are then filled with NA.
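
To make the difference concrete, here is a minimal sketch that rebuilds the two data frames above and compares both merges. The set.seed() call and the object names inner and full are my own additions for illustration; the last two lines show one way (not the only one) to turn the Row.names column that merge() creates back into real row names.

```r
set.seed(1)  # assumption: fixed seed only so the sketch is reproducible

df1 <- as.data.frame(matrix(rnorm(20), ncol = 4))  # 5 rows, 4 columns
df2 <- as.data.frame(matrix(rnorm(20), ncol = 5))  # 4 rows, 5 columns
names(df1) <- paste("v", 1:4, sep = "")
names(df2) <- paste("x", 1:5, sep = "")
row.names(df1) <- paste("h", 1:5, sep = "")
row.names(df2) <- paste("h", 1:4, sep = "")

# default merge keeps only row names present in both data frames
inner <- merge(df1, df2, by = "row.names")
# all = TRUE keeps every row name found in either data frame
full <- merge(df1, df2, by = "row.names", all = TRUE)

nrow(inner)
nrow(full)

# h5 exists only in df1, so its x1..x5 entries are NA
full[full$Row.names == "h5", ]

# merge() stores the key in a column called Row.names;
# move it back into the row names and drop the column
rownames(full) <- full$Row.names
full$Row.names <- NULL
```

If you only want to keep every row of the first data frame, all.x = TRUE does that; all.y = TRUE is the equivalent for the second.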

For more complicated merges, you might check out the replies I got from Marc and Sundar in the following thread:

http://thread.gmane.org/gmane.comp.lang.r.general/63031/focus=63042

Finally, why did you post your message to the list twice, with different subject lines?

HTH, G

>
> See ?merge.
>
> df1 <- as.data.frame(matrix(rnorm(20), ncol=4))
> df2 <- as.data.frame(matrix(rnorm(20), ncol=4))
> names(df1) <- paste("v", 1:4, sep="")
> names(df2) <- paste("x", 1:4, sep="")
> row.names(df1) <- paste("h", 1:5, sep="")
> row.names(df2) <- paste("h", 1:5, sep="")
>
> newdf <- merge(df1, df2, by="row.names")

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
*  Note new Address, Telephone & Fax numbers from 6th April 2006  *
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson
ECRC & ENSIS                  [t] +44 (0)20 7679 0522
UCL Department of Geography   [f] +44 (0)20 7679 0565
Pearson Building              [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street                  [w] http://www.ucl.ac.uk/~ucfagls/cv/
London, UK.                   [w] http://www.ucl.ac.uk/~ucfagls/
WC1E 6BT.
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Fri Jun 02 02:27:03 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Fri 02 Jun 2006 - 04:10:27 EST.