Re: [R] data frame question

From: David Winsemius <dwinsemius_at_comcast.net>
Date: Sun, 10 Feb 2008 20:52:03 +0000 (UTC)

joseph <jdsandjd_at_yahoo.com> wrote in
news:109232.80965.qm_at_web36905.mail.mud.yahoo.com:

> I have 2 data frames df1 and df2. I would like to create a
> new data frame new_df which will contain only the common rows based
> on the first 2 columns (chrN and start). The column score in the new
> data frame should be replaced with a column containing the average
> score (average_score) from df1 and df2.
>

> df1= data.frame(chrN= c("chr1", "chr1", "chr1", "chr1", "chr2",
> "chr2", "chr2"),
> start= c(23, 82, 95, 108, 95, 108, 121),
> end= c(33, 92, 105, 118, 105, 118, 131),
> score= c(3, 6, 2, 4, 9, 2, 7))
>
> df2= data.frame(chrN= c("chr1", "chr2", "chr2", "chr2" , "chr2"),
> start= c(23, 50, 95, 20, 121),
> end= c(33, 60, 105, 30, 131),
> score= c(9, 3, 7, 7, 3))

Clunky to be sure, but this worked for me:

df3 <- merge(df1,df2,by=c("chrN","start")) # same-named non-key columns get auto-relabeled with .x/.y

df3$avg.scr <- with(df3, (score.x+score.y)/2) # or rowMeans(cbind(score.x, score.y))

df3 <- df3[,c("chrN","start","avg.scr")] # drops the variables not of interest

df3
  chrN start avg.scr
1 chr1    23       6
2 chr2   121       5
3 chr2    95       8
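
For anyone who wants to paste and run the whole thing, here is a self-contained sketch of the same merge-then-average idea. It is only one way to do it, not a definitive solution: the intermediate name `both` and the suffixes ".1"/".2" are my own illustrative choices, and the final ordering by chrN and numeric start is an extra step I am assuming is wanted.

```r
# Example data copied from the question.
df1 <- data.frame(chrN  = c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2", "chr2"),
                  start = c(23, 82, 95, 108, 95, 108, 121),
                  end   = c(33, 92, 105, 118, 105, 118, 131),
                  score = c(3, 6, 2, 4, 9, 2, 7))
df2 <- data.frame(chrN  = c("chr1", "chr2", "chr2", "chr2", "chr2"),
                  start = c(23, 50, 95, 20, 121),
                  end   = c(33, 60, 105, 30, 131),
                  score = c(9, 3, 7, 7, 3))

# Inner join on the two key columns; only rows present in both frames survive.
# 'both' and the suffixes are illustrative names, not anything from the question.
both <- merge(df1, df2, by = c("chrN", "start"), suffixes = c(".1", ".2"))

# Row-wise mean of the two score columns.
both$average_score <- rowMeans(both[, c("score.1", "score.2")])

# Keep the keys and the average, ordered by chromosome and then numeric start.
new_df <- both[order(both$chrN, both$start), c("chrN", "start", "average_score")]
rownames(new_df) <- NULL
new_df
```

rowMeans() generalizes more easily than (a+b)/2 if a third data frame is merged in later, and the explicit order() call avoids relying on whatever row order merge() happens to return.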

-- 
David Winsemius

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Sun 10 Feb 2008 - 20:59:34 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sun 10 Feb 2008 - 22:30:13 GMT.
