Re: [R] subsetting a dataframe

From: William Dunlap <wdunlap_at_tibco.com>
Date: Fri, 04 Jun 2010 14:21:27 -0700

> -----Original Message-----
> From: r-help-bounces_at_r-project.org
> [mailto:r-help-bounces_at_r-project.org] On Behalf Of yjmha69
> Sent: Friday, June 04, 2010 12:28 PM
> To: R-help_at_r-project.org
> Subject: [R] subsetting a dataframe
>
> Hi there,
> > a<-data.frame(c(1,2,2,5,9,9),c("A","B","C","D","E","F"))
> > names(a)<-c("x1","x2")
> > max(table(a$x1))
> [1] 2
> >
> The above shows that the max count for x1 is 2, which is correct,
> but it doesn't tell us that
> two groups meet this criterion: 2,2 and 9,9.
> I then want to extract the records that have the highest count:
> > a[max(table(a$x1)),]
>   x1 x2
> 2  2  B
> This is not working, since it is equivalent to a[2,]
> What I want is
>   x1 x2
> 2  2  B
> 3  2  C
> 5  9  E
> 6  9  F
>
> I think this should be very easy, but I'm a beginner :-)

One way is to use merge to combine your table with the original data.frame:
  > tmp <- merge(a, as.data.frame(table(a$x1)), by.x="x1", by.y="Var1")
  > tmp[tmp$Freq==max(tmp$Freq), , drop=FALSE]
Another way is to use ave:
  > tmp <- with(a, ave(x1, x1, FUN=length))
  > a[tmp==max(tmp), , drop=FALSE]
Another way is to match on the names of the table:
  > tmp <- table(a$x1)
  > a[as.character(a$x1) %in% names(tmp[tmp==max(tmp)]), , drop=FALSE]
(This last one might have problems when some values in a$x1 are different
but so close that as.character() makes the same string out of them.)
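For anyone who wants to paste and run the above, here is a minimal self-contained sketch of the three approaches on the data from the question. The names freq, m, n, tab and the res_* objects are illustrative only and not from the original post. The third approach assumes that the object being indexed by name is the frequency table from table(a$x1), since the result of ave() carries no names.

```r
# Example data from the original question
a <- data.frame(x1 = c(1, 2, 2, 5, 9, 9),
                x2 = c("A", "B", "C", "D", "E", "F"))

# 1. merge(): attach each value's frequency to its rows, then filter
freq <- as.data.frame(table(a$x1))      # columns Var1 (factor) and Freq
m <- merge(a, freq, by.x = "x1", by.y = "Var1")
res_merge <- m[m$Freq == max(m$Freq), , drop = FALSE]

# 2. ave(): group size for every row, in the same order as the rows of a
n <- with(a, ave(x1, x1, FUN = length))
res_ave <- a[n == max(n), , drop = FALSE]

# 3. table names: keep rows whose value, as a string, is a most frequent level
#    (assumption: the table is recomputed here, because names() is needed)
tab <- table(a$x1)
res_tab <- a[as.character(a$x1) %in% names(tab[tab == max(tab)]), , drop = FALSE]

print(res_ave)   # rows 2, 3, 5 and 6 of a

# The caveat about as.character(): two doubles can differ and still
# convert to the same string, because only 15 significant digits are kept
x <- c(0.3, 0.1 + 0.2)
x[1] == x[2]                               # FALSE
as.character(x[1]) == as.character(x[2])   # TRUE
```

The ave() version keeps the original row names and column set, while the merge() version returns an extra Freq column and rows ordered by the key, so which one fits best depends on what is done with the result afterwards.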

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

>
> Thanks
>
> YJM
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Received on Fri 04 Jun 2010 - 21:24:42 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 04 Jun 2010 - 21:30:27 GMT.

