Re: [R] Sorting dataframe by number of occurrences of factor

From: Petr Savicky <savicky_at_praha1.ff.cuni.cz>
Date: Sat, 30 Apr 2011 11:00:59 +0200

On Fri, Apr 29, 2011 at 11:17:58PM -0700, adigs wrote:
> Apologies for what's probably quite simple, but I'm having some problems with
> sorting a data frame by the number of occurrences of each level of a factor.
>
> df<-data.frame(id=c(1:20),name=c('a','b','b','c','a','d','b','e','d','d','c','a','b','a','a','b','f','b','c','g'))
>
> I want to sort the dataframe so that the values of df$name that occur most
> often are at the bottom, i.e. in the order:
>
> attributes(sort(summary(df$name)))$name = "e" "f" "g" "c" "d" "a" "b":
>
> > sort(summary(df$name))
> e f g c d a b
> 1 1 1 3 3 5 6
>
> So the desired result is:
>
> id name
> 8 e
> 17 f
> 20 g
> 4 c
> 11 c
> 19 c
> 6 d
> 9 d
> 10 d
> 1 a
> 5 a
> 12 a
> 14 a
> 15 a
> 2 b
> 3 b
> 7 b
> 13 b
> 16 b
> 18 b

Hi.

Try the following

  freq <- ave(rep(1, times=nrow(df)), df$name, FUN=sum)
  df[order(freq, df$name), ]
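
For readers who want to see why this works, here is a commented sketch of the same approach run on the data from the question. The names `sorted`, `tab` and `sorted2` are illustrative only, and `stringsAsFactors = TRUE` is an assumption added so that `name` is a factor on R 4.0.0 and later, where it is no longer the default. The `table()` variant at the end is an alternative way to get the per-row counts, not part of the original suggestion.

```r
# Data from the question. stringsAsFactors = TRUE is an assumption here:
# it keeps `name` a factor on R >= 4.0.0, as it was by default in 2011.
df <- data.frame(
  id = 1:20,
  name = c('a','b','b','c','a','d','b','e','d','d',
           'c','a','b','a','a','b','f','b','c','g'),
  stringsAsFactors = TRUE
)

# ave() applies FUN within each group of df$name and returns a vector
# as long as its input, so every row receives the size of its own group.
freq <- ave(rep(1, times = nrow(df)), df$name, FUN = sum)

# Sort by group size first; df$name breaks ties between groups of equal
# size, and rows within one group keep their original relative order.
sorted <- df[order(freq, df$name), ]

# Alternative sketch: look up each row's count in a frequency table.
tab <- table(df$name)
sorted2 <- df[order(as.integer(tab[as.character(df$name)]), df$name), ]

print(sorted)
```

Both `sorted` and `sorted2` should list the rows in the order requested in the question, with the least frequent names first and "b" last.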

Hope this helps.

Petr Savicky.



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Sat 30 Apr 2011 - 09:04:27 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sat 30 Apr 2011 - 17:40:34 GMT.
