On Fri, Apr 29, 2011 at 11:17:58PM -0700, adigs wrote:

> Apologies for what's probably quite simple, but I'm having some problems with
> sorting a data frame by the number of occurrences of each level of a factor.
>
> df<-data.frame(id=c(1:20),name=c('a','b','b','c','a','d','b','e','d','d','c','a','b','a','a','b','f','b','c','g'))
>
> I want to sort the dataframe so that the values of df$name that occur most
> often are at the bottom - ie. in the order:
>
> attributes(sort(summary(df$name)))$name = "e" "f" "g" "c" "d" "a" "b":
>
> > sort(summary(df$name))
> e f g c d a b
> 1 1 1 3 3 5 6
>
> So the desired result is:
>
> id name
> 8 e
> 17 f
> 20 g
> 4 c
> 11 c
> 19 c
> 6 d
> 9 d
> 10 d
> 1 a
> 5 a
> 12 a
> 14 a
> 15 a
> 2 b
> 3 b
> 7 b
> 13 b
> 16 b
> 18 b

Hi.

Try the following

freq <- ave(rep(1, times=nrow(df)), df$name, FUN=sum)
df[order(freq, df$name), ]
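
To check the idea on the data from the question, here is a self-contained sketch. It rebuilds df, computes the per-row frequency with ave(), and, as a cross-check, derives the same counts from table(). The table() variant and the names counts, freq2 and sorted are only illustrative additions, not something the original question requires.

```r
# Data frame from the original question
df <- data.frame(id = 1:20,
                 name = c('a','b','b','c','a','d','b','e','d','d',
                          'c','a','b','a','a','b','f','b','c','g'))

# For every row, the number of rows sharing the same name
freq <- ave(rep(1, times = nrow(df)), df$name, FUN = sum)

# Illustrative alternative: count once with table(), then look up each row
counts <- table(df$name)
freq2 <- as.vector(counts[as.character(df$name)])

# Sort by frequency; ties between names are broken by the name itself
sorted <- df[order(freq, df$name), ]
```

Rows with the same name keep their original relative order, so the ids stay increasing within each group.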

Hope this helps.

Petr Savicky.

