From: Petr Savicky <savicky_at_praha1.ff.cuni.cz>

Date: Sat, 30 Apr 2011 11:00:59 +0200

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Sat 30 Apr 2011 - 09:04:27 GMT


On Fri, Apr 29, 2011 at 11:17:58PM -0700, adigs wrote:

> Apologies for what's probably quite simple, but I'm having some problems with
> sorting a data frame by the number of occurrences of each level of a factor.
>
> df<-data.frame(id=c(1:20),name=c('a','b','b','c','a','d','b','e','d','d','c','a','b','a','a','b','f','b','c','g'))
>
> I want to sort the dataframe so that the values of df$name that occur most
> often are at the bottom - ie. in the order:
>
> attributes(sort(summary(df$name)))$name = "e" "f" "g" "c" "d" "a" "b":
>
> > sort(summary(df$name))
> e f g c d a b
> 1 1 1 3 3 5 6
>
> So the desired result is:
>
> id name
> 8 e
> 17 f
> 20 g
> 4 c
> 11 c
> 19 c
> 6 d
> 9 d
> 10 d
> 1 a
> 5 a
> 12 a
> 14 a
> 15 a
> 2 b
> 3 b
> 7 b
> 13 b
> 16 b
> 18 b

Hi.

Try the following

  freq <- ave(rep(1, times=nrow(df)), df$name, FUN=sum)
  df[order(freq, df$name), ]
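For anyone who wants to see why this works, here is a minimal sketch that runs the two lines on the data from the question. The stringsAsFactors=TRUE argument, the table() lookup (freq2) and the variable name sorted are assumptions added for illustration; they are not part of the original suggestion.

```r
# Sample data from the question. stringsAsFactors = TRUE is an assumption,
# added so that name is a factor on R >= 4.0 as well.
df <- data.frame(id = 1:20,
                 name = c('a','b','b','c','a','d','b','e','d','d',
                          'c','a','b','a','a','b','f','b','c','g'),
                 stringsAsFactors = TRUE)

# ave() returns a vector as long as its input: summing a vector of ones
# within each level of name gives every row the size of its own group.
freq <- ave(rep(1, times = nrow(df)), df$name, FUN = sum)

# Illustrative alternative: look the per-level counts up in table().
freq2 <- as.vector(table(df$name)[as.character(df$name)])

# Sort by frequency first, then by name to separate levels that occur
# equally often (e, f, g and c, d).
sorted <- df[order(freq, df$name), ]
print(sorted, row.names = FALSE)
```

Since order() leaves unresolved ties in their original order, rows within one level should stay sorted by id, as in the desired result.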

Hope this helps.

Petr Savicky.


