From: Matthew Dowle <mdowle_at_mdowle.plus.com>

Date: Thu, 12 May 2011 16:23:51 +0100

With data.table, the following is routine :

DT[order(a)] # ascending
DT[order(-a)] # descending, if a is numeric
DT[a>5,sum(z),by=c][order(-V1)] # sum of z group by c, just where a>5, then show me the largest first

DT[order(-a,b)] # order by a descending then by b ascending, if a and b are both numeric

It avoids peppering your code with $, and becomes quite natural after a short while; especially compound queries such as the 3rd example.

Matthew

http://datatable.r-forge.r-project.org/

"Ivan Calandra" <ivan.calandra_at_uni-hamburg.de> wrote in message
news:4DCBEC8B.6040806_at_uni-hamburg.de...

I was wondering whether it would be possible to make a method for
data.frame with sort().

I think it would be more intuitive than using the complex construction
of df[order(df$a),]

Is there any reason not to make it?
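
For illustration only, here is a minimal sketch of what such a method might look like. The name sort.data.frame follows R's S3 naming for the sort() generic; the `by` argument is a hypothetical addition for this sketch, not something base R provides, and the sample data frame is made up.

```r
# Hypothetical S3 method for the sort() generic; 'by' is an assumed
# argument naming the column(s) to sort on, not something base R provides.
sort.data.frame <- function(x, decreasing = FALSE, by = 1, ...) {
  # Each 'by' column becomes one sort key for order(); unname() keeps
  # column names from being mistaken for order()'s own arguments.
  keys <- unname(as.list(x[, by, drop = FALSE]))
  idx <- do.call(order, c(keys, list(decreasing = decreasing)))
  # Use the indices as row indices, exactly as in df[order(df$a), ]
  x[idx, , drop = FALSE]
}

df <- data.frame(a = c(3, 1, 2), b = c("x", "y", "z"))
sorted <- sort(df, by = "a")   # same rows as df[order(df$a), ]
sorted
```

Under these assumptions, sort(df, by = "a") is only a thin wrapper around the order() idiom; it hides the row indexing but does nothing else.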

Ivan

On 5/12/2011 15:40, Marc Schwartz wrote:

> On May 12, 2011, at 8:09 AM, John Kane wrote:

>
>> Argh. I knew it was at least partly obvious. I never have been able to
>> read the order() help page and understand what it is saying.
>>
>> Thanks very much.
>>
>> By the way, to me it is counter-intuitive that the command is
>>
>> > df1[order(df1[,2],decreasing=TRUE),]
>>
>> For some reason I keep expecting it to be
>> order( , df1[,2],decreasing=TRUE)
>>
>> So clearly I don't understand what is going on, but at least I am a lot
>> better off. I may be able to get this graph to work.
>
> John,
>
> Perhaps it may be helpful to understand that order() does not actually
> sort() the data.
>
> It returns a vector of indices into the data, where those indices are the
> sorted ordering of the elements in the vector, or in this case, the
> column.
>
> So you want the output of order() to be used within the brackets for the
> row *indices*, to reflect the ordering of the column (or columns in the
> case of a multi-level sort) that you wish to use to sort the data frame
> rows.
>
> set.seed(1)
> x <- sample(10)
>
> > x
>  [1]  3  4  5  7  2  8  9  6 10  1
>
>
> # sort() actually returns the sorted data
> > sort(x)
>  [1]  1  2  3  4  5  6  7  8  9 10
>
>
> # order() returns the indices of 'x' in sorted order
> > order(x)
>  [1] 10  5  1  2  3  8  4  6  7  9
>
>
> # This does the same thing as sort()
> > x[order(x)]
>  [1]  1  2  3  4  5  6  7  8  9 10
>
>
> set.seed(1)
> df1 <- data.frame(aa = letters[1:10], bb = rnorm(10))
>
> > df1
>    aa         bb
> 1   a -0.6264538
> 2   b  0.1836433
> 3   c -0.8356286
> 4   d  1.5952808
> 5   e  0.3295078
> 6   f -0.8204684
> 7   g  0.4874291
> 8   h  0.7383247
> 9   i  0.5757814
> 10  j -0.3053884
>
>
> # These are the indices of df1$bb in sorted order
> > order(df1$bb)
>  [1]  3  6  1 10  2  5  7  9  8  4
>
>
> # Get df1$bb in increasing order
> > df1$bb[order(df1$bb)]
> [1] -0.8356286 -0.8204684 -0.6264538 -0.3053884  0.1836433  0.3295078
> [7]  0.4874291  0.5757814  0.7383247  1.5952808
>
>
> # Same thing as above
> > sort(df1$bb)
> [1] -0.8356286 -0.8204684 -0.6264538 -0.3053884  0.1836433  0.3295078
> [7]  0.4874291  0.5757814  0.7383247  1.5952808
>
>
> You can't use the output of sort() to sort the data frame rows, so you
> need to use order() to get the ordered indices and then use that to
> extract the data frame rows in the sort order that you desire:
>
> > df1[order(df1$bb), ]
>    aa         bb
> 3   c -0.8356286
> 6   f -0.8204684
> 1   a -0.6264538
> 10  j -0.3053884
> 2   b  0.1836433
> 5   e  0.3295078
> 7   g  0.4874291
> 9   i  0.5757814
> 8   h  0.7383247
> 4   d  1.5952808
>
>
> > df1[order(df1$bb, decreasing = TRUE), ]
>    aa         bb
> 4   d  1.5952808
> 8   h  0.7383247
> 9   i  0.5757814
> 7   g  0.4874291
> 5   e  0.3295078
> 2   b  0.1836433
> 10  j -0.3053884
> 1   a -0.6264538
> 6   f -0.8204684
> 3   c -0.8356286
>
>
> Does that help?
>
> Regards,
>
> Marc Schwartz
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
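
The multi-level sort mentioned in the quoted reply works the same way: order() accepts several keys, and negating a numeric key reverses its direction. A small sketch with made-up data (df2, grp, and val are hypothetical names, not from the thread):

```r
# Made-up example data: two groups, one numeric value per row
df2 <- data.frame(grp = c("b", "a", "b", "a"), val = c(1, 4, 3, 2))

# First key: grp ascending; second key: val descending (numeric, so negate it)
idx <- order(df2$grp, -df2$val)
idx            # 2 4 3 1

# The indices go in the row position of the brackets
df2[idx, ]
```

The first key decides the order wherever it differs; the second key only breaks ties within equal values of the first.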

--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calandra_at_uni-hamburg.de

**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php

Received on Thu 12 May 2011 - 15:27:16 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.2.0, at Thu 12 May 2011 - 16:00:06 GMT.
