Re: [R] Simple order() data frame question.

From: Matthew Dowle <mdowle_at_mdowle.plus.com>
Date: Thu, 12 May 2011 16:23:51 +0100

With data.table, the following is routine :

DT[order(a)]   # ascending
DT[order(-a)]  # descending, if a is numeric
DT[a>5,sum(z),by=c][order(-V1)]   # sum of z group by c, just where a>5, then show me the largest first
DT[order(-a,b)] # order by a descending then by b ascending, if a and b are both numeric

It avoids peppering your code with $, and becomes quite natural after a short while; especially compound queries such as the 3rd example.
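
For comparison, here is a minimal base R sketch of what the third query does, step by step. The data frame and the names df, sub and agg are made up for illustration and are not from the thread; only the column names a, c and z follow the examples above.

```r
# Hypothetical data; only the column names mirror the data.table examples
df <- data.frame(
  a = c(3, 6, 7, 9, 8, 2),
  c = c("x", "y", "x", "y", "z", "z"),
  z = c(10, 20, 30, 40, 50, 60),
  stringsAsFactors = FALSE
)

# Step 1: keep only the rows where a > 5
sub <- df[df$a > 5, ]

# Step 2: sum z within each level of c
agg <- aggregate(z ~ c, data = sub, FUN = sum)

# Step 3: largest sum first
res <- agg[order(-agg$z), ]
print(res)
```

Each step needs its own df$ or agg$ prefix, which is the repetition the one-line data.table form avoids.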

Matthew

http://datatable.r-forge.r-project.org/

"Ivan Calandra" <ivan.calandra_at_uni-hamburg.de> wrote in message news:4DCBEC8B.6040806_at_uni-hamburg.de...
I was wondering whether it would be possible to make a method for data.frame with sort().
I think it would be more intuitive than using the complex construction of df[order(df$a),]
Is there any reason not to make it?
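
As a rough sketch of what such a method could look like: as far as I know base R does not provide sort.data.frame, so everything below is an assumption, including the hypothetical `by` argument used to pick the sort columns.

```r
# Hypothetical S3 method; not part of base R.
# 'by' is an invented argument: column names or positions to sort on.
sort.data.frame <- function(x, decreasing = FALSE, by = 1, ...) {
  # Pass the chosen columns to order() as unnamed arguments so that
  # a column name can never be mistaken for one of order()'s own arguments
  keys <- unname(as.list(x[by]))
  ord <- do.call(order, c(keys, list(decreasing = decreasing)))
  x[ord, , drop = FALSE]
}

# Made-up example data
d <- data.frame(aa = c("a", "b", "c"), bb = c(2, 3, 1),
                stringsAsFactors = FALSE)

up   <- sort(d, by = "bb")                     # ascending by bb
down <- sort(d, decreasing = TRUE, by = "bb")  # descending by bb
print(up)
print(down)
```

The method is only a thin wrapper around df[order(...), ], so the design question is mostly which columns to sort on by default.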

Ivan

On 5/12/2011 15:40, Marc Schwartz wrote:
> On May 12, 2011, at 8:09 AM, John Kane wrote:
>
>> Argh. I knew it was at least partly obvious. I never have been able to
>> read the order() help page and understand what it is saying.
>>
>> Thanks very much.
>>
>> By the way, to me it is counter-intuitive that the command is
>>
>>> df1[order(df1[,2],decreasing=TRUE),]
>> For some reason I keep expecting it to be
>> order( , df1[,2],decreasing=TRUE)
>>
>> So clearly I don't understand what is going on but at least I am a lot
>> better off. I may be able to get this graph to work.
>
> John,
>
> Perhaps it may be helpful to understand that order() does not actually
> sort() the data.
>
> It returns a vector of indices into the data, where those indices are the
> sorted ordering of the elements in the vector, or in this case, the
> column.
>
> So you want the output of order() to be used within the brackets for the
> row *indices*, to reflect the ordering of the column (or columns in the
> case of a multi-level sort) that you wish to use to sort the data frame
> rows.
>
> set.seed(1)
> x<- sample(10)
>
>> x
> [1] 3 4 5 7 2 8 9 6 10 1
>
>
> # sort() actually returns the sorted data
>> sort(x)
> [1] 1 2 3 4 5 6 7 8 9 10
>
>
> # order() returns the indices of 'x' in sorted order
>> order(x)
> [1] 10 5 1 2 3 8 4 6 7 9
>
>
> # This does the same thing as sort()
>> x[order(x)]
> [1] 1 2 3 4 5 6 7 8 9 10
>
>
> set.seed(1)
> df1<- data.frame(aa = letters[1:10], bb = rnorm(10))
>
>> df1
>    aa         bb
> 1   a -0.6264538
> 2   b  0.1836433
> 3   c -0.8356286
> 4   d  1.5952808
> 5   e  0.3295078
> 6   f -0.8204684
> 7   g  0.4874291
> 8   h  0.7383247
> 9   i  0.5757814
> 10  j -0.3053884
>
>
> # These are the indices of df1$bb in sorted order
>> order(df1$bb)
> [1] 3 6 1 10 2 5 7 9 8 4
>
>
> # Get df1$bb in increasing order
>> df1$bb[order(df1$bb)]
> [1] -0.8356286 -0.8204684 -0.6264538 -0.3053884 0.1836433 0.3295078
> [7] 0.4874291 0.5757814 0.7383247 1.5952808
>
>
> # Same thing as above
>> sort(df1$bb)
> [1] -0.8356286 -0.8204684 -0.6264538 -0.3053884 0.1836433 0.3295078
> [7] 0.4874291 0.5757814 0.7383247 1.5952808
>
>
> You can't use the output of sort() to sort the data frame rows, so you
> need to use order() to get the ordered indices and then use that to
> extract the data frame rows in the sort order that you desire:
>
>> df1[order(df1$bb), ]
>    aa         bb
> 3   c -0.8356286
> 6   f -0.8204684
> 1   a -0.6264538
> 10  j -0.3053884
> 2   b  0.1836433
> 5   e  0.3295078
> 7   g  0.4874291
> 9   i  0.5757814
> 8   h  0.7383247
> 4   d  1.5952808
>
>
>> df1[order(df1$bb, decreasing = TRUE), ]
>    aa         bb
> 4   d  1.5952808
> 8   h  0.7383247
> 9   i  0.5757814
> 7   g  0.4874291
> 5   e  0.3295078
> 2   b  0.1836433
> 10  j -0.3053884
> 1   a -0.6264538
> 6   f -0.8204684
> 3   c -0.8356286
>
>
> Does that help?
>
> Regards,
>
> Marc Schwartz
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
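
The quoted explanation mentions multi-level sorts without showing one. A minimal sketch follows, assuming a small made-up data frame (df2 and its columns grp and val are not from the thread). It also shows one way to sort a character column in descending order, where a plain minus sign does not work.

```r
# Hypothetical example data
df2 <- data.frame(
  grp = c("b", "a", "b", "a", "c"),
  val = c(2, 5, 9, 1, 4),
  stringsAsFactors = FALSE
)

# grp ascending, then val descending within each grp
idx <- order(df2$grp, -df2$val)
res_multi <- df2[idx, ]
print(res_multi)

# grp descending, then val ascending: xtfrm() turns the character
# column into a numeric sort key that can be negated
idx_desc <- order(-xtfrm(df2$grp), df2$val)
res_desc <- df2[idx_desc, ]
print(res_desc)
```

In both cases order() still only returns row indices; the sorting of the data frame happens in the brackets.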

-- 
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calandra_at_uni-hamburg.de

**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php




Received on Thu 12 May 2011 - 15:27:16 GMT


Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 12 May 2011 - 16:00:06 GMT.
