Re: [R] rank(x,y)?

From: Duncan Murdoch <murdoch_at_stats.uwo.ca>
Date: Fri 23 Jun 2006 - 07:09:20 EST

Gabor Grothendieck wrote:

> On 6/21/06, Duncan Murdoch <murdoch@stats.uwo.ca> wrote:
>   
>> Peter Dalgaard wrote:
>>     

>>> Duncan Murdoch <murdoch@stats.uwo.ca> writes:
>>>
>>>
>>>
>>>> Suppose I have two columns, x,y. I can use order(x,y) to calculate a
>>>> permutation that puts them into increasing order of x,
>>>> with ties broken by y.
>>>>
>>>> I'd like instead to calculate the rank of each pair under the same
>>>> ordering, but the rank() function doesn't take multiple values
>>>> as input. Is there a simple way to get what I want?
>>>>
>>>> E.g.
>>>>
>>>> > x <- c(1,2,3,4,1,2,3,4)
>>>> > y <- c(1,2,3,1,2,3,1,2)
>>>> > rank(x+y/10)
>>>> [1] 1 3 6 7 2 4 5 8
>>>>
>>>> gives me the answer I want, but only because I know the range of y and
>>>> the size of gaps in the x values. What do I do in general?
>>>>
>>>>
>>> Still not quite general, but in the absence of ties:
>>>
>>>
>>>
>>>> z <- integer(8); z[order(x,y)] <- 1:8
>>>> z
>>>>
>>>>
>>> [1] 1 3 6 7 2 4 5 8
>>>
>>>
>>>
>> Thanks to all who have replied.  Unfortunately for me, ties do exist,
>> and I'd like them to get identical ranks.  John Fox's suggestion would
>> handle ties properly, but I'm worried about rounding error giving
>> spurious ties.
>>
>>     
>
> Try this variant of my prior solution:
>
> (order(order(x,y)) + rev(order(order(rev(x), rev(y)))))/2
>
> Note that no arithmetic is done on the original data, only on
> the output of order, so there should not be any worry about
> rounding -- in fact it's sufficiently general that the data
> do not have to be numeric, e.g.
>
>   
>> x <- c("a", "a", "b", "a", "c", "d")
>> y <- c("b", "a", "b", "b", "a", "a")
>> (order(order(x,y)) + rev(order(order(rev(x), rev(y)))))/2
>>     
> [1] 2.5 1.0 4.0 2.5 5.0 6.0
>   

This is a very nice solution, thanks!
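
As far as I can tell, it works because order() leaves tied pairs in their original order. The forward pass therefore ranks ties by first appearance, the reversed pass ranks them by last appearance, and averaging the two gives the midrank. A small sketch on the character example above (the names r_first and r_last are only for illustration):

```r
# Example data from the thread: character vectors with one tied pair
x <- c("a", "a", "b", "a", "c", "d")
y <- c("b", "a", "b", "b", "a", "a")

# Forward pass: ties are ranked in order of appearance ("first")
r_first <- order(order(x, y))

# Reversed pass, flipped back: ties are ranked in reverse order ("last")
r_last <- rev(order(order(rev(x), rev(y))))

r_first                  # 2 1 4 3 5 6
r_last                   # 3 1 4 2 5 6
(r_first + r_last) / 2   # 2.5 1.0 4.0 2.5 5.0 6.0
```

Only the tied pair (elements 1 and 4) differs between the two passes, so only those two ranks are changed by the averaging.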

So now we have equivalents to ties.method="average" and "first"; ties.method="random" would be easy. I wonder if it's worth working out ties.method="max" and "min" as well and putting them all into a new function?
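
One possible shape for such a function is sketched below. It is only a sketch under assumptions: the name rank2 and its argument layout are hypothetical, it handles exactly two columns, and it assumes there are no NA values. It sorts the pairs once, marks where each run of identical pairs starts and ends, and assigns each pair the first, last or middle position of its run. No arithmetic is done on the data itself, only comparisons.

```r
# Hypothetical helper: rank of (x, y) pairs, ordered by x with ties broken by y.
# Assumes x and y have equal length and contain no NA values.
rank2 <- function(x, y, ties.method = c("average", "first", "min", "max")) {
  ties.method <- match.arg(ties.method)
  n <- length(x)
  stopifnot(length(y) == n)
  if (n == 0L) return(integer(0))

  o <- order(x, y)          # permutation that sorts the pairs
  xs <- x[o]
  ys <- y[o]

  # TRUE where a sorted pair differs from the pair just before it
  new_run <- c(TRUE, xs[-1L] != xs[-n] | ys[-1L] != ys[-n])
  run <- cumsum(new_run)                   # run index of each sorted pair
  first_pos <- which(new_run)              # sorted position where each run starts
  last_pos <- c(first_pos[-1L] - 1L, n)    # sorted position where each run ends

  sorted_rank <- switch(ties.method,
    first   = seq_len(n),
    min     = first_pos[run],
    max     = last_pos[run],
    average = (first_pos[run] + last_pos[run]) / 2)

  # Put the ranks back into the original order of the data
  r <- sorted_rank
  r[o] <- sorted_rank
  r
}

x <- c("a", "a", "b", "a", "c", "d")
y <- c("b", "a", "b", "b", "a", "a")
rank2(x, y, "min")       # 2 1 4 2 5 6
rank2(x, y, "max")       # 3 1 4 3 5 6
rank2(x, y, "average")   # 2.5 1.0 4.0 2.5 5.0 6.0
```

A "random" method could presumably be added by permuting positions within each run, and more than two columns by passing a list of columns to order() via do.call(); neither is attempted here.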

Duncan Murdoch

> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html >

Received on Fri Jun 23 10:31:20 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Fri 23 Jun 2006 - 12:11:43 EST.