Re: [Rd] bug in rank(), order(), is.unsorted() on character vector

From: Hervé Pagès <>
Date: Thu, 08 Dec 2011 10:26:48 -0800

Hi Barry,

Hope you don't mind if I put this back on the list.

On 11-12-08 05:50 AM, Barry Rowlingson wrote:
> 2011/12/8 Hervé Pagès<>:
>> A naive question: wouldn't everything be simpler if LC_COLLATE=C
>> was the default for everybody?
> Yet when we Brits suggest everything would be simpler if the whole
> world spoke the Queen's English it causes all sorts of trouble...

:-) Sure I see your point.

But this is a programming language, used by a lot of researchers. And having the result of an analysis depend on a crazy collation order causes all sorts of trouble too.

Note that fighting back against the Empire is a lost battle anyway. When you use R (as a user or a developer), any function name you type (sort, rank, print, summary, etc.) is in Queen's English. And so are their man pages.

Also note that I was just talking about the *default*. AFAIK other very serious projects like Python or SQLite *by default* use a collating sequence that behaves like LC_COLLATE=C on strings that contain ASCII chars only. And they let you change that if you want. Are they being imperialist? Most R users/developers are in research or academia, where I suspect consistency and reproducibility are an even bigger deal than in the Python or SQLite communities.
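
To make the Python point concrete, here is a minimal sketch (the word list and the locale_order() helper are made up for illustration, they are not from any library). As far as I know, plain sorted() compares strings by code point, which on ASCII-only strings gives the same order as LC_COLLATE=C, and locale-aware ordering is something you ask for explicitly through locale.strxfrm:

```python
import locale

# Hypothetical sample data, ASCII only.
words = ["banana", "Apple", "cherry", "apple", "Banana"]

# Default behaviour: strings are compared by code point, so on ASCII-only
# input all uppercase letters sort before all lowercase letters,
# independently of the user's LC_COLLATE setting.
default_order = sorted(words)


def locale_order(items, locale_name=""):
    """Sort items using the collation rules of locale_name.

    Illustrative helper: "" means the user's default locale, and the name
    must be a locale that is actually installed on the system.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # query current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
        return sorted(items, key=locale.strxfrm)
    finally:
        # Restore the previous setting so the change stays local.
        locale.setlocale(locale.LC_COLLATE, previous)


print(default_order)  # ['Apple', 'Banana', 'apple', 'banana', 'cherry']
```

Calling locale_order(words, "C") gives the same result as the default, while a call with a locale such as "en_US.UTF-8" (assuming it is installed) may interleave upper and lower case. So the default is reproducible and the locale-dependent order is opt-in.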


> Barry

Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

Phone:  (206) 667-5791
Fax:    (206) 667-1319

______________________________________________ mailing list
Received on Thu 08 Dec 2011 - 18:29:13 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 08 Dec 2011 - 23:00:16 GMT.
