[Rd] use of UTF-8 \uxxxx escape sequences in function arguments

From: Thomas Zumbrunn <thomas_at_zumbrunn.name>
Date: Wed, 18 Jan 2012 23:54:43 +0100


While preparing a function that contains non-ASCII characters for inclusion in a package, I replaced all non-ASCII characters with UTF-8 escape sequences (\uxxxx) to make the package portable (and to satisfy "R CMD check"). What I didn't expect: when the default arguments of a function are written with UTF-8 escape sequences, the function apparently has to be called with escape sequences too, even when working in a UTF-8 locale. Is this intended behaviour?

Here's an example to illustrate the (putative) problem:

   ## function that uses non-ASCII characters in arguments
   plain <- function(myarg = c("Basel", "Bern", "Zürich")) {
     myarg <- match.arg(myarg)
   }

   ## function that uses UTF-8 escape sequences in arguments
   escaped <- function(myarg = c("Basel", "Bern", "Z\u00BCrich")) {
     myarg <- match.arg(myarg)
   }

   ## test
   plain("Zürich") ## works
   plain("Z\u00BCrich") ## fails
   escaped("Zürich") ## fails
   escaped("Z\u00BCrich") ## works
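
As a sanity check on the example above, the code points behind each spelling can be inspected directly. The following is a minimal sketch using only base R (utf8ToInt, intToUtf8, match.arg); the names codepoints, escaped_fc and zurich are made up for illustration and are not part of the functions above. One thing it assumes to be worth checking: U+00BC is the code point of the fraction sign "¼", whereas "ü" is U+00FC, so the escape in escaped() may not denote the intended character.

```r
## Hypothetical helper: show the Unicode code points of one string as U+XXXX
codepoints <- function(x) sprintf("U+%04X", utf8ToInt(x))

codepoints("Z\u00BCrich")  # second element is U+00BC (VULGAR FRACTION ONE QUARTER)
codepoints("Z\u00FCrich")  # second element is U+00FC (LATIN SMALL LETTER U WITH DIAERESIS)

## The two escapes denote different characters, so the strings differ
identical("Z\u00BCrich", "Z\u00FCrich")  # FALSE

## Hypothetical variant of escaped() that uses U+00FC instead of U+00BC
escaped_fc <- function(myarg = c("Basel", "Bern", "Z\u00FCrich")) {
  match.arg(myarg)
}

## Build the argument from the code point, so that this call does not
## depend on how the source file itself is encoded
zurich <- paste("Z", intToUtf8(252L), "rich", sep = "")
identical(escaped_fc(zurich), "Z\u00FCrich")  # TRUE
```

If the second element reported for the escaped string is not U+00FC, the failing calls would be explained by a character mismatch rather than by how match.arg treats escape sequences.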

Thank you for your help.
Thomas Zumbrunn

> sessionInfo()

R version 2.14.1 (2011-12-22)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C               LC_TIME=en_GB.UTF-8
 [4] LC_COLLATE=en_GB.UTF-8     LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C                  LC_ADDRESS=C
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Wed 18 Jan 2012 - 23:08:07 GMT
