[Rd] locales and readLines

From: Martin Morgan <mtmorgan_at_fhcrc.org>
Date: Fri, 31 Aug 2007 09:30:43 -0700


I'm looking for some 'best practices', or perhaps an upstream solution (I have a sense of déjà vu about this, so apologies if it has already been asked). Problems occur when a file is encoded as latin1 but the user has a UTF-8 locale (or, I guess more generally, when the encoding of the input does not match that of R's locale). Here are two examples from the Bioconductor help list:


(the relevant command is library(GEOquery); gse <- getGEO('GSE94'))
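
For readers without GEOquery at hand, the mismatch can probably be reproduced in miniature. The sketch below is an illustration only, not taken from the reports above: the temporary file and its contents are made up, and validUTF8() is from base R 3.3.0 or later.

```r
# Hypothetical stand-in for a latin1-encoded download: the bytes of "café\n",
# where 0xE9 is "é" in latin1.
path <- tempfile(fileext = ".txt")
writeBin(as.raw(c(0x63, 0x61, 0x66, 0xe9, 0x0a)), path)

# readLines() with no encoding declared passes the bytes through unchanged.
x <- readLines(path)

# A lone 0xE9 byte is not valid UTF-8, so in a UTF-8 locale functions that
# need to interpret the string as characters may warn or fail on it.
charToRaw(x)
validUTF8(x)
```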


I think solutions are:

Unfortunately, these (1 & 2, anyway) place an extra burden on the package author, who must become educated about locales and the encoding conventions of the files they read, and must know how R deals with encodings.
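
To make that burden concrete, here is a minimal sketch of what declaring the encoding by hand might look like. It is not necessarily one of the solutions listed above, and it assumes the author already knows the file is latin1; the temporary file is a made-up stand-in.

```r
# Hypothetical latin1-encoded file containing "café\n" (0xE9 is "é" in latin1).
path <- tempfile(fileext = ".txt")
writeBin(as.raw(c(0x63, 0x61, 0x66, 0xe9, 0x0a)), path)

# Option A: declare the encoding on the connection, so that R re-encodes the
# input to the native encoding while reading. The result therefore depends on
# the current locale.
con <- file(path, encoding = "latin1")
a <- readLines(con)
close(con)

# Option B: read the bytes as they are, then convert explicitly to UTF-8.
# iconv() marks the result as UTF-8, whatever the current locale is.
b <- iconv(readLines(path), from = "latin1", to = "UTF-8")
Encoding(b)
```

Either way the caller has to know the source encoding in advance, which is exactly the knowledge the paragraph above says package authors would need to acquire.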

Are there other / better solutions? Any chance for some (additional) 'smarts' when reading files?


Martin Morgan
Bioconductor / Computational Biology

R-devel_at_r-project.org mailing list
Received on Fri 31 Aug 2007 - 16:43:10 GMT
