Re: [R] UTF-8 or Unicode on Windows PC

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Mon, 21 Apr 2008 18:09:06 +0100 (BST)

On Mon, 21 Apr 2008, Hans-Joerg Bibiko wrote:

>
> On 21 Apr 2008, at 12:33, Prof Brian Ripley wrote:
>
>>> Is it possible to download a compiled snapshot of 2.7.0 for Windows XP?
>>
>> Yes, http://cran.r-project.org/bin/windows/base/rtest.html
>> And it is due for release tomorrow.
>
> Many thanks! I can see the progress :)
>
> But please forgive my incompetence. I'm not so familiar with Windows.
> If I start e.g. RGUI by using: Rgui.exe LC_CTYPE=ja I can type Japanese,
> Russian, and German. strsplit works perfectly! ;)
> But if I type for instance a German umlaut 'ü' it comes out as 'u'. OK, it is
> due to the fact I didn't set up Rgui in UTF-8 mode.

Entering at the keyboard in more than one language is close to impossible (not quite, as 'Japanese' covers a few but you need a Japanese keyboard to do it). You can't change the language of Windows just by setting locales.

> But how can I do this? My data are written in many different languages, and I
> want to do some statistics.

You can read in files in known encodings, though.
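
As a rough illustration of reading files in known encodings, here is a minimal sketch. It assumes a reasonably recent R. The temporary files and the sample name are made up for the example and are not from the poster's data.

```r
# Hypothetical sample data: the name "M\u00fcller" stored as raw bytes,
# once in UTF-8 and once in Latin-1.
utf8_file   <- tempfile()
latin1_file <- tempfile()
writeBin(as.raw(c(0x4d, 0xc3, 0xbc, 0x6c, 0x6c, 0x65, 0x72, 0x0a)), utf8_file)
writeBin(as.raw(c(0x4d, 0xfc, 0x6c, 0x6c, 0x65, 0x72, 0x0a)), latin1_file)

# Declare the known encoding when reading, so the strings are marked with it.
x <- readLines(utf8_file, encoding = "UTF-8")
y <- readLines(latin1_file, encoding = "latin1")

# Bring both to a common encoding before comparing or splitting them.
y_utf8 <- enc2utf8(y)
same <- identical(utf8ToInt(x), utf8ToInt(y_utf8))
```

An alternative is to open a connection with file(path, encoding = "...") and pass it to read.table() or readLines(), which re-encodes the input to the session's native encoding. Whether every character survives that conversion depends on the locale in use.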

> R version 2.7.0 RC (2008-04-19 r45391)
> i386-pc-mingw32
>
> locales:
> all to German_Germany.1252
> LC_CTYPE=Japanese_Japan.932
>
> ###
>
> There are some minor issues.
> I set Rgui's font to "Arial Unicode". This works but I have some troubles to
> place my cursor, caused by the issue that Arial Unicode is not a monospaced
> font.

Right, and you are warned not to do that. You must use a fixed-width font, and for CJK characters, one in the standard single/double spacing.

(See for example the comments in Rconsole and rw-FAQ 3.5. The GUI preferences dialog only offers fixed-width fonts, so you have to work quite hard to do anything else.)

> If I start up Rgui in German, I can see the localized menu items, but for
> each non-ASCII character I see cryptic things. It seems to me that the
> localized strings are written in UTF-8, and Rgui expects ANSI characters.

Argh, yes, that was an error by the translator in marking the file -- thanks, I just have time to fix it. (RGui does not expect ANSI, but all of R does expect translations to be in the encoding they are declared to be in -- this was declared as ISO-8859-1.)
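
The 'cryptic things' are what one would expect when UTF-8 bytes are interpreted under an ISO-8859-1 declaration. The following is a small sketch of that effect using a single character chosen for illustration. It is not taken from the actual translation file.

```r
# U+00FC (u with umlaut) encoded as UTF-8 is the two bytes C3 BC.
utf8_bytes <- as.raw(c(0xc3, 0xbc))

# Treat those bytes as if they were ISO-8859-1 (the mis-declaration),
# then convert the result to UTF-8 so it can be inspected.
misread <- iconv(rawToChar(utf8_bytes), from = "ISO-8859-1", to = "UTF-8")

# Each original byte has become a separate character: U+00C3 and U+00BC.
code_points <- utf8ToInt(misread)
```

So one accented letter shows up as two unrelated characters, which matches the symptom reported for the German menus.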

> ###
> Nevertheless, thanks a lot!
>
> --Hans
>
>

-- 
Brian D. Ripley,                  ripley_at_stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Mon 21 Apr 2008 - 17:17:21 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 22 Apr 2008 - 16:30:30 GMT.
