Re: [R] umlauts in Rd files

From: Prof Brian Ripley
Date: Wed 15 Jun 2005 - 23:30:51 EST

On Wed, 15 Jun 2005, Peter Dalgaard wrote:

> Robin Hankin writes:
>> Hi
>> I'm having difficulty following the advice in section 2.7 of R-exts.
>> In one of my packages, there is a function called mobius().
>> I want to refer to it in the Rd file as the Möbius function, and to
>> illustrate the Möbius inversion formula (just to be explicit: this is
>> "Mobius" but with two dots over the second letter).
>> R-exts section 2.7 gives
>> \enc{Jöreskog}{Joreskog}
>> as an example, but when I cut-and-paste this, the dvi file (as produced
>> by R CMD Rd2dvi) shows the umlauted "o" as A and Z with some
>> diacritical marks, not the desired o with two dots on.
>> Using \"{o} is fine for the dvi output but not the ascii output.
>> How do I put an umlauted "o" in an Rd file in such a way as to have a
>> nice ascii help page and nice dvi files?
> Well... You can't. There's no odiaeresis in ASCII. That's exactly the
> problem. In UTF-8 or ISO-Latin-1/9 (aka 8859-1 or ditto with the
> addition of the Euro) you can display the character and we did
> previously implicitly assume Latin-1. However this is of no use to
> people in say Latin-2 locales, and in fact we can no longer spell the
> entire R Core Team correctly using any of the Latin-N locales (we
> lose either M{\"a}chler or {\v S}imon).
> As far as I understand the current situation, we recommend that text
> files be pure ASCII (which has also led us to introduce deliberate
> misspellings of various people in the NEWS file and similar places).
> What is happening to you is something else though: The double
> characters are a tell-tale sign that you have provided UTF-8 to
> something that expected an 8-bit encoding like Latin-1. The fix for
> that should be to put \encoding{UTF-8} somewhere at the beginning of
> the .Rd file.
> (I may well have gotten some detail wrong here, Brian probably knows
> the best.)

UTF-8 for LaTeX does not work well (as yet, at least: there is now a utf8 encoding that allows at least the first plane (Latin-1) to work). So it would be much better to use Latin-1 for the file, declare that with \encoding{latin1}, and mark the name specifically with \enc{Möbius}{Mobius} (or your preferred transliteration).
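
The "double characters" symptom described in the quoted text can be reproduced outside R. The following is only an illustrative Python sketch, not anything Rd2dvi itself does: it encodes the word both ways and then reads the UTF-8 bytes as if they were Latin-1. The variable names are made up for the example.

```python
# Illustration only: reproduce the "double character" symptom in plain
# Python, independent of R, Rd2dvi and LaTeX.
word = "Möbius"

utf8_bytes = word.encode("utf-8")      # ö becomes two bytes: 0xC3 0xB6
latin1_bytes = word.encode("latin-1")  # ö is the single byte 0xF6

# Reading the UTF-8 bytes as if they were Latin-1 gives two characters
# where one was intended.
misread = utf8_bytes.decode("latin-1")

print(len(utf8_bytes), len(latin1_bytes))
print(misread)
```

If the file really is Latin-1 and is declared as such, each accented letter is one byte and there is nothing to misread.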

The problem is not really with Latin-2 (which does have a and o diaeresis), but with languages such as Japanese and Chinese, whose charsets only have ASCII as far as Latin letters go. So the transliteration is for people without any accents in their charset.
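
The idea behind the two arguments of \enc can be sketched in Python. This is a conceptual illustration under stated assumptions, not how R implements it: render_enc is a hypothetical helper, and the charset names are Python codec names rather than anything R uses.

```python
# Conceptual sketch only: render_enc is a hypothetical helper that mimics
# the idea of \enc{accented}{transliteration}; it is not R's implementation.
def render_enc(accented, transliteration, charset):
    """Return `accented` if `charset` can represent it, else the fallback."""
    try:
        accented.encode(charset)
        return accented
    except UnicodeEncodeError:
        return transliteration

in_latin1 = render_enc("Möbius", "Mobius", "latin-1")    # Latin-1 has ö
in_latin2 = render_enc("Möbius", "Mobius", "iso8859_2")  # so does Latin-2
in_ascii = render_enc("Möbius", "Mobius", "ascii")       # ASCII does not

print(in_latin1, in_latin2, in_ascii)
```

Only the reader whose charset lacks the accented letter sees the transliteration.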

Brian D. Ripley,        
Professor of Applied Statistics,
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Received on Wed Jun 15 23:33:41 2005