Re: [R] Umlaut read from csv-file

From: Heinz Tuechler
Date: Sat, 08 Nov 2008 02:42:49 +0100

At 16:52 07.11.2008, Prof Brian Ripley wrote:
>On Fri, 7 Nov 2008, Peter Dalgaard wrote:
>>Heinz Tuechler wrote:
>>>Dear Prof. Ripley!
>>>Thank you very much for your attention. In the given example, Encoding()
>>>or the encoding parameter of read.csv solves the problem. I hope your
>>>patch will also solve the problem when I read an SPSS file with
>>>spss.get(), since this function has no encoding parameter and my real
>>>problem originated there.
>>read.spss() (package foreign) does have a reencode argument, though, and
>>it is called by spss.get(), so adding it there looks like an easy hack.
>Yes, older software like spss.get needs to be updated for the
>internationalization age. Modifying it to have a ... argument passed
>to read.spss would be a good idea (and future-proofing).
>In cases like this it is likely that the SPSS file does contain its
>encoding (although sometimes it does not and occasionally it is wrong),
>so it is helpful to make use of the info if it is there. However, the
>default is read.spss(reencode=NA) because the problems caused by assuming
>that the info is correct when it is not are worse.
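
The suggestion of passing a ... argument through to read.spss could look roughly like the sketch below. This is not the actual spss.get code: read_spss_df is a hypothetical wrapper name, and the file name and the "latin1" value in the usage comment are placeholders. It only illustrates how extra arguments such as reencode would reach read.spss().

```r
# Hypothetical wrapper (NOT the real Hmisc::spss.get): any extra arguments,
# such as reencode, are forwarded unchanged to foreign::read.spss via "...".
read_spss_df <- function(file, ...) {
  foreign::read.spss(file, to.data.frame = TRUE, ...)
}

# Usage (placeholder file name and encoding):
# dat <- read_spss_df("survey.sav", reencode = "latin1")
```

With this pattern the wrapper does not need to know about reencode at all, so it would also pick up arguments added to read.spss() later.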

I tried the example below because I wanted to fix the encoding by dumping a data.frame and then
re-sourcing it with the encoding parameter set to latin1. As you can see, source(x, encoding='latin1') does not have the effect I expected. Unfortunately, I have no idea what I misunderstood about the meaning of encoding='latin1'.

Heinz Tüchler

us <- c("a", "b", "c", "ä", "ö", "ü")
Encoding(us)
[1] "unknown" "unknown" "unknown" "latin1" "latin1" "latin1"
dump('us', 'us_dump.txt')
source('us_dump.txt', encoding='latin1')
us
[1] "a" "b" "c" "ä" "ö" "ü"
Encoding(us)
[1] "unknown" "unknown" "unknown" "unknown" "unknown" "unknown"
unlink('us_dump.txt')
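
For comparison, here is a minimal sketch of setting the encoding mark by hand with Encoding(), the approach mentioned earlier in the thread as working for the given example. It assumes the strings really hold Latin-1 bytes, it does not go through source(), and it is not meant as an explanation of the source() result above. The variable names are illustrative only.

```r
# Bytes 0xE4, 0xF6, 0xFC are a-, o- and u-umlaut in Latin-1; they are written
# as escapes so that this script itself stays pure ASCII.
x <- c("a", "\xe4", "\xf6", "\xfc")

# Declare the encoding. This only sets a mark; the bytes are not changed.
Encoding(x) <- "latin1"

# Pure ASCII elements are never marked, so "a" still reports "unknown".
enc <- Encoding(x)

# Convert to UTF-8, using the declared encoding as the source encoding.
x_utf8 <- enc2utf8(x)
```

Note that Encoding() reports only the declared mark, so an "unknown" result by itself does not say whether the underlying bytes are wrong.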

>Brian D. Ripley,
>Professor of Applied Statistics,
>University of Oxford, Tel: +44 1865 272861 (self)
>1 South Parks Road, +44 1865 272866 (PA)
>Oxford OX1 3TG, UK Fax: +44 1865 272595

Received on Sat 08 Nov 2008 - 01:52:16 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sat 08 Nov 2008 - 08:30:23 GMT.
