Re: [R] factor levels with umlauts

From: Prof Brian Ripley
Date: Tue 10 Oct 2006 - 10:55:28 GMT

On Fri, 6 Oct 2006, Christian Bieli wrote:

> Hi all
> I have to generate some test data for import in an sql database. The
> database is meant for web-based data entry in a study taking place in a
> german speaking region, so factor levels of the variables include umlauts.
> The variables in the dataframe t.muster are generated e.g. like this:
> t.muster$screening <- rep("ausgefüllt",50)
> and exported to a .csv file by:
> write.table(t.muster,"MakeMuster041006/MusterDaten.csv",
> col.names=FALSE,row.names=FALSE,na="",sep=";")
> After export the factor level including an umlaut of t.muster$screening
> look like this in the sql-database as well as in an excel spreadsheet:
> ausgefÃ¼llt

I think the problem is rather how you imported them. That is the UTF-8 representation of "ausgefüllt" viewed in a single-byte locale, so the two bytes that encode "ü" show up as two separate characters. R on Windows does not handle UTF-8, so something else has done the conversion.
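
The byte-level picture can be checked directly in R. What follows is only a minimal sketch, assuming an R build whose iconv() supports "latin1" and "UTF-8"; the variable names are made up for illustration and are not from the original post. It compares the two encodings of the string and then deliberately misreads the UTF-8 bytes as Latin-1, which is what a single-byte viewer effectively does.

```r
# Written with a Unicode escape so the script does not depend on the
# encoding of the source file itself.
x <- enc2utf8("ausgef\u00fcllt")

# UTF-8 stores the u-umlaut as two bytes (c3 bc) ...
utf8_bytes <- charToRaw(x)

# ... whereas Latin-1 stores it as the single byte fc.
latin1_bytes <- charToRaw(iconv(x, from = "UTF-8", to = "latin1"))

print(utf8_bytes)
print(latin1_bytes)

# Misinterpret the UTF-8 bytes as Latin-1, one character per byte.
# iconv() works on the raw bytes, so this mimics a single-byte viewer.
garbled <- iconv(x, from = "latin1", to = "UTF-8")
print(garbled)
```

If the receiving program expects a single-byte encoding, the file has to be written in that encoding (current versions of write.table() accept a fileEncoding argument for this); otherwise the importing side has to be told that the file is UTF-8.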


Brian D. Ripley,        
Professor of Applied Statistics,
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Received on Tue Oct 10 20:58:54 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Tue 10 Oct 2006 - 11:30:12 GMT.
