[R] Umlaut read from csv-file

From: Heinz Tuechler <tuechler_at_gmx.at>
Date: Thu, 06 Nov 2008 21:39:34 +0100


Dear All!

When reading character strings containing an umlaut from a csv-file, I find a (to me) surprising behaviour in R 2.8.0 that I did not notice in R 2.7.2: a comparison by "==" results in FALSE, while grep does find the match. See the example below.
The crucial line is x=="div 1-2 Veränderungen", which gives [1] FALSE in R 2.8.0 but [1] TRUE in R 2.7.2.

Thank you in advance for your help

Heinz Tüchler

##### in R 2.8.0 patched

x0 <- "div 1-2 Veränderungen" # define a character string

write.csv(x0, 'chr.csv', row.names=FALSE) # write a csv-file with one line
rm(x0)

x <- read.csv('chr.csv', skip=0, header=TRUE, as.is=TRUE)$x # read in csv-file x
x=="div 1-2 Veränderungen"
> [1] FALSE

grep("div 1-2 Veränderungen", x)
> [1] 1

grep("div 1-2 Veränderungen", x, value=TRUE)
> [1] "div 1-2 Veränderungen"

unlink('chr.csv') # delete file

Version:
  platform = i386-pc-mingw32
  arch = i386
  os = mingw32
  system = i386, mingw32
  status = Patched
  major = 2
  minor = 8.0
  year = 2008
  month = 11
  day = 04
  svn rev = 46830
  language = R
  version.string = R version 2.8.0 Patched (2008-11-04 r46830)

Windows XP (build 2600) Service Pack 2

Locale:
LC_COLLATE=German_Austria.1252;LC_CTYPE=German_Austria.1252;LC_MONETARY=German_Austria.1252;LC_NUMERIC=C;LC_TIME=German_Austria.1252

Search Path:
  .GlobalEnv, package:stats, package:graphics, package:grDevices, package:utils,
package:datasets, package:methods, Autoloads, package:base

##### in R 2.7.2 patched

x0 <- "div 1-2 Veränderungen" # define a character string

write.csv(x0, 'chr.csv', row.names=FALSE) # write a csv-file with one line
rm(x0)

x <- read.csv('chr.csv', skip=0, header=TRUE, as.is=TRUE)$x # read in csv-file x
x=="div 1-2 Veränderungen"
> [1] TRUE

grep("div 1-2 Veränderungen", x)
> [1] 1

grep("div 1-2 Veränderungen", x, value=TRUE)
> [1] "div 1-2 Veränderungen"

unlink('chr.csv') # delete file

Version:
  platform = i386-pc-mingw32
  arch = i386
  os = mingw32
  system = i386, mingw32
  status = Patched
  major = 2
  minor = 7.2
  year = 2008
  month = 09
  day = 02
  svn rev = 46486
  language = R
  version.string = R version 2.7.2 Patched (2008-09-02 r46486)

Windows XP (build 2600) Service Pack 2

Locale:
LC_COLLATE=German_Austria.1252;LC_CTYPE=German_Austria.1252;LC_MONETARY=German_Austria.1252;LC_NUMERIC=C;LC_TIME=German_Austria.1252

Search Path:
  .GlobalEnv, package:stats, package:graphics, package:grDevices, package:utils,
package:datasets, package:methods, Autoloads, package:base
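
A possible way to narrow this down, offered as a sketch rather than a confirmed diagnosis: the example above does not show the cause, but one thing worth checking is whether the string typed at the console and the string read from the file carry different declared encodings or different bytes. The strings below are illustrative and built with iconv rather than read from a file, same_text is a hypothetical helper, and enc2utf8 is taken from current R versions, so it may not be available in R 2.8.0.

```r
# Illustrative strings only; not the data read from chr.csv in the post.
s_utf8   <- "div 1-2 Ver\u00e4nderungen"                   # \u escape: marked as UTF-8
s_latin1 <- iconv(s_utf8, from = "UTF-8", to = "latin1")  # same text, Latin-1 bytes

# Declared encoding of each string ("unknown" would mean native or unmarked).
Encoding(s_utf8)
Encoding(s_latin1)

# Raw bytes: the umlaut takes two bytes in UTF-8 but one byte in Latin-1.
charToRaw(s_utf8)
charToRaw(s_latin1)

# Hypothetical helper: compare after converting both sides to UTF-8,
# so that the result does not depend on how each string is marked.
same_text <- function(a, b) enc2utf8(a) == enc2utf8(b)
same_text(s_utf8, s_latin1)
```

Running Encoding(x) and charToRaw(x) on the value returned by read.csv, next to the same calls on the literal string, would show whether the two differ in marking or in bytes.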



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Thu 06 Nov 2008 - 20:44:43 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 06 Nov 2008 - 23:30:22 GMT.
