[Rd] as.numeric and as.character with locale using comma as separator

From: Claudia Beleites <claudia.beleites_at_ipht-jena.de>
Date: Tue, 14 Aug 2012 19:00:11 +0200


Dear all,

summary:

My LC_NUMERIC is changed from C to de_DE by library (qtbase)
[which shouldn't happen, according to the warning issued when setting it back
manually].
I posted an issue at their GitHub repository, but the behaviour may be of more general interest.

However, if LC_NUMERIC is changed, as.character () uses the decimal separator that belongs to LC_NUMERIC (and not options ()$OutDec, as I had assumed).
as.double () (= as.numeric ()) does not, though.

That causes trouble with constructs like as.numeric (as.character (x)).

long version:

as.character () seems to take my locale (de_DE) into account, which uses a comma as the decimal separator:

> x <- rnorm (3)
> x
[1] -0,004238328 -0,919358537 -1,654543297
> as.character(x)
[1] "-0,00423832753479965" "-0,919358536523751" "-1,65454329680873"

whereas as.numeric () doesn't:

> as.numeric (as.character(x))
[1] NA NA NA

Warning message:
NAs introduced by coercion

> as.numeric (gsub (",", ".", as.character(x)))
[1] -0,004238328 -0,919358537 -1,654543297
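The gsub () workaround above can be wrapped in a small helper. This is only a minimal sketch: parse_locale_numeric is a hypothetical name, not a base R function, and it assumes that as.numeric () always expects "." as the decimal mark, as observed above. The decimal mark of the current locale is taken from Sys.localeconv ().

```r
# Hypothetical helper (not part of base R): convert strings that may use the
# current locale's decimal mark into numbers.
parse_locale_numeric <- function(s, dec = Sys.localeconv()[["decimal_point"]]) {
  # Only substitute when the locale's mark differs from ".".
  if (!identical(dec, ".")) {
    s <- gsub(dec, ".", s, fixed = TRUE)
  }
  as.numeric(s)
}

# The decimal mark can also be passed explicitly:
parse_locale_numeric(c("-0,0042", "1,5"), dec = ",")
```

Note that this does not handle grouping marks: a string such as "1.234,5" would still give NA.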

I did not find any mention of this in the help pages of as.numeric or as.character.

Note also the output of example (as.character):

> example (as.character)

as.chr> form <- y ~ a + b + c

as.chr> as.character(form) ## length 3
[1] "~" "y" "a + b + c"

as.chr> deparse(form) ## like the input
[1] "y ~ a + b + c"

as.chr> a0 <- 11/999 # has a repeating decimal representation

as.chr> (a1 <- as.character(a0))
[1] "0,011011011011011"

as.chr> format(a0, digits=16) # shows one more digit
[1] "0,01101101101101101"

as.chr> a2 <- as.numeric(a1)

as.chr> a2 - a0 # normally around -1e-17
[1] NA

as.chr> as.character(a2) # normally different from a1
[1] NA

as.chr> print(c(a0, a2), digits = 16)
[1] 0,01101101101101101 NA
Warning message:
In eval(expr, envir, enclos) : NAs introduced by coercion

*session info*
> sessionInfo ()
R version 2.15.1 (2012-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] de_DE.UTF-8

attached base packages:
[1] splines stats graphics grDevices utils datasets methods
[8] base

other attached packages:
[1] Hmisc_3.9-3 survival_2.36-14 plumbr_0.6.6 cranvas_0.8
[5] maps_2.2-6 scales_0.2.1 qtpaint_0.9.0 qtbase_1.0.5
[9] idendro_1.0

loaded via a namespace (and not attached):

 [1] cluster_1.14.2         colorspace_1.1-1       dichromat_1.2-4
 [4] grid_2.15.1            labeling_0.1           lattice_0.20-6
 [7] munsell_0.3            objectProperties_0.6.5 objectSignals_0.10.2
[10] plyr_1.7.1             RColorBrewer_1.0-5     SearchTrees_0.5.1
[13] stringr_0.6            tools_2.15.1           tourr_0.5.2

Note that

> options ()$OutDec
[1] "."

In fresh R sessions I have

locale:

 [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=de_DE.UTF-8        LC_COLLATE=de_DE.UTF-8
 [5] LC_MONETARY=de_DE.UTF-8    LC_MESSAGES=de_DE.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C

It seems qtbase is the culprit:

> x
[1] -0.2290188 -0.1884703 0.2507179

> library (qtbase)
> x
[1] -0,2290188 -0,1884703 0,2507179

After setting the numeric locale back to C:

> Sys.setlocale ("LC_NUMERIC", "C")
[1] "C"

Warning message:
In Sys.setlocale("LC_NUMERIC", "C") :
  setting 'LC_NUMERIC' may cause R to function strangely

as.numeric (as.character (x)) works as expected (and the output has decimal points again).
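If the locale cannot be reset for the whole session, the reset could be limited to a single expression. The following is only a sketch: with_c_numeric is a hypothetical helper name, and it assumes that switching LC_NUMERIC to C is harmless for the wrapped code. The warning from Sys.setlocale () is suppressed deliberately.

```r
# Hypothetical helper: evaluate an expression with LC_NUMERIC set to "C",
# then restore whatever value was active before.
with_c_numeric <- function(expr) {
  old <- Sys.getlocale("LC_NUMERIC")
  # Restore the previous setting even if expr fails.
  on.exit(suppressWarnings(Sys.setlocale("LC_NUMERIC", old)), add = TRUE)
  suppressWarnings(Sys.setlocale("LC_NUMERIC", "C"))
  force(expr)  # expr is evaluated lazily, i.e. only after the locale switch
}

x <- c(0.5, -1.25)
with_c_numeric(as.numeric(as.character(x)))
```

Because R evaluates function arguments lazily, the conversion inside the call runs only after LC_NUMERIC has been switched.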

Best,

Claudia

-- 
Claudia Beleites
Spectroscopy/Imaging
Institute of Photonic Technology
Albert-Einstein-Str. 9
07745 Jena
Germany

email: claudia.beleites_at_ipht-jena.de
phone: +49 3641 206-133
fax:   +49 2641 206-399

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Tue 14 Aug 2012 - 17:08:11 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 15 Aug 2012 - 02:30:40 GMT.