Re: [Rd] R 2.5.0 refuses to print enough digits to recover exact floating point values

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Wed, 23 May 2007 18:32:36 +0100 (BST)

I think this is a bug in the MacOS X runtime. I've checked the C99 standard, and can see no limits on the precision that you should be able to specify to printf.

Not that some protection against such OSes would come amiss.

However, the C99 standard does make clear that [sf]printf is not required (even as 'recommended practice') to be accurate to more than *_DIG places, which as ?as.character has pointed out is 15 on the machines R runs on.
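For concreteness, a minimal C sketch of what DBL_DIG promises (C because the point is about printf). The helper name decimal_survives is made up for illustration. It assumes the runtime converts correctly at this precision and prints at least two exponent digits, as C99 specifies. Note the direction: DBL_DIG is about a decimal string surviving a trip into a double and back out to text, not about a double surviving a trip through text.

```c
#include <float.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse a decimal string into a double, print it back
   with DBL_DIG significant digits, and report whether the text is unchanged.
   The input is assumed to already be in the layout produced by "%.*e"
   with DBL_DIG - 1 digits after the decimal point. */
int decimal_survives(const char *text)
{
    char buf[64];
    double x = strtod(text, NULL);

    /* "%.*e" prints one digit before the point, so the precision is DBL_DIG - 1. */
    snprintf(buf, sizeof buf, "%.*e", DBL_DIG - 1, x);
    return strcmp(buf, text) == 0;
}
```

On a platform with IEEE doubles, DBL_DIG is 15 and a call such as decimal_survives("1.23456789012345e-01") would be expected to return 1.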

It really is the case that writing out a number to > 15 significant digits and reading it back in again is not required to give exactly the same IEC60559 (sic) number, and furthermore there are R platforms for which this does not happen. What Mr Weinberg claimed is 'now impossible' never was possible in general (and he seems to be berating the R developers for not striving to do better than the C standard requires of OSes). In fact, I believe this to be impossible unless you have access to extended precision arithmetic, as otherwise printf/scanf have to use the same arithmetic as the computations.
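The round trip under discussion can be sketched in C as follows. This is an illustration only: the name round_trips is invented here, sig_digits is assumed to be at least 1, and the result depends entirely on how carefully the platform's printf and strtod convert, which is exactly the part that varies between runtimes.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write x as text with sig_digits significant digits,
   read the text back, and report whether the value read equals x exactly.
   Assumes sig_digits >= 1. */
int round_trips(double x, int sig_digits)
{
    char buf[64];

    /* One digit precedes the decimal point, hence sig_digits - 1. */
    snprintf(buf, sizeof buf, "%.*e", sig_digits - 1, x);
    return strtod(buf, NULL) == x;
}
```

With IEEE doubles and a runtime whose conversions are correctly rounded, round_trips(0.30000000000000004, 15) gives 0, because the text collapses to 3.00000000000000e-01, while round_trips(0.30000000000000004, 17) gives 1. On a runtime with less careful conversions the second call may fail as well.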

This is why R supports transfer of floating-point numbers via readBin and friends, and uses a binary format itself for save() (by default).
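The idea behind a binary transfer can be sketched in C: the bytes of the double are copied as they are, so no decimal conversion takes place. This is an analogy only, not how readBin or save() are implemented. The function names are hypothetical, and writer and reader are assumed to share the same double format and byte order.

```c
#include <string.h>

/* Hypothetical helper: copy the in-memory bytes of x into out,
   which must have room for sizeof(double) bytes. */
void double_to_bytes(double x, unsigned char *out)
{
    memcpy(out, &x, sizeof x);
}

/* Hypothetical helper: rebuild a double from bytes produced by
   double_to_bytes on a platform with the same representation. */
double bytes_to_double(const unsigned char *in)
{
    double x;

    memcpy(&x, in, sizeof x);
    return x;
}
```

The same bytes could equally be passed through fwrite and fread; the value comes back bit for bit because nothing is rounded on the way.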

I should also say that any algorithm that depends on that level of detail in the numbers will depend on the compiler used, the optimization level and so on. Don't expect repeatability to that level even with binary data unless you (are allowed to) never apply bugfixes to your system.

On Wed, 23 May 2007, hadley wickham wrote:

> On 5/23/07, hadley wickham <h.wickham_at_gmail.com> wrote:

>> On 5/22/07, Uwe Ligges <ligges@statistik.uni-dortmund.de> wrote:
>>>
>>>
>>> Zack Weinberg wrote:
>>>> I have noticed that in R 2.5.0, no method of textual output will print
>>>> a "double" mode quantity with more than 15 digits after the decimal
>>>> point. From the help page (?print.default) it appears that this is
>>>> intentional, since digits after the fifteenth may be uncertain.
>>>> However, fifteen digits after the decimal point are not enough to
>>>> represent all the values that an IEEE-double can take. (You need one
>>>> more.) This means it is now impossible to write out data in textual
>>>> format (e.g. in order to manipulate it with another program) and read
>>>> back in exactly the same values. Some analyses are sensitive to this
>>>> sort of extra rounding, especially if it happens repeatedly.
>>>>
>>>> I'd really appreciate some way of forcing R to print enough digits to
>>>> represent every possible IEEE double value. I would also argue that
>>>> this should be the default behavior of dump(), write.table() and
>>>> friends, and save(...,ascii=TRUE), to prevent data loss.
>>>
>>> Example:
>>>
>>> formatC(exp(1), digits=100, width=-1)
>>
>> formatC(exp(1), digits=1000000, width=-1)
>> *** caught bus error ***
>> address 0x2, cause 'non-existent physical address'
>>
>> R version 2.5.0 (2007-04-23)
>> i386-apple-darwin8.9.1
>
> Ooops, and the traceback:
>
> Traceback:
> 1: .C("str_signif", x = x, n = n, mode = as.character(mode), width =
> as.integer(width),     digits = as.integer(digits), format =
> as.character(format),     flag = as.character(flag), result =
> blank.chars(i.strlen),     PACKAGE = "base")
> 2: formatC(exp(1), digits = 1e+06, width = -1)
>
> Hadley
>
> ______________________________________________
> R-devel_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Brian D. Ripley,                  ripley_at_stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Received on Wed 23 May 2007 - 17:49:09 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 26 Jun 2007 - 16:35:28 GMT.
