Re: R-alpha: seq

Kurt Hornik (Kurt.Hornik@ci.tuwien.ac.at)
Wed, 4 Dec 1996 13:09:09 +0100


Date: Wed, 4 Dec 1996 13:09:09 +0100
Message-Id: <199612041209.NAA24470@aragorn.ci.tuwien.ac.at>
From: Kurt Hornik <Kurt.Hornik@ci.tuwien.ac.at>
To: Peter Dalgaard BSA <pd@kubism.ku.dk>
Subject: Re: R-alpha: seq
In-Reply-To: <x2ybfe7fed.fsf@bush.kubism.ku.dk>
 <199612040917.KAA21871@aragorn.ci.tuwien.ac.at>

>>>>> On 04 Dec 1996 11:49:46 +0100,
>>>>> Peter Dalgaard BSA <pd@kubism.ku.dk> said:

> Kurt Hornik <Kurt.Hornik@ci.tuwien.ac.at> writes:
>> 
>> >>>>> On Wed, 4 Dec 1996 09:37:30 +0100 (MET),
>> >>>>> Jim Lindsey <jlindsey@luc.ac.be> said:
>> 
>> > 1. seq does not all give what is expected, apparently due to rounding.
>> >   Something like seq(0.1,0.9,by=0.01) may stop at 0.9 or 0.89
>> >   depending on what the end value is (i.e. if 0.9 is replace by 0.5)
>> >   Sorry I forgot to note exactly what values workor not.
>> 
>> I can confirm that (Debian GNU/Linux ELF):
>> 
R> seq(0.1,0.2,by=0.01)
>> [1] 0.10 0.11 0.12 0.13 0.14 0.15 0.16 0.17 0.18 0.19 0.20
R> seq(0.1,0.3,by=0.01)
>> [1] 0.10 0.11 0.12 0.13 0.14 0.15 0.16 0.17 0.18 0.19 0.20 0.21 0.22 0.23 0.24
>> [16] 0.25 0.26 0.27 0.28 0.29
>> 

> This is of course quite logical. Never trust equality with floating
> point. Splus doesn't seem to do it, though. In fact:

>> seq(0.1,0.3,by=0.01000001)
>  [1] 0.1000000 0.1100000 0.1200000 0.1300000 0.1400000 0.1500000 0.1600001
>  [8] 0.1700001 0.1800001 0.1900001 0.2000001 0.2100001 0.2200001 0.2300001
> [15] 0.2400001 0.2500001 0.2600002 0.2700002 0.2800002 0.2900002
>> seq(0.1,0.3,by=0.010000001)
>  [1] 0.10 0.11 0.12 0.13 0.14 0.15 0.16 0.17 0.18 0.19 0.20 0.21 0.22 0.23 0.24
> [16] 0.25 0.26 0.27 0.28 0.29 0.30
>> seq(0.1,0.3,by=0.010000001)[21]-0.3
> [1] 2e-08

> So it must work with a slight fuzz factor inside, which is probably a
> good thing. Is the pragmatic convention to declare numbers "equal" if
> they print out identically?

Actually, several different things are happening here.

seq(from,to,by) does

	n <- (to - from)/by
	from + (0:n) * by

Now look at

R> ((0.3 - 0.1) / 0.01)     
[1] 20
R> 0 : ((0.3 - 0.1) / 0.01)
 [1]  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19

BUT

R> 0 : (0.2 / 0.01)
 [1]  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20

Of course, we also find

R> as.integer (0.2 / 0.01)
[1] 20
R> as.integer ((0.3 - 0.1) / 0.01)
[1] 19

and similar funny things using %% and %/%.  To make matters worse, we
get

R> as.integer ((0.21 - 0.01) / 0.01)
[1] 19
R> as.integer ((0.22 - 0.02) / 0.01)
[1] 20

etc.

Btw, the error is

R> (0.3 - 0.1) / 0.01 - 20
[1] -3.552714e-15

so really an obvious idea might be to use some tolerance parameter for
rounding (to - from)/by, or do the rounding insided do_seq() in
main/seq.c by replacing asReal() ...

As an aside, I just recompiled R with -ffloat-store in CFLAGS to see if
this helps.  For the above problem, it does not.   HOWEVER, compare

R> .Machine$double.eps
[1] 2.220446e-16
R> .Machine$double.max
[1] 1024

for the version with -ffloat-store with

R> .Machine$double.eps
[1] 1.084202e-19
R> .Machine$double.max
[1] -2

(wow, no positive numbers on my machine?) for the version without.

Hence, I strongly recommend to ALWAYS add -ffloat-store to CFLAGS (or
remove any -Ox) when compiling on Linux/Intel.

-k
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
r-testers mailing list -- For info or help, send "info" or "help",
To [un]subscribe, send "[un]subscribe"
(in the "body", not the subject !)  To: r-testers-request@stat.math.ethz.ch
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-