RE: [Rd] Inaccuracy in seq() function (PR#7503)

From: Ted Harding <Ted.Harding_at_nessie.mcc.ac.uk>
Date: Thu 13 Jan 2005 - 07:52:26 EST


On 12-Jan-05 vstolin@lordabbett.com wrote:

> [...]
> When generating the sequence using seq() function with
> non-integer numbers result is somewhat unpredictable.
> Example:
>> v1<-seq(1.60,1.90,.05)
>> v2<-c(1.60,1.65,1.70,1.75,1.80,1.85,1.90)
>> v1-v2
> [1] 0.000000e+00 2.220446e-16 2.220446e-16 0.000000e+00 0.000000e+00
> 0.000000e+00 2.220446e-16
>> v1==v2
> [1]  TRUE FALSE FALSE  TRUE  TRUE  TRUE FALSE
>> v1>1.70 & v2<1.80
> [1] FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE

This sort of thing is inevitable as a consequence of the fact that most decimal fractions are not stored accurately in a finite binary representation. It's not a bug in seq().

Example:

  1.65-(1.60+0.05)
[1] -2.220446e-16

(compare your second term in v1-v2).

Personally I always prefer to use scaled integer sequences when creating fractional sequences:

  v3 <- seq(160,190,5)/100

  v1-v3
[1] 0.000000e+00 2.220446e-16 2.220446e-16 0.000000e+00
[5] 0.000000e+00 0.000000e+00 2.220446e-16

  v3-v2
[1] 0 0 0 0 0 0 0

so my method agrees with your hand-forced v2.

But be watchful:

  v4 <- seq(160,190,5)*0.01
  v3-v4
[1] 0.000000e+00 -2.220446e-16 0.000000e+00 0.000000e+00
[5] 0.000000e+00 0.000000e+00 -2.220446e-16

so "/100" is not the same as "*0.01"! And you can similarly check that not only is v4 != v3, but neither is v4 != v1 nor is v4 != v2. So now we have three different results (v1, v2 == v3, v4).

Since (1.60+0.05) corresponds to the method used by seq(), and is not equal to 1.65, you might argue that seq() is wrong, but since the method used by seq() consists of adding the "by" the result of seq() is correct -- in its own terms!

If you know that your intended sequence should only have a fixed number of decimal places (in your case 2), you can force all methods to give the same result by using round():

  v1a <- round(seq(1.60,1.90,0.05),2)
  v2a <- round(c(1.60,1.65,1.70,1.75,1.80,1.85,1.90),2)
  v3a <- round(seq(160,190,5)/100,2)
  v4a <- round(seq(160,190,5)*0.01,2)

  v1a-v2a
[1] 0 0 0 0 0 0 0

  v1a-v3a
[1] 0 0 0 0 0 0 0

  v1a-v4a
[1] 0 0 0 0 0 0 0

so now they are indeed all equal. Also note that

  v2-v2a
[1] 0 0 0 0 0 0 0

  v3-v3a
[1] 0 0 0 0 0 0 0

so my "seq(160.190.5)/100" gives the same result as the rounded result, and as your hand-forced version. This is why I prefer it!

Best wishes,
Ted.



E-Mail: (Ted Harding) <Ted.Harding@nessie.mcc.ac.uk> Fax-to-email: +44 (0)870 094 0861 [NB: New number!]
Date: 12-Jan-05                                       Time: 20:52:26
------------------------------ XFMail ------------------------------

______________________________________________
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel Received on Thu Jan 13 07:11:35 2005

This archive was generated by hypermail 2.1.8 : Thu 13 Jan 2005 - 07:20:24 EST