Re: [Rd] seq() function accuracy inacceptable

From: Thomas Lumley <tlumley_at_u.washington.edu>
Date: Tue 18 Apr 2006 - 18:04:18 GMT


> The seq-command produces unnecessarily inaccurate results, which can be extremely
> annoying. I absolutely do not see the necessity of numerical garbage
> to appear in the following simple case. E.g. try this:
> > seq(61.55, 62.00, by=0.01) - round(seq(61.55, 62.00, by=0.01), digits=2)

An even simpler case may help explain why this is not *unnecessary* inaccuracy.

Consider the three expressions

   2+0.01+0.01
   2+0.01*2
   2.02

These need not give the same answer. As it happens, on my computer 2.02 and 2+0.01*2 are the same, but they differ by the smallest representable amount from 2+0.01+0.01. All three could be different in other examples.
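
The comparison can be checked directly. What follows is only a minimal sketch, assuming ordinary IEEE 754 double-precision arithmetic; the variable names are made up for illustration, and which of the three values agree may differ on another platform or compiler.

```r
# Three ways of arriving at "2.02" in double-precision arithmetic.
added   <- 2 + 0.01 + 0.01
scaled  <- 2 + 0.01 * 2
literal <- 2.02

# Print 17 significant digits so a difference in the last bit is visible.
print(sprintf("%.17g", c(added, scaled, literal)))

# Exact comparison of each computed value against the literal.
print(c(added = added == literal, scaled = scaled == literal))

# Size of the discrepancy, in units of .Machine$double.eps.
print((literal - added) / .Machine$double.eps)

# Comparison with a tolerance, which is usually what is wanted.
print(isTRUE(all.equal(added, literal)))
```

Testing with all.equal() rather than == is the usual way to ask whether two computed numbers are "the same" for practical purposes.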

Since you think the correct output of seq() is easy to determine, which of these should be equal to the third element of seq(2, 3, by=0.01)?

By the way, seq() is an interesting example, because the code goes to some effort to do the sort of thing you want. It is designed to give less accurate answers so as to be consistent with naive expectations when 'to'-'from' is close to a multiple of 'by'. This doesn't affect your example, but if you had used seq(61.56,62,by=0.01) you would have benefitted from the fact that, although (62-61.56)/0.01 is very slightly less than 44, seq() still includes the 44th step. In general, though, R is better off using as much accuracy as possible for a given computation rather than trying to guess what a user will want to use it for.
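
The seq(61.56,62,by=0.01) case can be sketched as follows, again assuming IEEE 754 doubles. The names from, to, by and steps are just local variables for the illustration, and nothing here relies on the details of how seq() is implemented.

```r
from <- 61.56
to   <- 62
by   <- 0.01

# Number of steps of size 'by' that fit between 'from' and 'to',
# as computed in floating point.
steps <- (to - from) / by
print(sprintf("%.17g", steps))

# A naive count of whole steps truncates this value.
print(floor(steps))

# seq() allows for a small amount of rounding error when counting steps.
s <- seq(from, to, by = by)
print(length(s))

# The result never overshoots 'to'.
print(max(s) <= to)
```

If the computed step count comes out just below 44, floor() gives 43 whole steps, while seq() still returns 45 elements (the starting value plus 44 steps).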

         -thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley@u.washington.edu	University of Washington, Seattle

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Wed Apr 19 04:06:22 2006