# Re: [Rd] bug in sum() on integer vector

From: Hervé Pagès <hpages_at_fhcrc.org>
Date: Wed, 14 Dec 2011 17:51:13 -0800

On 11-12-14 08:19 AM, peter dalgaard wrote:
>
> On Dec 14, 2011, at 16:19 , John C Nash wrote:
>
>>
>> Following this thread, I wondered why nobody tried cumsum to see where the integer
>> overflow occurs. On the shorter xx vector in the little script below I get a message:
>>
>> Warning message:
>> Integer overflow in 'cumsum'; use 'cumsum(as.numeric(.))'
>>>
>>
>> But sum() does not give such a warning, which I believe is the point of contention. Since
>> cumsum() does manage to give such a warning, and show where the overflow occurs, should
>> sum() not be able to do so? For the record, I don't class the non-zero answer as an error
>> in itself. I regard the failure to warn as the issue.
>
> It (sum) does warn if you take the two "halves" separately. The issue is that the overflow is detected at the end of the summation, when the result is to be saved to an integer (which of course happens for all intermediate sums in cumsum)
>
>> x<- c(rep(1800000003L, 10000000), -rep(1200000002L, 15000000))
>> sum(x[1:10000000])
> [1] NA
> Warning message:
> In sum(x[1:1e+07]) : Integer overflow - use sum(as.numeric(.))
>> sum(x[10000001:25000000])
> [1] NA
> Warning message:
> In sum(x[10000001:1.5e+07]) : Integer overflow - use sum(as.numeric(.))
>> sum(x)
> [1] 4996000
>
> There's a pretty easy fix, essentially to move
>
>     if (s > INT_MAX || s < R_INT_MIN) {
>         warningcall(call, _("Integer overflow - use sum(as.numeric(.))"));
>         *value = NA_INTEGER;
>     }
>
> inside the summation loop. Obviously, there's a speed penalty from two FP comparisons per element, but I wouldn't know whether it matters in practice for anyone.
>
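
To make the quoted point concrete, here is a minimal sketch in plain C (not R's actual source) that redoes the arithmetic of the example exactly, in 64-bit integers. The helpers `sum_repeated()` and `fits_in_int()` are hypothetical names introduced only for illustration, and the lower bound `INT_MIN + 1` is an assumption standing in for R_INT_MIN. Each "half" lands far outside the int range, while the two halves cancel, so a range check done only once at the end of the summation sees a perfectly acceptable value.

```
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: exact sum of `count` copies of `value`,
 * accumulated in a 64-bit integer so no rounding can occur. */
static int64_t sum_repeated(int32_t value, int64_t count)
{
    assert(count >= 0);
    int64_t s = 0;
    for (int64_t i = 0; i < count; i++)
        s += value;
    return s;
}

/* Returns 1 if s is inside the range the quoted check accepts.
 * Assumption: the lower bound is INT_MIN + 1, standing in for R_INT_MIN. */
static int fits_in_int(int64_t s)
{
    return s <= INT_MAX && s >= (int64_t) INT_MIN + 1;
}
```

With these, `sum_repeated(1800000003, 10000000)` is 18000000030000000 and `sum_repeated(-1200000002, 15000000)` is its exact negative, so neither half passes `fits_in_int()` but their total (0) does. The non-zero total in the quoted transcript is presumably due to rounding in the floating-point accumulator (the quote refers to FP comparisons on `s`); the sketch does not try to reproduce that value.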

```
    if (warn && (s > INT_MAX || s < R_INT_MIN)) {
        /* generate the warning */
        warn = 0;
    }
```

with 'warn' initialized to 1. This makes the isum() function almost twice as slow on my machine (64-bit Ubuntu) when compiling with gcc -O2 and when no overflow occurs (the most common use case, I guess).
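
For anyone who wants to compare the two placements of the check side by side, here is a rough, self-contained sketch. It is not the real isum(): the function names, the `SKETCH_INT_MIN` macro (an assumed stand-in for R_INT_MIN) and the "return 1 on overflow" convention are all made up for the example, NA handling is left out, and the warning itself is reduced to a comment.

```
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumed stand-in for R_INT_MIN (INT_MIN itself is left out of the range). */
#define SKETCH_INT_MIN (INT_MIN + 1)

/* Variant 1: range check once, after the loop.
 * Returns 1 if the final sum is out of range, else stores it in *value. */
static int isum_check_at_end(const int *x, size_t n, int *value)
{
    assert(value != NULL);
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += x[i];
    if (s > INT_MAX || s < SKETCH_INT_MIN)
        return 1;
    *value = (int) s;
    return 0;
}

/* Variant 2: range check on every partial sum, guarded by 'warn'
 * so that the warning would be generated at most once. */
static int isum_check_in_loop(const int *x, size_t n, int *value)
{
    assert(value != NULL);
    double s = 0.0;
    int warn = 1;
    for (size_t i = 0; i < n; i++) {
        s += x[i];
        if (warn && (s > INT_MAX || s < SKETCH_INT_MIN)) {
            /* the real code would generate the warning here */
            warn = 0;
        }
    }
    if (!warn)
        return 1;       /* some partial sum left the int range */
    *value = (int) s;
    return 0;
}
```

On an input such as `{ INT_MAX, INT_MAX, -INT_MAX, -INT_MAX }` the first variant reports no overflow and stores 0, while the second flags the overflow at the second element. The extra cost is the per-element test on 'warn' plus the two comparisons, which only ever get skipped after an overflow has already been seen.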

Cheers,
H.

```
--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages_at_fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
```
Received on Thu 15 Dec 2011 - 01:53:31 GMT
