From: John C Nash <nashjc_at_uottawa.ca>

Date: Wed, 14 Dec 2011 16:16:31 -0500

>> On Dec 14, 2011, at 16:19 , John C Nash wrote:
>>
>>> Following this thread, I wondered why nobody tried cumsum to see where the integer
>>> overflow occurs. On the shorter xx vector in the little script below I get a message:
>>>
>>> Warning message:
>>> Integer overflow in 'cumsum'; use 'cumsum(as.numeric(.))'
>>> >
>>>
>>> But sum() does not give such a warning, which I believe is the point of contention. Since
>>> cumsum() does manage to give such a warning, and show where the overflow occurs, should
>>> sum() not be able to do so? For the record, I don't class the non-zero answer as an error
>>> in itself. I regard the failure to warn as the issue.
>>
>> It (sum) does warn if you take the two "halves" separately. The issue is that the
>> overflow is detected at the end of the summation, when the result is to be saved to an
>> integer (which of course happens for all intermediate sums in cumsum)
>>
>> > x <- c(rep(1800000003L, 10000000), -rep(1200000002L, 15000000))
>> > sum(x[1:10000000])
>> [1] NA
>> Warning message:
>> In sum(x[1:1e+07]) : Integer overflow - use sum(as.numeric(.))
>> > sum(x[10000001:25000000])
>> [1] NA
>> Warning message:
>> In sum(x[10000001:1.5e+07]) : Integer overflow - use sum(as.numeric(.))
>> > sum(x)
>> [1] 4996000
>>
>> There's a pretty easy fix, essentially to move
>>
>>     if (s > INT_MAX || s < R_INT_MIN) {
>>         warningcall(call, _("Integer overflow - use sum(as.numeric(.))"));
>>         *value = NA_INTEGER;
>>     }
>>
>> inside the summation loop. Obviously, there's a speed penalty from two FP comparisons
>> per element, but I wouldn't know whether it matters in practice for anyone.
>>
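The fix described in the quoted message can be illustrated with a small standalone sketch. It is not the code from the R sources: the function names `sum_check_at_end` and `sum_check_in_loop`, the `overflowed` flag (standing in for `warningcall`), and the `SKETCH_*` macros (standing in for `NA_INTEGER` and `R_INT_MIN`) are all assumptions made for illustration, and a 32-bit `int` is assumed. Both versions accumulate in a `double`, as the quoted snippet suggests, and differ only in where the range check sits.

```c
#include <limits.h>
#include <stddef.h>

/* Stand-ins for R's NA_INTEGER and R_INT_MIN (assumed values for this sketch). */
#define SKETCH_NA_INTEGER INT_MIN
#define SKETCH_INT_MIN (INT_MIN + 1)

/* Range check only after the loop: an intermediate overflow goes unnoticed
 * whenever later terms bring the total back into the int range. */
int sum_check_at_end(const int *x, size_t n, int *overflowed)
{
    double s = 0.0;
    *overflowed = 0;
    for (size_t i = 0; i < n; i++)
        s += x[i];
    if (s > INT_MAX || s < SKETCH_INT_MIN) {
        *overflowed = 1;
        return SKETCH_NA_INTEGER;
    }
    return (int) s;
}

/* Range check moved inside the loop: the first running total that leaves the
 * int range is reported, at the cost of two comparisons per element. */
int sum_check_in_loop(const int *x, size_t n, int *overflowed)
{
    double s = 0.0;
    *overflowed = 0;
    for (size_t i = 0; i < n; i++) {
        s += x[i];
        if (s > INT_MAX || s < SKETCH_INT_MIN) {
            *overflowed = 1;
            return SKETCH_NA_INTEGER;
        }
    }
    return (int) s;
}
```

For an input such as `{2000000000, 2000000000, -2000000000, -2000000000}`, the first version returns 0 without setting the flag, while the second sets the flag at the second element. This mirrors, on a small scale, the difference between `sum(x)` and the two "halves" in the transcript.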

https://stat.ethz.ch/mailman/listinfo/r-devel Received on Wed 14 Dec 2011 - 21:32:06 GMT


I agree that where the overflow occurs is not critical (one can go back to cumsum and find out). I am assuming that Uwe still wants to know that there has been an overflow at some point, i.e., to get a warning. This could become more "interesting" as parallel computation causes different summation orderings on sums of large numbers of items.

JN
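The remark about summation orderings can be made concrete with a minimal sketch, assuming a per-element range check of the kind discussed in this thread and a 32-bit `int`. The helper name `running_sum_overflows` is hypothetical. It reports whether any running total leaves the `int` range, so the same multiset of values may or may not trigger the condition depending on the order in which it is summed.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if any running total leaves the int range
 * (INT_MIN itself is treated as out of range, as a stand-in for NA), else 0. */
int running_sum_overflows(const int *x, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        s += x[i];
        if (s > INT_MAX || s < INT_MIN + 1)
            return 1;
    }
    return 0;
}
```

With the values grouped as `{2000000000, 2000000000, -2000000000, -2000000000}` the helper returns 1, whereas the interleaved order `{2000000000, -2000000000, 2000000000, -2000000000}` returns 0, although both sum to zero. A summation split across workers would in effect choose one such ordering.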

On 12/14/2011 03:58 PM, Uwe Ligges wrote:

> On 14.12.2011 17:19, peter dalgaard wrote:
>> On Dec 14, 2011, at 16:19 , John C Nash wrote:
>
> I don't think I am interested in where the overflow happens if I call sum()...
>
> Uwe

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
