Subject: sum, min, and max use NV, truncating integers on 64-bit machines
The sum, min, and max routines use an NV type for operations on non-objects. On platforms with 64-bit UV and 64-bit NV (very common), the NV is typically an IEEE double with a 53-bit mantissa, so integers above 2**53 lose their low bits, e.g.:
use feature "say";
use List::Util "sum";
my $n = int(2**53);
say $n+13;
say int(sum($n,13));
There are more examples, e.g.
use feature "say";
use List::Util "sum";
my $n = ~0 - 3000;
say $n;
say $n+1000;
say int(sum($n,1000));
say int(sum($n,1000,1000,1000,1000,1000));
say int(sum($n, (1000) x 20000 ));
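Note that in this example we can add 1000 as many times as we like and the result is unchanged: assuming the NV is an IEEE double, adjacent values just below 2**64 are 2048 apart, so each addition of 1000 rounds back to the same value.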
Notes:
- With 32-bit UV and 64-bit NV we can't get in trouble.
- With Perl 5.6.2 we don't see it because Perl 5.6.2 is a horrible mess with 64-bit -- it internally converts things to NVs all over the place. Thankfully fixed in 5.8+.
- Compiling Perl with long doubles on gcc and x86_64 avoids the problem, as long doubles on this platform have 64-bit mantissas. Other compilers and architectures differ, and most people don't compile with long doubles.
- Don't sum large numbers or make big sums, and things are fine. I suspect this covers most people's use.
- Using bigint makes it work fine, at the expense of using bigint (super slow math).
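As a lighter-weight workaround than bigint, a plain accumulation loop appears to avoid the problem, because Perl's own + operator (5.8+) keeps integer operands as IV/UV for as long as the result fits. A minimal sketch, assuming a perl built with the default integer-preserving arithmetic; the name exact_sum is hypothetical and not part of List::Util:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of List::Util: accumulate with Perl's
# native +, which stays in IV/UV while the running total fits.
# Once the total overflows UV, Perl falls back to an NV and the
# result becomes inexact again.
sub exact_sum {
    my $total = 0;
    $total += $_ for @_;
    return $total;
}

my $n = ~0 - 3000;
print exact_sum($n, 1000), "\n";        # same value as $n + 1000
print exact_sum($n, 1000, 1000), "\n";  # same value as $n + 2000
```

This only helps while the total stays within the UV range; it does nothing for the overflow case.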
A similar issue was brought up a couple of years ago in https://rt.cpan.org/Ticket/Display.html?id=77457. Kevin and I both do some number theory, which is where getting inexact integer results causes havoc.
The same issue was seen in List::MoreUtils in https://rt.cpan.org/Ticket/Display.html?id=93207
The latter has an example of List::Util's min and max getting things wrong. Note that it was fixed in List::MoreUtils, so perhaps that fix contains useful ideas.
Opinion: min and max aren't so bad to fix, since we return the SV* from the stack; we only need to do more precise comparisons. sum is more problematic, since we have to compute the running sum and change types either when a new type appears in the input (e.g. sum(10,20,1.6)) or when we overflow. Also watch out for UV vs. IV.
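To illustrate the min/max half of this, here is a pure-Perl sketch of the "exact comparison, return the original element" idea. It relies on Perl's native < and > comparing two integer operands as IV/UV rather than as NVs (the default build behaviour in 5.8+). The names exact_max and exact_min are hypothetical; this is not the List::Util XS code, just a sketch of the behaviour a fix would need to reproduce:

```perl
use strict;
use warnings;

# Hypothetical helpers, not the List::Util implementation.
# Each keeps a copy of the winning element and returns it unmodified,
# so no value is ever round-tripped through an NV by these subs.
sub exact_max {
    return undef unless @_;
    my $max = shift;
    for my $x (@_) { $max = $x if $x > $max }
    return $max;
}

sub exact_min {
    return undef unless @_;
    my $min = shift;
    for my $x (@_) { $min = $x if $x < $min }
    return $min;
}

my $big = ~0;                            # largest UV
print exact_max($big - 1, $big), "\n";   # should print $big
print exact_min($big - 1, $big), "\n";   # should print $big - 1
```

Mixed integer and floating-point inputs still fall back to NV comparison here, which matches what the native operators do.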