Bug #29210 for Math-Currency: as_int() method can return incorrect value

Thu Sep 06 12:20:58 2007 RYBSKEJ [...] cpan.org - Ticket created

Subject:

as_int() method can return incorrect value

Since as_int() method uses multiplication to express the integer part of the number, the CPU must translate this to a natural floating point number and perform the operation. Unfortunately, this introduces floating point approximation errors, leading to as_int amounts that report *incorrectly* compared to their actual value as stored in Math::Currency class. I have attached a script that shows such error. This script reported numerous as_int() floating point errors on 64-bit Perl on Solaris 9. (My Perl configuration information is attached to this report). I found that in order to solve this, I ended up having to greatly increase the floating point accuracy of the mathematical result depending on the current value of the the number in the class, and then shift off the approximation errors. For example, as a functional patch for USD currency (2-decimal accuracy), I modified my local Math::Currency class to perform an inverse logarithmetic adjustment (dependent on the size of the internal integer number) as follows: sub as_int { #with extra step to force 1 extra (internal) level of #fractional precision my $self = shift; my $prec_adj = int(7 + (4 / (log($MBI->_len( $self->{_m} )) || 1))); $prec_adj = $prec_adj > 10 ? 10 : $prec_adj < 2 ? 2 : $prec_adj; return "0" unless $MBI->_len( $self->{_m} ); my $num_str = Math::BigInt->new( $self->as_float * 10**($self->format()->{FRAC_DIGITS} + $self->{_prec_adj}) )->bstr; return length($num_str) > 1 ? substr($num_str,0, -1 * $self->{_prec_adj}) : $num_str; } The mathematical adjustment choice was based on a relatively rough approximation of the relative level of floating point approximation error for 64-bit double values, based on the size of the integer and decimal part together. I then used truncation to remove the error-prone decimal part section back to the original value decimal place accuracy. Using brute-force testing, I have verified that this gives 100% accurate amount results up to $100,000,000.00 (2-decimal place values, 64-bit doubles). It's highly likely that this same method would work fine for 3-4 decimal place values. I recommend additional research be done determine the best level of floating point accuracy required for portability with 32 and 64 bit double floats, but for the time being, I strongly recommend integration of a general solution like the one I presented be done to guarantee that as_int() method returns 100% accurate values. -Eric Rybski (rybskej@yahoo.com)

Subject:

math_currency_as_int_fp_err.pl

#!/usr/bin/perl use sigtrap qw(die untrapped normal-signals stack-trace error-signals); use Math::Currency; my $start = shift @ARGV || 0; my $dollar_range = shift @ARGV || 10000000; my ($iter, $curr_err_cnt, $curr2_err_cnt) = (0, 0, 0); my $time = time(); warn "Test range: $start..$dollar_range"; for my $dollar ($start..$dollar_range) { for my $cent (0..99) { my $amt = "$dollar.".($cent < 10 ? "0$cent" : $cent); my $int = ($dollar ? $dollar : "") .($dollar ? $cent < 10 ? "0$cent" : $cent : $cent); my $curr = new Math::Currency $amt; my $curr_int = $curr->as_int; if ($curr_int ne $int) { warn " Math::Currency $curr(".$curr_int.") ne $int\n"; $curr_err_cnt++; } warn "processed $amt so far...\n" if $dollar % 1000 == 0 && $cent % 100 == 0; $iter++; } } END { my $dr = new Math::Currency $dollar_range; print "Done! (took ".(time() - $time)." seconds)\n"; print "From \$0 to $dr ($iter iterations)...\n"; warn "\tFound $curr_err_cnt Math::Currency fp errors\n"; }

Subject:

perl5_sol9_config.txt

Summary of my perl5 (revision 5 version 8 subversion 7) configuration: Platform: osname=solaris, osvers=2.9, archname=sun4-solaris-64 uname='sunos usadev01 5.9 generic_117171-17 sun4u sparc sunw,sun-fire-v210 ' config_args='-Dcc=gcc -Doptimize=-O2 -Duse_large_files -Duse64bitall -Dusemorebits -Accflags=-mcpu=v9 -m64 -I/usr/local/include/v9 -I/usr/local/include -Aldflags=-mcpu=v9 -m64 -L/usr/lib/sparcv9 -L/usr/local/lib/sparcv9 -L/usr/local/BerkeleyDB64/lib -L/usr/lib -L/usr/local/lib -R/usr/lib/sparcv9 -R/usr/local/lib/sparcv9 -R/usr/local/BerkeleyDB64/lib -Alddlflags=-mcpu=v9 -m64 -L/usr/lib/sparcv9 -L/usr/local/lib/sparcv9 -L/usr/local/BerkeleyDB64/lib -L/usr/lib -L/usr/local/lib -R/usr/lib/sparcv9 -R/usr/local/lib/sparcv9 -R/usr/local/BerkeleyDB64/lib' hint=recommended, useposix=true, d_sigaction=define usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef useperlio=define d_sfio=undef uselargefiles=define usesocks=undef use64bitint=define use64bitall=define uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='gcc', ccflags ='-mcpu=v9 -m64 -Wa,-xarch=v9 -mcpu=v9 -m64 -I/usr/local/include/v9 -I/usr/local/include -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/local/BerkeleyDB64/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64', optimize='-O2', cppflags='-mcpu=v9 -m64 -Wa,-xarch=v9 -mcpu=v9 -m64 -I/usr/local/include/v9 -I/usr/local/include -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/local/BerkeleyDB64/include' ccversion='', gccversion='3.4.2', gccosandvers='solaris2.9' intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=87654321 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16 ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=8, prototype=define Linker and Libraries: ld='gcc', ldflags =' -m64 -mcpu=v9 -m64 -L/usr/lib/sparcv9 -L/usr/local/lib/sparcv9 -L/usr/local/BerkeleyDB64/lib -L/usr/lib -L/usr/local/lib -R/usr/lib/sparcv9 -R/usr/local/lib/sparcv9 -R/usr/local/BerkeleyDB64/lib ' libpth=/usr/lib/sparcv9 /usr/local/lib/sparcv9 /usr/local/BerkeleyDB64/lib libs=-lsocket -lnsl -ldb -ldl -lm -lc -lpthread perllibs=-lsocket -lnsl -ldl -lm -lc -lpthread libc=/usr/lib/sparcv9/libc.so, so=so, useshrplib=true, libperl=libperl.so gnulibc_version='' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' -R /usr/local/lib/perl5/5.8.7/sun4-solaris-64/CORE' cccdlflags='-fPIC', lddlflags=' -G -m64 -mcpu=v9 -m64 -L/usr/lib/sparcv9 -L/usr/local/lib/sparcv9 -L/usr/local/BerkeleyDB64/lib -L/usr/lib -L/usr/local/lib -R/usr/lib/sparcv9 -R/usr/local/lib/sparcv9 -R/usr/local/BerkeleyDB64/lib' Characteristics of this binary (from libperl): Compile-time options: USE_64_BIT_INT USE_64_BIT_ALL USE_LARGE_FILES Built under solaris Compiled at Sep 16 2005 17:16:32 @INC: /usr/local/lib/perl5/5.8.7/sun4-solaris-64 /usr/local/lib/perl5/5.8.7 /usr/local/lib/perl5/site_perl/5.8.7/sun4-solaris-64 /usr/local/lib/perl5/site_perl/5.8.7 /usr/local/lib/perl5/site_perl .

Thu Sep 06 15:53:01 2007 jpeacock [...] cpan.org - Taken

Thu Sep 06 16:30:17 2007 jpeacock [...] cpan.org - Correspondence added

On Thu Sep 06 12:20:58 2007, RYBSKEJ wrote: Show quoted text

> Since as_int() method uses multiplication to express the integer > part of the number, the CPU must translate this to a natural floating > point number and perform the operation. Unfortunately, this > introduces floating point approximation errors, leading to as_int > amounts that report *incorrectly* compared to their actual value as > stored in Math::Currency class.

Um, no, that would just be a symptom of my having done this without thinking about it carefully enough. The whole point of using Math::BigFloat/BigInt as a superclass for Math::Currency is so that I don't have to worry about rounding errors (nor play the kind of games you did to get it to "work"). Try this instead: === lib/Math/Currency.pm ================================================================== --- lib/Math/Currency.pm (revision 647) +++ lib/Math/Currency.pm (local) @@ -304,8 +304,11 @@ sub as_int { my $self = shift; - return Math::BigInt->new( - $self->as_float * 10**$self->format()->{FRAC_DIGITS} )->bstr; + my $format = $self->format; + my $bint = Math::BigInt->new($self->copy-> + bfround( -$format->{FRAC_DIGITS})-> + bmul(10**$format->{FRAC_DIGITS})->SUPER::bstr); + return $bint->bstr; } # we override the default here because we only want to compare the precision of and let me know if that fixes your problems. John

Thu Sep 06 16:30:42 2007 The RT System itself - Status changed from 'new' to 'open'

Sun Sep 09 22:27:07 2007 RYBSKEJ [...] cpan.org - Correspondence added

From:

RYBSKEJ [...] cpan.org

On Thu Sep 06 16:30:17 2007, JPEACOCK wrote: Show quoted text

> > Try this instead: > > === lib/Math/Currency.pm > ================================================================== > --- lib/Math/Currency.pm (revision 647) > +++ lib/Math/Currency.pm (local) > @@ -304,8 +304,11 @@ > > sub as_int { > my $self = shift; > - return Math::BigInt->new( > - $self->as_float * 10**$self->format()->{FRAC_DIGITS} )->bstr; > + my $format = $self->format; > + my $bint = Math::BigInt->new($self->copy-> > + bfround( -$format->{FRAC_DIGITS})-> > + bmul(10**$format->{FRAC_DIGITS})->SUPER::bstr); > + return $bint->bstr; > } > > # we override the default here because we only want to compare the > precision of > > and let me know if that fixes your problems. >

John, I appreciate the use of Math::BigFloat methods to solve the problem, as I also overlooked them when I made my quick patch. As a precaution, I tested your patch from $0 to to $1M today with no fp errors, so I can say with a high degree of confidence that your fix solves the problem correctly and portably. The only down side is the extra overhead Math::BigFloat methods introduce here, as shown in the Benchmark comparisons below: 44% slower than the semi-kludgy method I used (which was about 10% slower than the original, buggy as_int method). (Performance is an important consideration for the software I'm using this for, as it handles millions of financial transactions daily.) Perhaps this (or Math::BigFloat) can be optimized to better handle this in the future. (Note: I did not yet test if using libgmp with Math::BigInt would improve on this performance.) -Eric $ # unmodified Math::Currency 0.4502 $ perl -MBenchmark=timethis -MMath::Currency -e 'timethis(-5, sub { Math::Currency->new($i++.'.'.50)->as_int(); });' timethis for 5: 5 wallclock secs ( 4.52 usr + 0.48 sys = 5.00 CPU) @ 1770.40/s (n=8852) $ # Math::Currency 0.4502 with Eric's as_int() rough approximation error trim method $ perl -MBenchmark=timethis -MMath::Currency -e 'timethis(-5, sub { Math::Currency->new($i++.'.'.50)->as_int(); });' timethis for 5: 5 wallclock secs ( 4.80 usr + 0.28 sys = 5.08 CPU) @ 1609.02/s (n=8169) $ # Math::Currency 0.4502 with as_int() author-recommended changes $ perl -MBenchmark=timethis -MMath::Currency -e 'timethis(-5, sub { Math::Currency->new($i++.'.'.50)->as_int(); });' timethis for 5: 5 wallclock secs ( 4.67 usr + 0.39 sys = 5.06 CPU) @ 905.57/s (n=4584)

Mon Sep 10 05:04:28 2007 zenoparadoxus [...] gmail.com - Correspondence added

Subject:	Re: [rt.cpan.org #29210] as_int() method can return incorrect value
Date:	Mon, 10 Sep 2007 05:02:27 -0400
To:	bug-Math-Currency [...] rt.cpan.org
From:	"John Peacock" <zenoparadoxus [...] gmail.com>

Message body is not shown because sender requested not to inline it.

On 9/9/07, via RT <bug-Math-Currency@rt.cpan.org> wrote: Show quoted text

> > The only down side is the extra overhead Math::BigFloat methods > introduce here, as shown in the Benchmark comparisons below: 44% slower > than the semi-kludgy method I used (which was about 10% slower than the > original, buggy as_int method). (Performance is an important > consideration for the software I'm using this for, as it handles > millions of financial transactions daily.) >

A) You're making me very nervous talking about using my pathetic little class to perform millions of financial transactions daily... :-o B) I had another idea for efficiency - try the attached (which my my reckoning is even faster than the unpatched version)... John

Sat Sep 22 00:54:04 2007 RYBSKEJ [...] cpan.org - Correspondence added

On Mon Sep 10 05:04:28 2007, zenoparadoxus@gmail.com wrote: Show quoted text

> > A) You're making me very nervous talking about using my pathetic little > class to perform millions of financial transactions daily... :-o > > B) I had another idea for efficiency - try the attached (which my my > reckoning is even faster than the unpatched version)... > > John

John, Perhaps we're approaching this problem too pragmatically. How about the following solution: 296,297c296,298 < return Math::BigInt->new( < $self->as_float * 10**$self->format()->{FRAC_DIGITS} )->bstr; --- Show quoted text

> (my $str = $self->as_float) =~ s/\.//o; > $str =~ s/^(\-?)0+/$1/o; > return $str eq '' ? '0' : $str;

perl -MBenchmark=timethis -MMath::Currency -e 'timethis(-5, sub { Math::Currency->new($i++.'.'.50)->as_int(); });' timethis for 5: 5 wallclock secs ( 4.55 usr + 0.86 sys = 5.41 CPU) @ 2155.97/s (n=11653) It's the fastest so far, yet still fully integer-accurate (positive and negative). I ran through a quick test of -2000.00 to 2000.00 without any errors. For a full performance comparison, I ran an as_int() comparison test of Math::BigFloat vs. Math::Currency: # Math::BigFloat (note: doesn't round like Math::Currency does) $ perl -MBenchmark=timethis -MMath::Currency -e 'my $n=Math::BigFloat->new("1.50"); timethis(-5, sub { $n->as_int(); });' timethis for 5: 6 wallclock secs ( 5.31 usr + 0.00 sys = 5.31 CPU) @ 9177.87/s (n=48762) # with the above-documented patch $ perl -MBenchmark=timethis -MMath::Currency -e 'my $n=Math::Currency->new("1.50"); timethis(-5, sub { $n->as_int(); });' timethis for 5: 5 wallclock secs ( 4.36 usr + 0.75 sys = 5.11 CPU) @ 3822.27/s (n=19528) # with your suggested patch $ perl -MBenchmark=timethis -MMath::Currency -e 'my $n=Math::Currency->new("1.50"); timethis(-5, sub { $n->as_int(); });' timethis for 5: 5 wallclock secs ( 4.84 usr + 0.33 sys = 5.17 CPU) @ 631.74/s (n=3268) At best, performance is 41.6% of the base class, so there's still room for improvement. To really wanted to maximize performance (and if you don't mind using Math::BigInt internal methods instead of public methods) you could probably emulate Math::BigFloat's as_int() (alias for as_number) method, with currency format-specific rounding. But for now, if you're satisfied with the above patch, I'd say go ahead and integrate it immediately into your next release so that we can get a reliable as_int() method. I'll continue to research ways to make as_int even faster. Thanks, Eric

Sun Sep 23 08:01:42 2007 john.peacock [...] havurah-software.org - Correspondence added

Subject:	Re: [rt.cpan.org #29210] as_int() method can return incorrect value
Date:	Sun, 23 Sep 2007 08:01:44 -0400
To:	bug-Math-Currency [...] rt.cpan.org
From:	John Peacock <john.peacock [...] havurah-software.org>

Eric Rybski via RT wrote: Show quoted text

> Perhaps we're approaching this problem too pragmatically. How about > the following solution: > > 296,297c296,298 > < return Math::BigInt->new( > < $self->as_float * 10**$self->format()->{FRAC_DIGITS} )->bstr; > ---

>> (my $str = $self->as_float) =~ s/\.//o; >> $str =~ s/^(\-?)0+/$1/o; >> return $str eq '' ? '0' : $str;

I thought that was going to be cheating, since you aren't explicitly dealing with the case where the object as stored might have more accuracy than FRAC_DIGITS. But then I realized that as_float already takes care of that. Show quoted text

> But for now, if you're satisfied with the above patch, I'd say go ahead > and integrate it immediately into your next release so that we can get a > reliable as_int() method. I'll continue to research ways to make as_int > even faster.

I'll release that to CPAN now (as soon as I can craft a test case that fails with the old code on a 32-bit platform, so I can prevent regressions later on). Any further performance improvements will need to taget as_float() now, since the above is as straightforward as it could be... John

Wed Jan 30 21:10:28 2008 jpeacock [...] cpan.org - Correspondence added

This was resolved with 0.46 (but I never closed the ticket).

Wed Jan 30 21:10:33 2008 jpeacock [...] cpan.org - Status changed from 'open' to 'resolved'