Skip Menu |

This queue is for tickets about the Math-Primality CPAN distribution.

Report information
The Basics
Id: 78379
Status: open
Priority: 0/
Queue: Math-Primality

People
Owner: Nobody in particular
Requestors: DANAJ [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 0.04
Fixed in: (no value)



Subject: Math::Primality and semi-large numbers
Multiplication in the Miller Rabin test is not done mod n, so will (1) take far too long and consume huge amounts of memory, and (2) eventually crash unable to continue as the number of digits in residue become absurdly large. I suggest replacing all the code inside the map { } of is_strong_pseudoprime() with: Rmpz_powm($residue, $residue, GMP->new(2), $n); if (Rmpz_cmp($residue, $m) == 0) { debug "$_:$residue == $m => spsp!" if DEBUG; return 1; } Alternately one can do a multiply and mod, or pull the GMP '2' out of the loop. For example: time perl -Iblib/lib -Iblib/arch -MMath::Primality=:all -Mbigint -E '$n = (10**25)->bstr; do { $n = next_prime($n); } for (1..10); say $n;' Takes 9.5s with the old code, 0.04s with the new code, and of course produces the same result. 10**48 runs in 0.06s with the new code. With the old code, my GMP crashes after a bit over a minute, perhaps unable to get more than 32GB of memory to store the product of two 3.2 million digit numbers. The results with the new code run with 10 ** 100 and 10k next_primes match those of Math::Prime::Util::GMP, which uses different code for next_prime and the strong MR and strong Lucas tests. It also matches Pari/GP 'n=10^100; for (x=1,10000,n=nextprime(n+1)); print(n)' which uses completely different code.
Dana, Thanks for the bug report and patch! I'm surprised anyone knows about my module at all. You're definitely right and I'll be patching the is_strong_pseudoprime function soon. Thanks again, Bob On Sat Jul 14 22:58:44 2012, DANAJ wrote: Show quoted text
> Multiplication in the Miller Rabin test is not done mod n, so will (1) > take far too long and consume huge amounts of memory, and (2) eventually > crash unable to continue as the number of digits in residue become > absurdly large. > > I suggest replacing all the code inside the map { } of > is_strong_pseudoprime() with: > > Rmpz_powm($residue, $residue, GMP->new(2), $n); > if (Rmpz_cmp($residue, $m) == 0) { > debug "$_:$residue == $m => spsp!" if DEBUG; > return 1; > } > > Alternately one can do a multiply and mod, or pull the GMP '2' out of > the loop. For example: > > time perl -Iblib/lib -Iblib/arch -MMath::Primality=:all -Mbigint -E '$n > = (10**25)->bstr; do { $n = next_prime($n); } for (1..10); say $n;' > > Takes 9.5s with the old code, 0.04s with the new code, and of course > produces the same result. > > 10**48 runs in 0.06s with the new code. With the old code, my GMP > crashes after a bit over a minute, perhaps unable to get more than 32GB > of memory to store the product of two 3.2 million digit numbers. > > The results with the new code run with 10 ** 100 and 10k next_primes > match those of Math::Prime::Util::GMP, which uses different code for > next_prime and the strong MR and strong Lucas tests. It also matches > Pari/GP 'n=10^100; for (x=1,10000,n=nextprime(n+1)); print(n)' which > uses completely different code.
I'll add this as a note, though it's really a separate issue, and not exactly a bug. I would suggest adding a small test in is_prime, such as: return 0 if Rmpz_cmp_ui($n, 2) == -1; return _is_small_prime($n) if Rmpz_cmp_ui($n, 257) == -1; foreach my $d (2, 3, 5, 7, 11, 13, 17, 19, 23, 29) { return 0 if Rmpz_divisible_ui_p($n, $d); } before the MR tests. This will speed up is_prime a lot for generic numbers. The case of 2 could be replaced with a test for Rmpz_even_p, and all the rest could be done as a single GCD, but when I last tested, while mpz_gcd_ui was slightly faster in C, it was slower in Math::GMPz (and you need the newest version of Math::GMPz to get a return value from Rmpz_gcd_ui). If this isn't done (or maybe even if it is, prime_count will run over 2x faster if written like: $primes++ if $n >= 2; $primes++ if $n >= 3; $primes++ if $n >= 5; $primes++ if $n >= 7; $primes++ if $n >= 11; for (my $i = GMP->new(13); Rmpz_cmp($i, $n) <= 0; Rmpz_add_ui($i, $i, 2)) { $primes++ if ($i % 3) && ($i % 5) && ($i % 7) && ($i % 11) && is_prime($i); } return $primes; The 3/5/7/11 startup and mod test could probably be safely removed if is_prime has them. Unfortunately, even at 2-3x faster, this is still *thousands* of times slower than sieve + bit count code like Math::Prime::Util uses. That's not a trivial fix, and as mentioned in the prime_count details, for large numbers something like Meissel, Meissel-Lehmer, Lagarias-Miller-Odlyzko, or one of the improved methods will work better. Even then, the time beyond 10^20 gets impractical for most purposes so approximations become more useful.