Skip Menu |

This queue is for tickets about the Bloom-Filter CPAN distribution.

Report information
The Basics
Id: 6852
Status: resolved
Priority: 0/
Queue: Bloom-Filter

People
Owner: Nobody in particular
Requestors: linux [...] johannes-niess.de
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 0.02
Fixed in: 1.0



Subject: Processing time for high key counts
Maciej, Thank you very much for Bloom::Filter 0.2 . I'm trying to use it with high key counts. With $max=1e3 (i.e keys) in the attached programme, I get processing times of 1.1 seconds (P4 2.4 GHz). With $max=1e4 it takes 43 seconds. That's a pretty bad scaling factor. Is this imminent to Bloom filters? The overall performance is in my eyes disappointing. Profiling tells that all the time is spent in _make_bitmask . I expected most of the time spent doing the sha1 hashes. What's going on here? Thank you very much for any feedback. Johannes Nieß bash-2.05b$ cat bloom.pl #! /usr/bin/perl -w use warnings; use strict; use Bloom::Filter; my $max = 0.5e4; my $bf = Bloom::Filter->new( capacity => $max, error_rate => 1e-3 ); $bf->add(1..$max); $bf->check(1..$max*2); bash-2.05b$ dprofpp -p ./bloom.pl Total Elapsed Time = 12.66966 Seconds User+System Time = 12.59966 Seconds Exclusive Times %Time ExclSec CumulS #Calls sec/call Csec/c Name 98.9 12.47 12.860 15000 0.0008 0.0009 Bloom::Filter::_make_bitmask 3.10 0.390 0.390 150000 0.0000 0.0000 Digest::SHA1::sha1 1.43 0.180 8.760 1 0.1800 8.7599 Bloom::Filter::check 0.71 0.090 4.370 1 0.0900 4.3700 Bloom::Filter::add 0.16 0.020 0.020 4 0.0050 0.0049 Bloom::Filter::BEGIN 0.08 0.010 0.030 1 0.0100 0.0298 main::BEGIN 0.00 - -0.000 1 - - warnings::BEGIN bash-2.05b$ perl -v This is perl, v5.8.2 built for i686-linux-thread-multi (on a Gentoo box with perl build from source)
Hi Johannes, Thanks for the bug report, I'll investigate and try to see why that bitmask subroutine is so slow. Send me email at maciej@ceglowski.com with your actual email address, if you like, and I can contact you directly later. Best, maciej
From: dvryaboy [...] cpan.org
Probably too late for the submitter, but for anyone browsing RT: This was fixed in the Feb 2007 release. On Mon Jul 05 16:09:29 2004, guest wrote: Show quoted text
> Maciej, > > Thank you very much for Bloom::Filter 0.2 . I'm trying to use it with > high key counts. With $max=1e3 (i.e keys) in the attached > programme, I get processing times of 1.1 seconds (P4 2.4 GHz). With > $max=1e4 it takes 43 seconds. That's a pretty bad scaling factor. > Is this imminent to Bloom filters? > > The overall performance is in my eyes disappointing. Profiling tells > that all the time is spent in _make_bitmask . I expected most of > the time spent doing the sha1 hashes. What's going on here? > > Thank you very much for any feedback. > > Johannes Nieß > > > bash-2.05b$ cat bloom.pl > #! /usr/bin/perl -w > use warnings; > use strict; > use Bloom::Filter; > > my $max = 0.5e4; > my $bf = Bloom::Filter->new( capacity => $max, error_rate => 1e-3 ); > $bf->add(1..$max); > $bf->check(1..$max*2); > > bash-2.05b$ dprofpp -p ./bloom.pl > Total Elapsed Time = 12.66966 Seconds > User+System Time = 12.59966 Seconds > Exclusive Times > %Time ExclSec CumulS #Calls sec/call Csec/c Name > 98.9 12.47 12.860 15000 0.0008 0.0009 > Bloom::Filter::_make_bitmask > 3.10 0.390 0.390 150000 0.0000 0.0000 Digest::SHA1::sha1 > 1.43 0.180 8.760 1 0.1800 8.7599 Bloom::Filter::check > 0.71 0.090 4.370 1 0.0900 4.3700 Bloom::Filter::add > 0.16 0.020 0.020 4 0.0050 0.0049 Bloom::Filter::BEGIN > 0.08 0.010 0.030 1 0.0100 0.0298 main::BEGIN > 0.00 - -0.000 1 - - warnings::BEGIN > > > bash-2.05b$ perl -v > > This is perl, v5.8.2 built for i686-linux-thread-multi > > (on a Gentoo box with perl build from source)