Subject: | Processing time for high key counts |
Maciej,
Thank you very much for Bloom::Filter 0.2 . I'm trying to use it with high key counts. With $max=1e3 (i.e keys) in the attached programme, I get processing times of 1.1 seconds (P4 2.4 GHz). With $max=1e4 it takes 43 seconds. That's a pretty bad scaling factor. Is this imminent to Bloom filters?
The overall performance is in my eyes disappointing. Profiling tells that all the time is spent in _make_bitmask . I expected most of the time spent doing the sha1 hashes. What's going on here?
Thank you very much for any feedback.
Johannes Nieß
bash-2.05b$ cat bloom.pl
#! /usr/bin/perl -w
use warnings;
use strict;
use Bloom::Filter;
my $max = 0.5e4;
my $bf = Bloom::Filter->new( capacity => $max, error_rate => 1e-3 );
$bf->add(1..$max);
$bf->check(1..$max*2);
bash-2.05b$ dprofpp -p ./bloom.pl
Total Elapsed Time = 12.66966 Seconds
User+System Time = 12.59966 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c Name
98.9 12.47 12.860 15000 0.0008 0.0009 Bloom::Filter::_make_bitmask
3.10 0.390 0.390 150000 0.0000 0.0000 Digest::SHA1::sha1
1.43 0.180 8.760 1 0.1800 8.7599 Bloom::Filter::check
0.71 0.090 4.370 1 0.0900 4.3700 Bloom::Filter::add
0.16 0.020 0.020 4 0.0050 0.0049 Bloom::Filter::BEGIN
0.08 0.010 0.030 1 0.0100 0.0298 main::BEGIN
0.00 - -0.000 1 - - warnings::BEGIN
bash-2.05b$ perl -v
This is perl, v5.8.2 built for i686-linux-thread-multi
(on a Gentoo box with perl build from source)