
This queue is for tickets about the Bloom-Faster CPAN distribution.

Report information
The Basics
Id: 62750
Status: open
Priority: 0/
Queue: Bloom-Faster

People
Owner: Nobody in particular
Requestors: justin.m.cassidy [...] nasa.gov
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: bitvector: memset after malloc, hash details
Date: Fri, 5 Nov 2010 16:44:48 -0500
To: "bug-Bloom-Faster [...] rt.cpan.org" <bug-Bloom-Faster [...] rt.cpan.org>
From: "Cassidy, Justin M. (ARC-IQ)[PEROT SYSTEMS]" <justin.m.cassidy [...] nasa.gov>
Hi,

I've been using Bloom::Faster for a few years to generate indexes for each minute of a stream of incoming data. When upgrading to 64-bit systems, I noticed some problems that were show-stoppers in both 1.4 and 1.7, and after staring at the code for a while I found a fix.

On my CentOS 5 system, around 30% of the time the bit vector for the hash would get malloc'ed and come back already full of bits set to one. I confirmed this by logging the hash insert values and noticing that no inserts would occur on the bit vectors that were already stuffed with ones. I assumed a similar problem might occur with other malloc'ed entries such as the array of salts, so I did a memset there also... but I think sprintf makes this unnecessary.

Less importantly, your jenkins.c file has a 32-bit-optimized version of Bob Jenkins' hash code. On his website, there are somewhat vague instructions for how to create a 64-bit version of this hash function that's slightly faster, and I did this also.

I'm happy to send patches of what I did this weekend, but I don't consider myself an expert C or Perl guy. :)

Thanks,
Justin

Built on:
Ubuntu Karmic (2.6.31)
Perl 5.10.0
gcc 4.4.1

Ran on:
CentOS 5 x86_64 (2.6.18)
perl 5.8.8
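
For readers following the ticket, the failure mode described above (a freshly malloc'ed bit vector that already contains set bits) can be illustrated with a minimal sketch. This is not the patch from this ticket and not Bloom::Faster's actual code: the `bitvec` type and the `bitvec_*` function names are hypothetical, chosen only to show the general "memset after malloc" pattern that the subject line refers to.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bit-vector type, for illustration only. */
typedef struct {
    unsigned char *bits;  /* backing storage, one bit per slot */
    size_t nbits;         /* number of usable bits */
} bitvec;

/* Allocate a vector of nbits bits with every bit cleared.
 * Returns 0 on success, -1 on failure (nbits == 0 or out of memory). */
int bitvec_init(bitvec *bv, size_t nbits)
{
    /* Round up to whole bytes without risking overflow in nbits + 7. */
    size_t nbytes = nbits / CHAR_BIT + (nbits % CHAR_BIT != 0);

    if (nbytes == 0)
        return -1;
    bv->bits = malloc(nbytes);
    if (bv->bits == NULL)
        return -1;
    /* malloc leaves the contents indeterminate, so clear them explicitly.
     * Without this, stale ones can make the filter look already full. */
    memset(bv->bits, 0, nbytes);
    bv->nbits = nbits;
    return 0;
}

/* Set bit i. */
void bitvec_set(bitvec *bv, size_t i)
{
    assert(i < bv->nbits);
    bv->bits[i / CHAR_BIT] |= (unsigned char)(1u << (i % CHAR_BIT));
}

/* Return 1 if bit i is set, 0 otherwise. */
int bitvec_test(const bitvec *bv, size_t i)
{
    assert(i < bv->nbits);
    return (bv->bits[i / CHAR_BIT] >> (i % CHAR_BIT)) & 1u;
}

/* Release the storage and reset the struct. */
void bitvec_free(bitvec *bv)
{
    free(bv->bits);
    bv->bits = NULL;
    bv->nbits = 0;
}
```

Calling `calloc(nbytes, 1)` instead of `malloc` followed by `memset` should have the same effect, since `calloc` returns zero-filled memory. Either way, a Bloom filter only gives meaningful answers if its bit vector starts out with all bits cleared.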
From: jtk [...] northwestern.edu
I'm interested in seeing the patch, thank you.