This queue is for tickets about the DBM-Deep CPAN distribution.

Report information
The Basics
Id: 70704
Status: open
Priority: 0/
Queue: DBM-Deep

People
Owner: Nobody in particular
Requestors: frech.christian [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 2.0004
Fixed in: (no value)



Subject: Reference.pm fails with 'Can't locate object method "find_md5"'
I am using DBM::Deep in a multi-threaded environment to store simple key/value pairs in a hash. From time to time it happens that a hash entry in the db goes corrupt for an unknown reason and then this entry is no longer accessible. If one tries to read such a corrupted entry, the following error message appears:

Can't locate object method "find_md5" via package "DBM::Deep::Sector::File::Scalar" at /home/cfa24/perl/bioperl/DBM/Deep/Sector/File/Reference.pm line 295.

I attached a database file that contains such a corrupted hash entry. Here is the code snippet that reproduces the error:

----
my $db = new DBM::Deep (
    file => "similarity_search_fasta35.tasks",
    locking => 1,
    autoflush => 1
);
$db->exists("pfa_PF14_0073");
----

I am using Perl v5.8.8.
Linux version: 2.6.23.17-88.fc7 #1 SMP Thu May 15 00:02:29 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
Subject: similarity_search_fasta35.tasks
Download similarity_search_fasta35.tasks
application/octet-stream 416.9k


On Fri Sep 02 16:34:02 2011, https://www.google.com/accounts/o8/id?id=AItOawmupDtd0dUa0UImxpx7bPxKoGG8NmVnsOA wrote:
> I am using DBM::Deep in a multi-threaded environment to store simple
> key/value pairs in a hash. From time to time it happens that a hash
> entry in the db goes corrupt for an unknown reason and then this entry
> is no longer accessible. If one tries to read such a corrupted entry,
> the following error message appears:

I don’t know that DBM::Deep can do much if the database is corrupt. Even if we do make it detect that, you’ll get an error of some sort.

However, it should not be corrupted to begin with. Is there any chance you could try and figure out why it’s becoming corrupt? Do you have multiple threads writing to the same database at the same time? I’ll do some experiments myself, but I can’t make any promises.
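As the reply above notes, reading a corrupted entry will raise an error of some sort even if detection improves. A minimal sketch of how a caller might trap that error so one bad entry does not abort the whole job, assuming only that the database object has an exists method that may die. The helper name try_exists is hypothetical and is not part of DBM::Deep.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of DBM::Deep): call $db->exists($key)
# and trap any exception instead of letting it kill the process.
# Returns ($found, undef) on success and (undef, $error) on failure.
sub try_exists {
    my ($db, $key) = @_;
    my $found = eval { $db->exists($key) ? 1 : 0 };
    return (undef, $@ || 'unknown error') unless defined $found;
    return ($found, undef);
}
```

Called as my ($found, $err) = try_exists($db, "pfa_PF14_0073"); the caller can log $err and skip the entry. This only contains the failure; it does not repair the file or explain the corruption.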
On Sun Sep 04 15:02:27 2011, SPROUT wrote:
> On Fri Sep 02 16:34:02 2011, https://www.google.com/accounts/o8/id?id=AItOawmupDtd0dUa0UImxpx7bPxKoGG8NmVnsOA wrote:
> > I am using DBM::Deep in a multi-threaded environment to store simple
> > key/value pairs in a hash. From time to time it happens that a hash
> > entry in the db goes corrupt for an unknown reason and then this entry
> > is no longer accessible. If one tries to read such a corrupted entry,
> > the following error message appears:
>
> I don’t know that DBM::Deep can do much if the database is corrupt.
> Even if we do make it detect that, you’ll get an error of some sort.
>
> However, it should not be corrupted to begin with. Is there any
> chance you could try and figure out why it’s becoming corrupt?
> Do you have multiple threads writing to the same database at
> the same time? I’ll do some experiments myself, but I can’t make any
> promises.

I’ve just found that creating a DBM::Deep object *before* creating a new thread, and then using that object in both threads, will cause database corruption almost immediately. Maybe I should update the documentation to warn against that.

Is that what you were doing, by any chance?
> Do you have multiple threads writing to the same database at the same time?
Yes, I do have multiple processes accessing the same database at the same time.
> I’ve just found that creating a DBM::Deep object *before* creating a
> new thread, and then using that object in both threads, will cause
> database corruption almost immediately. Maybe I should update the
> documentation to warn against that.
>
> Is that what you were doing, by any chance?

Here is some more information: I am using DBM::Deep in a computing grid environment where multiple processes access the same DBM::Deep database at the same time. Each process creates its own DBM::Deep object to access this database.

I found out that in my particular environment flock() on the DBM::Deep database file does not work, because the database file is accessed over NFS. This probably explains why the database goes corrupt. I am not sure if in such an environment it is safe at all to use DBM::Deep. Probably not. Can you confirm that?

I am thinking now of switching to a MySQL database to synchronize my processes and to have thread-safe read/write operations.
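The observation above, that flock() has no effect on this NFS mount, can be checked with a small probe. The sketch below is only an illustration using core modules: the function name flock_is_enforced is hypothetical, and it assumes a system where fork is available. The parent takes an exclusive lock, then a child process opens the same file and tries a non-blocking exclusive lock. If the child also gets the lock, locking is not being enforced.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use POSIX ();

# Hypothetical probe: returns 1 if an exclusive flock() held by this
# process keeps a second process from locking the same file, and 0 if
# the second process obtains the lock as well.
sub flock_is_enforced {
    my ($path) = @_;
    open(my $fh, '>>', $path) or die "Cannot open $path: $!";
    flock($fh, LOCK_EX) or die "Cannot lock $path: $!";

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: open a separate handle and try a non-blocking lock.
        # Exit code 1 = lock obtained, 0 = lock refused, 2 = open failed.
        open(my $fh2, '>>', $path) or POSIX::_exit(2);
        POSIX::_exit(flock($fh2, LOCK_EX | LOCK_NB) ? 1 : 0);
    }

    waitpid($pid, 0);
    my $status = $? >> 8;
    close($fh);    # closing the handle releases the parent's lock
    die "child could not open $path" if $status == 2;
    return $status == 0 ? 1 : 0;
}
```

A result of 1 only shows that the lock is honoured between two processes on the same host. It does not prove that locks are honoured between different NFS clients, which is the situation in the grid setup described above.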
On Mon Sep 05 22:47:42 2011, https://www.google.com/accounts/o8/id?id=AItOawmupDtd0dUa0UImxpx7bPxKoGG8NmVnsOA wrote:
> > I’ve just found that creating a DBM::Deep object *before* creating a
> > new thread, and then using that object in both threads, will cause
> > database corruption almost immediately. Maybe I should update the
> > documentation to warn against that.
> >
> > Is that what you were doing, by any chance?
>
> Here is some more information: I am using DBM::Deep in a computing grid
> environment where multiple processes access the same DBM::Deep database
> at the same time. Each process creates its own DBM::Deep object to
> access this database.
>
> I found out that in my particular environment flock() on the DBM::Deep
> database file does not work, because the database file is accessed over
> NFS. This probably explains why the database goes corrupt. I am not sure
> if in such an environment it is safe at all to use DBM::Deep. Probably
> not. Can you confirm that?

Yes. DBM::Deep relies on being able to lock the file. In fact, without dedicated server software, I don’t think it’s possible for any database to work safely in that environment.

I don’t know enough about file locking, but I would like to get it working some day if possible. I see this in perlfunc (under flock):

    Note also that some versions of "flock" cannot lock things over the
    network; you would need to use the more system-specific "fcntl" for
    that. If you like you can force Perl to ignore your system’s flock(2)
    function, and so provide its own fcntl(2)-based emulation, by passing
    the switch "-Ud_flock" to the Configure program when you configure perl.

> I am thinking now of switching to a MySQL database to synchronize my
> processes and to have thread-safe read/write operations.

That should work, if I understand MySQL correctly.
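The perlfunc passage quoted above refers to the d_flock setting chosen when perl is configured. One way to see how an installed perl was built is to read that setting from the core Config module, as in this small sketch. The function name perl_has_native_flock is hypothetical; the only assumption is that %Config carries the d_flock key, which is 'define' when Configure found a usable flock(2).

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: returns 1 if this perl was configured to call the
# system's flock(2) directly, and 0 if d_flock is not defined (in which
# case perl's flock falls back to an emulation where one is available).
sub perl_has_native_flock {
    my $value = $Config{d_flock};
    return (defined $value && $value eq 'define') ? 1 : 0;
}

print "native flock(2): ", (perl_has_native_flock() ? "yes" : "no"), "\n";
```

This only reports the build configuration. Whether locks actually work over a particular network filesystem still depends on the operating system and the server.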