Subject: | Bio::DB::Sam objects crash threads |
While trying to speed up an analysis using multi-threading, I realized that Bio::DB::Sam
objects do not seem to be thread safe. (Hmm... maybe I am not using exactly the right
terminology here.)
Attached is a very simple mock script that runs an analysis using 5 threads. If you don’t
provide BAM file(s) as arguments, the @bams array stays empty and the program terminates
normally. As soon as you instantiate a Bio::DB::Sam object by passing the location of a file as
an argument, the threads crash during the joining process. Interestingly enough, the
Bio::DB::Sam object is never passed to or used by the threads; the script only instantiates the
object without touching it (or even sending it to the threads).
I have not investigated the objects to find the source of the problem, but I thought it would be
important to see whether it can be fixed.
Thanks,
Marco
Subject: | test3.pl |
#!/usr/bin/perl
use strict;
use warnings;
use Bio::DB::Sam;
use threads;
use Thread::Queue;

MAIN: {
    my $threads = 5;

    # Instantiate one Bio::DB::Sam object per BAM file given on the command line.
    # The objects are never used and never passed to the worker threads.
    my @bams;
    @bams = map { Bio::DB::Sam->new(-bam => $_) } @ARGV if @ARGV;

    # Mock data: feature IDs grouped by chromosome.
    my %chrs = (
        A => [1 .. 10],
        B => [30 .. 45],
        C => [101 .. 123],
    );

    for my $chr (keys %chrs) {
        my @tc = @{ $chrs{$chr} };
        my $q  = Thread::Queue->new;
        $q->enqueue(@tc);

        my $num_workers = @tc < $threads ? @tc : $threads;    # no need to be wasteful :)
        for (1 .. $num_workers) {
            threads->new(\&worker, $q, $chr, $_);
        }

        # The crash occurs during this join when @bams holds at least one object.
        $_->join for threads->list;
        print "Done with chromosome $chr\n";
    }
    exit(0);
}

# Drain the shared queue, pretending to analyze each feature.
sub worker {
    my ($q, $chr, $th) = @_;
    while (defined(my $feat = $q->dequeue_nb)) {
        print "Analyzing feature $feat from chromosome $chr in thread $th\n";
        sleep 2;
    }
}