Subject: | Memory leak when invoking methods like numberOfMessages() and endTimeEstimate() on a thread object |
Hi,
When processing the mbox file at http://www.spinics.net/lists/kernel/mbox/0912.mbox.gz
with Mail::Box 2.093 (Perl v5.10.0 built for x86_64-linux-gnu-thread-multi on
2.6.31-19-generic #56-Ubuntu SMP Thu Jan 28 02:39:34 UTC 2010 x86_64 GNU/Linux) to extract all
threads, I consistently hit a memory leak: memory use grows quickly until it has consumed all
available RAM and the machine crashes.
In particular, the leak only appears after the email threads have been constructed, when invoking
methods such as numberOfMessages() and endTimeEstimate() that have to traverse
all messages of a thread. Methods like startTimeEstimate() do not cause a problem.
A quick hack that avoids the leak is to remove all "References:" lines from the mbox file, but this
makes the thread construction less precise.
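For reference, the workaround above can be sketched as a small filter. This is only a minimal sketch and not part of the attached program: the script name strip_refs.pl and the helper make_references_filter are made up for illustration, and it assumes that every line starting with "From " is an mbox message separator (i.e. that such lines inside message bodies are escaped as ">From ").

```perl
#!/usr/bin/perl
# strip_refs.pl (hypothetical name): drop "References:" header fields from mbox files
use strict;
use warnings;

# Returns a closure that is called once per line, in file order, and
# returns 1 if the line should be kept and 0 if it should be dropped.
sub make_references_filter {
    my $in_header = 0;   # true between a "From " separator and the first blank line
    my $skipping  = 0;   # true while inside a References: field that is being dropped
    return sub {
        my ($line) = @_;
        # Assumption: any line starting with "From " begins a new message
        if ($line =~ /^From /) { $in_header = 1; $skipping = 0; return 1; }
        # Body lines are never touched
        return 1 unless $in_header;
        # The first blank line ends the header
        if ($line =~ /^\r?$/) { $in_header = 0; $skipping = 0; return 1; }
        # Drop the field itself ...
        if ($line =~ /^References:/i) { $skipping = 1; return 0; }
        # ... and its folded continuation lines (they start with whitespace)
        return 0 if $skipping && $line =~ /^[ \t]/;
        $skipping = 0;
        return 1;
    };
}

# Only filter when file names are given on the command line
if (@ARGV) {
    my $keep = make_references_filter();
    while (my $line = <>) {
        print $line if $keep->($line);
    }
}
```

It would be used along the lines of:
./strip_refs.pl 0912.mbox > 0912-norefs.mbox ; ./bug.pl 0912-norefs.mbox > test.txt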
I attached an example program exhibiting the bug (I added a comment before the offending
function calls):
gunzip 0912.mbox.gz ; ./bug.pl 0912.mbox > test.txt
I'd be happy to provide more information, if needed.
Kind regards,
Bram Adams
Subject: | bug.pl |
#!/usr/bin/perl
use strict;
use warnings;
my @args=@ARGV;
use Time::Local;
use Mail::Box::Manager;
#process an mbox file to reconstruct threads
my @folders=();
my $mgr = Mail::Box::Manager->new(timespan => 'EVER');
for my $arg (@args){
    print STDERR "Pushing ${arg}\n";
    push(@folders,$mgr->open(folder => $arg));
}
print STDERR "Extracting threads...\n";
my $threads = $mgr->threads(folders => \@folders);
print STDERR "Extracting done...\n";
my @sorted_threads=$threads->all;
my $i=1;
my $total_nr=$#sorted_threads+1;
foreach my $thread (@sorted_threads) {
    my $start=$thread->startTimeEstimate;
    my $start_nice=construct_date($start);
    #following three lines cause memory leak, unless References: lines are removed from mbox files!!!
    my $end=$thread->endTimeEstimate;
    my $end_nice=construct_date($end);
    my $number=$thread->numberOfMessages;
    print $thread->threadToString;
    print "\n";
    my $email=$thread->message;
    unless($email->isDummy){
        my @froms=$email->from;
        my $sender=$froms[0]->address();
        my $subject=$email->subject;
        $subject =~ s/\s+/ /g;
        $subject =~ s/^\[.+\][ :]//;
        print "${sender},\"${subject}\",${start},${end},${number}\n";
    }
}
sub construct_date{
    my ($epoch)=@_;
    my @months = ("Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec");
    # Take only the six fields that are used: sec, min, hour, mday, mon, year
    my ($sec, $min, $hour, $day,$month,$year) = (localtime($epoch))[0..5];
    # You can use 'gmtime' for GMT/UTC dates instead of 'localtime'
    # Zero-pad the time fields so that e.g. 9:05:03 is not printed as 9:5:3
    return sprintf("%s %d %02d:%02d:%02d %d",$months[$month],$day,$hour,$min,$sec,$year+1900);
}