Subject: | file_md5_hex is not memory efficient |
Date: | Fri, 09 Nov 2007 11:46:14 -0500 |
To: | bug-Digest-MD5-File [...] rt.cpan.org |
From: | Phil Durbin <pdurbin [...] sesda2.com> |
When I calculate the MD5 sum of a large file, I get Out of Memory errors
when using Digest::MD5::File::file_md5_hex. For example, with a 500GB file:
use Digest::MD5::File qw(file_md5_hex);
$file_md5=file_md5_hex('OMI-Aura_L1-OML1BRVG_2007m0929t1831-o17060_v921-2007m1106t180547.he4');
print "$file_md5 \n";
Returns :
Out of memory!
While
use Digest::MD5::File;
my $n= Digest::MD5->new();
$n->addpath('OMI-Aura_L1-OML1BRVG_2007m0929t1831-o17060_v921-2007m1106t180547.he4');
print $n->hexdigest , "\n";
returns :
4369413dc6a24bba1ea00c545f908c6a
which is the desired result .
Looking at the code in Digest::MD5::File::file_md5_hex, I see the
problem: it calculates the MD5 in a single statement, which of course
tries to load the whole file into memory (<$fh> in list context reads
every line at once):
return Digest::MD5::md5_hex(<$fh>)
I would suggest either updating the documentation to warn that it is
'slurping' the file, or simply calling the addpath method and returning
the hexdigest.
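
A minimal sketch of the second option, for illustration only. It is not
the module's actual code: the name streaming_file_md5_hex is hypothetical,
and it uses the core Digest::MD5 addfile method in place of addpath so that
it stands alone. The point is that the file is fed to the digest object
block by block and never held in memory as a whole:

```perl
use strict;
use warnings;
use Digest::MD5;

# Hypothetical helper (not part of Digest::MD5::File): hash a file by
# streaming it through a Digest::MD5 object instead of slurping it.
sub streaming_file_md5_hex {
    my ($path) = @_;

    open(my $fh, '<', $path) or die "Cannot open '$path': $!";
    binmode($fh);               # hash raw bytes, no newline translation

    my $ctx = Digest::MD5->new;
    $ctx->addfile($fh);         # reads from the handle until EOF, in blocks
    close($fh);

    return $ctx->hexdigest;
}
```

Called as streaming_file_md5_hex('some_large_file.he4'), this should give
the same digest as the addpath version above while using roughly constant
memory regardless of file size.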
--
Phil Durbin
ADNET Systems , Inc.
7515 Mission Drive, Suite A1C1
Lanham, MD 20706
301-352-4669 (phone)
301-352-0437 (fax)
pdurbin@sesda2.com