Hi Samuel,
On 22 Feb 2010, at 11:32, Samuel Mutel via RT wrote:
> I want to compress a lot of files (37011), and the total size of all
> these files is 1.2 GB.
> My script uses 90% of the memory and it crashes with the message "Out of
> memory".
>
> Here is my code :
>
> my $size = @files;
> print $size . "\n";   # prints 37011
>
> my $tar = Archive::Tar->new();
> $tar->add_files(@files);
> $tar->write($BACKUP_ARCHIVE . ".tar.bz2", COMPRESS_BZIP);
>
> Is it a bug in Archive::Tar?
> If not, how can I improve this piece of code?
Archive::Tar doesn't currently have a way to append to a tarfile;
what's happening is that you're reading all of the files into memory
when you call '$tar->add_files'. That creates at least 1.2 GB of data
in RAM, which causes the out-of-memory crash.
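In the meantime, one way to work around it is to go around Archive::Tar
entirely and let an external tar stream the files. A minimal sketch,
assuming GNU tar with bzip2 support (-j) and the -T/--files-from option
is installed (the paths go into a temporary list file so that 37011
names don't overflow the command line):

    use File::Temp qw(tempfile);

    # Write one path per line to a temporary file list.
    my ($list_fh, $list_file) = tempfile();
    print {$list_fh} "$_\n" for @files;
    close $list_fh or die "close: $!";

    # Let the external tar do the work; it streams the files and never
    # holds more than one file's worth of data in memory at a time.
    system('tar', '-cjf', "$BACKUP_ARCHIVE.tar.bz2", '-T', $list_file) == 0
        or die "tar failed: exit status $?";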
To support this use case in Archive::Tar itself, we would need to
change the module. A quick braindump for myself and my co-maintainers
on what we could do:
1) Lazy-load $at_file->data. This will only work for on-disk files,
not for items already in a tarball or data that's passed in from the
script. It would save us from loading the file contents into memory,
but not the metadata (see the rough sketch after this list).
2) Add an extra interface, along these lines:

    my $tar = A::T->new;
    my $fh  = $tar->open( 'file i want to write to' );
    $tar->append( $_ ) for @some_files;
    $tar->close;
This would create only one A::T::File object at a time, so the memory
footprint would be that of the largest file plus its metadata, and we
might even be able to shortcut on that.
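For the record, here's a rough illustration of the lazy-load idea in 1).
The package and field names below are made up for the sketch and are not
the real Archive::Tar::File internals; the point is just that the object
keeps the path and stat() metadata, and only reads the contents when
data() is first called:

    package My::LazyFile;    # hypothetical name, for illustration only

    sub new {
        my ($class, $path) = @_;
        # Keep the path and the metadata we can get cheaply from stat(),
        # but do not read the file contents yet.
        return bless { path => $path, size => -s $path }, $class;
    }

    sub data {
        my $self = shift;
        # Slurp the contents from disk the first time data() is called.
        unless (defined $self->{data}) {
            open my $fh, '<', $self->{path}
                or die "open $self->{path}: $!";
            binmode $fh;
            local $/;
            $self->{data} = <$fh>;
        }
        return $self->{data};
    }

    1;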
So, in short, there's no way to avoid this within Archive::Tar at the
moment when writing an archive, unless we make some code improvements
to the module :(
Cheers,
--
Jos Boumans
'Real programmers use "cat > a.out"'