
This queue is for tickets about the Archive-Tar CPAN distribution.

Report information
The Basics
Id: 54868
Status: open
Priority: 0/
Queue: Archive-Tar

People
Owner: Nobody in particular
Requestors: samuel.mutel [...] free.fr
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 1.54
Fixed in: (no value)



Subject: Out of memory error
I want to compress a large number of files (37011) whose total size is 1.2 GB. My script uses 90% of the memory and then crashes with the message "Out of memory".

Here is my code:

    my $size = @files;
    print $size . "\n";    # => 37011

    my $tar = Archive::Tar->new();
    $tar->add_files(@files);
    $tar->write($BACKUP_ARCHIVE . ".tar.bz2", COMPRESS_BZIP);

Is this a bug in Archive::Tar? If not, how can I improve this piece of code?
Subject: Re: [rt.cpan.org #54868] Out of memory error
Date: Mon, 22 Feb 2010 12:58:11 +0000
To: bug-Archive-Tar [...] rt.cpan.org
From: "Jos I. Boumans" <jos [...] dwim.org>
Hi Samuel,

On 22 Feb 2010, at 11:32, Samuel Mutel via RT wrote:

> I want to compress a lot of files (37011) and the size of all this files
> is 1.2 Go.
> My script use 90% of the memory and it crash with the message "Out of
> memory".
>
> Here is my code :
>
> my $size = @files;
> print $size . "\n"; => 37011
>
> my $tar = Archive::Tar->new();
> $tar->add_files(@files);
> $tar->write($BACKUP_ARCHIVE . ".tar.bz2", COMPRESS_BZIP);
>
> Is-it a bug of Archive:Tar ?
> If not how can I improve this piece of code ?
Archive::Tar doesn't have a way to 'append to a tarfile' currently; what's happening is that you're reading all the files into memory when you call '$tar->add_files'. This will create at least 1.2 GB of data in RAM, and cause the OOM to happen.

To support this use case, we would need to change Archive::Tar. A quick braindump here for myself and my comaintainers on what we could do:

1) lazy load $at_file->data. This will only work for on-disk files, not for items in a tarball or data that's passed in via the script. This would save us loading the file contents into memory, but not the metadata.

2) add an extra interface, along these lines:

    $tar = A::T->new
    $fh = $tar->open( 'file i want to write to' )
    $tar->append( $_ ) for @some_files;
    $tar->close;

This would create only 1 A::T::File object and the memory footprint would be that of the biggest file + metadata, and we might even be able to shortcut on that.

So in short, there's no way to avoid this at the moment while writing an archive unless we make some code improvements to Archive::Tar :(

Cheers,

--
Jos Boumans

'Real programmers use "cat > a.out"'
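
Idea 1 in the reply above (lazy loading of file data) can be illustrated in isolation. The following is only a minimal sketch under stated assumptions: `LazyFile` and `total_bytes` are hypothetical names invented for this example, they are not part of Archive::Tar or Archive::Tar::File, and nothing here changes how `add_files` behaves. It shows only the general pattern of keeping per-file metadata in the object and reading the contents from disk when they are asked for.

```perl
use strict;
use warnings;

# Hypothetical class for illustration; this is NOT Archive::Tar::File.
package LazyFile;

sub new {
    my ($class, $path) = @_;
    die "No such file: $path" unless -f $path;
    # Keep only cheap metadata in the object; the contents stay on disk.
    return bless { path => $path, size => -s $path }, $class;
}

sub path { $_[0]{path} }
sub size { $_[0]{size} }

# Read the contents on demand and return them without caching, so the
# bytes can be freed as soon as the caller is done with them.
sub data {
    my ($self) = @_;
    open(my $fh, '<:raw', $self->{path})
        or die "Cannot open $self->{path}: $!";
    local $/;                 # slurp mode: read the whole file at once
    my $data = <$fh>;
    close($fh);
    return defined $data ? $data : '';
}

package main;

# Hypothetical helper: many objects cost only metadata, and at most one
# file's contents is held in memory at a time while iterating.
sub total_bytes {
    my (@paths) = @_;
    my $total = 0;
    for my $file (map { LazyFile->new($_) } @paths) {
        $total += length $file->data;
    }
    return $total;
}
```

With this pattern, peak memory is roughly the metadata for all files plus the contents of the largest single file, which matches the footprint described for the proposed interface. As the reply notes, it can only work for files that exist on disk.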