
This queue is for tickets about the Archive-Zip CPAN distribution.

Report information
The Basics
Id: 12184
Status: open
Priority: 0/
Queue: Archive-Zip

People
Owner: Nobody in particular
Requestors: 3rkk-ufis [...] spamex.com
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 1.15_02
Fixed in: (no value)



Subject: 1.15_02 unnecessarily creates file handles for directory members
Distribution name and version: Archive-Zip-1.15_02
perl -v: v5.8.5 built for i386-linux-thread-multi
uname -a: Linux ingrid.hq.netapp.com 2.6.9-1.667 #1 Tue Nov 2 14:41:25 EST 2004 i686 i686 i386 GNU/Linux

When a zip archive is read by readFromFileHandle, a new file handle is created for every directory member. Here's the flow:

Archive::Zip::Archive::readFromFileHandle contains

    $status = $newMember->endRead();
    return $status if $status != AZ_OK;
    $newMember->_becomeDirectoryIfNecessary();

endRead() resolves to Archive::Zip::FileMember::endRead
This method contains

    undef $self->{'fh'};    # _closeFile();

thus dropping the reference to the file handle.

readFromFileHandle then executes

    $newMember->_becomeDirectoryIfNecessary();

_becomeDirectoryIfNecessary() resolves to Archive::Zip::Member::_becomeDirectoryIfNecessary
This method contains

    $self->_become(DIRECTORYMEMBERCLASS) if $self->isDirectory();

For a directory member, _become() is called. This resolves to Archive::Zip::ZipFileMember::_become
This method contains

    if ( _isSeekable( $self->fh() ) )

The fh() call resolves to Archive::Zip::FileMember::fh
This method contains

    $self->_openFile() if !defined( $self->{'fh'} ) || !$self->{'fh'}->opened();

Since $self->{'fh'} is not defined, having been undefined back in endRead(), _openFile() is called. This resolves to Archive::Zip::FileMember::_openFile
This method contains

    my ( $status, $fh ) = _newFileHandle( $self->externalFileName(), 'r' );

which creates a new file handle.

For my application, this is not just an inefficiency. I'm passing my own file handle object to readFromFileHandle instead of the usual IO::File object. I want my file handle object used for all I/O, not replaced by an IO::File object.

I used the attached patch to get around the problem, but there may well be a better solution.
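The call chain above can be modelled without Archive::Zip installed. The following is only a minimal sketch under stated assumptions: Sketch::Member and its methods are hypothetical stand-ins that mirror the names in the report, not the distribution's real classes, and an in-memory handle takes the place of the zip file. It shows the caller-supplied handle being dropped in endRead() and then silently replaced by the lazy fh() accessor.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a zip member; NOT the real Archive::Zip classes.
package Sketch::Member;

my $opens = 0;    # how many times the lazy accessor had to open a handle

sub new {
    my ( $class, %args ) = @_;
    return bless { fh => $args{fh}, isDir => $args{isDir} }, $class;
}

# Mirrors the reported endRead(): the handle reference is dropped.
sub endRead {
    my $self = shift;
    undef $self->{fh};
    return 0;
}

# Mirrors the reported fh(): reopen lazily when no handle is held.
sub fh {
    my $self = shift;
    $self->_openFile() if !defined $self->{fh};
    return $self->{fh};
}

# Stand-in for _openFile(): opens an in-memory handle, not a real file.
sub _openFile {
    my $self = shift;
    my $data = '';
    open( my $fh, '<', \$data ) or die "open failed: $!";
    $opens++;
    $self->{fh} = $fh;
    return;
}

# Mirrors _becomeDirectoryIfNecessary(): only directory members touch fh().
sub becomeDirectoryIfNecessary {
    my $self = shift;
    $self->fh() if $self->{isDir};
    return;
}

sub opens { return $opens }

package main;

use Scalar::Util qw(refaddr);

# The handle the caller wants used for all I/O.
my $callerData = '';
open( my $callerFh, '<', \$callerData ) or die "open failed: $!";

my $member = Sketch::Member->new( fh => $callerFh, isDir => 1 );
$member->endRead();
$member->becomeDirectoryIfNecessary();

# True when the member now holds a different handle than the caller's.
my $replaced = refaddr( $member->fh() ) != refaddr($callerFh) ? 1 : 0;
print "opens=", Sketch::Member::opens(), " replaced=$replaced\n";
```

In this model a non-directory member (isDir => 0) never calls fh() after endRead(), so no extra handle is opened for it.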
Download 062
application/octet-stream 437b

Message body not shown because it is not plain text.

Trying to clean up some RT tickets here. Is this still an issue? Does the latest revision fix the problem?
Subject: Re: [rt.cpan.org #12184] 1.15_02 unnecessarily creates file handles for directory members
Date: Thu, 26 Apr 2012 07:14:16 -0700
To: bug-Archive-Zip [...] rt.cpan.org
From: Howard Gayle <3rkk-ufis [...] spamex.com>
On Thu, 19 Apr 2012 11:14:06 -0400, Brendan Byrd via RT wrote:
> <URL: https://rt.cpan.org/Ticket/Display.html?id=12184 >
>
> Trying to clean up some RT tickets here. Is this still an issue? Does
> the latest revision fix the problem?
I no longer have access to the software using this package.
From: bergner [...] cs.umu.se
On Thu Apr 19 11:14:05 2012, BBYRD wrote:
> Trying to clean up some RT tickets here. Is this still an issue? Does
> the latest revision fix the problem?
I can confirm that both version 1.30 and 1.31_03 exhibit this behaviour. Easy to reproduce.

I've not been digging very deep on this, but based on the tests I've done I haven't seen an obvious need for the undef $self->{'fh'} in Archive::Zip::FileMember's endRead.

Here is a trivial example that shows the large number of open() calls, using a zip file of approximately 75 MB with 543 directories in it:

    $ wget http://google-web-toolkit.googlecode.com/files/gwt-2.3.0.zip
    $ strace -e open /usr/bin/perl -e 'use Archive::Zip; $a=Archive::Zip->new("gwt-2.3.0.zip"); @m=$a->members();' 2>&1 | grep gwt-2.3.0.zip | wc -l
    544

Commenting out the undef $self->{'fh'} in FileMember's endRead yields "1" instead of "544" (with the same set of members being returned).

Every time the file is opened, various data is read from it as well, so this is not just redundant calls to open. That behaviour can easily be seen by using strace -e open,read instead.

Kind regards,
Marcus
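The relationship between the two counts (one open for the archive plus one per directory member) can be checked with a small model. This is a sketch only: countOpens is a hypothetical helper that simulates the handle bookkeeping with plain scalars and does not call Archive::Zip or measure real system calls.

```perl
use strict;
use warnings;

# Count simulated open() calls when reading an archive that contains $dirs
# directory members. $dropHandle models the "undef $self->{'fh'}" line in
# endRead(); pass 0 to model that line being commented out.
sub countOpens {
    my ( $dirs, $dropHandle ) = @_;
    my $opens  = 1;                   # the archive itself is opened once
    my $shared = 'archive handle';    # placeholder for the shared handle
    for my $i ( 1 .. $dirs ) {
        my $fh = $shared;             # member starts with the archive handle
        undef $fh if $dropHandle;     # endRead() drops the reference
        if ( !defined $fh ) {         # fh() then has to reopen the file
            $fh = "reopened #$i";
            $opens++;
        }
    }
    return $opens;
}

printf "with undef:    %d\n", countOpens( 543, 1 );
printf "without undef: %d\n", countOpens( 543, 0 );
```

With 543 directories the model gives 544 opens when the handle is dropped and 1 when it is kept, which agrees with the strace counts reported above.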