
This queue is for tickets about the Archive-Zip CPAN distribution.

Report information
The Basics
Id: 68446
Status: open
Worked: 2 hours (120 min)
Priority: 0/
Queue: Archive-Zip

People
Owner: Nobody in particular
Requestors: dwheeler [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Junk Added to Empty Files
Date: Tue, 24 May 2011 10:18:15 -0700
To: bug-archive-zip [...] rt.cpan.org
From: "David E. Wheeler" <dwheeler [...] cpan.org>
The attached zip file was created for a sample Python project using this command:

    python setup.py sdist --format=gztar,zip

It includes a file, __init__.py, of size zero. Using UnZip 5.52 on Mac OS X, it works great:

    % unzip python.zip
    Archive: python.zip
    inflating: Foo-0.2.0/setup.cfg
    inflating: Foo-0.2.0/PKG-INFO
    inflating: Foo-0.2.0/setup.py
    inflating: Foo-0.2.0/bar/__init__.py
    inflating: Foo-0.2.0/bar/baz.py
    inflating: Foo-0.2.0/Foo.egg-info/SOURCES.txt
    inflating: Foo-0.2.0/Foo.egg-info/PKG-INFO
    inflating: Foo-0.2.0/Foo.egg-info/requires.txt
    inflating: Foo-0.2.0/Foo.egg-info/dependency_links.txt
    inflating: Foo-0.2.0/Foo.egg-info/top_level.txt

Now let's let Archive::Zip rewrite the zip file:

    perl -MArchive::Zip -E 'my $z = Archive::Zip->new; $z->read(shift); $z->writeToFileNamed("out.zip");' python.zip

All we're doing is reading the zip file in and writing it out again. But that breaks the empty file:

    unzip out.zip
    Archive: out.zip
    inflating: Foo-0.2.0/setup.cfg
    inflating: Foo-0.2.0/PKG-INFO
    inflating: Foo-0.2.0/setup.py
    Foo-0.2.0/bar/__init__.py: ucsize 0 <> csize 2 for STORED entry
    continuing with "compressed" size value
    extracting: Foo-0.2.0/bar/__init__.py bad CRC 1a6cd7b3 (should be 00000000)
    inflating: Foo-0.2.0/bar/baz.py
    inflating: Foo-0.2.0/Foo.egg-info/SOURCES.txt
    inflating: Foo-0.2.0/Foo.egg-info/PKG-INFO
    inflating: Foo-0.2.0/Foo.egg-info/requires.txt
    inflating: Foo-0.2.0/Foo.egg-info/dependency_links.txt
    inflating: Foo-0.2.0/Foo.egg-info/top_level.txt

And sure enough, __init__.py is now two bytes long:

    Foo-0.2.0/bar/__init__.py PK

So Archive::Zip is shoving that "PK" in there. It ought not to do that, I think. I'll see if I can figure out why it's doing this and submit a patch.

Best,
David
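A rough way to look for the condition unzip complains about ("ucsize 0 <> csize 2 for STORED entry") without extracting anything is to compare the sizes recorded in the archive's central directory. The sketch below uses Python's standard zipfile module. The function name stored_size_mismatches and the in-memory sample archive are illustrative assumptions; neither comes from Archive::Zip or from the ticket's attachments.

```python
import io
import zipfile


def stored_size_mismatches(archive):
    """Return (name, compressed, uncompressed) for STORED entries whose sizes differ.

    `archive` may be a path or a seekable file-like object. Only the central
    directory is consulted; no member data is read or extracted.
    """
    with zipfile.ZipFile(archive) as zf:
        return [
            (info.filename, info.compress_size, info.file_size)
            for info in zf.infolist()
            if info.compress_type == zipfile.ZIP_STORED
            and info.compress_size != info.file_size
        ]


if __name__ == "__main__":
    # Stand-in archive built in memory (assumption: it only mimics the layout
    # described in the report; it is not the ticket's python.zip or out.zip).
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        zf.writestr("Foo-0.2.0/bar/__init__.py", b"")
        zf.writestr("Foo-0.2.0/bar/baz.py", b"print('baz')\n")
    buf.seek(0)
    print(stored_size_mismatches(buf))
```

Pointed at the out.zip from the report, a check along these lines would presumably list Foo-0.2.0/bar/__init__.py with sizes 2 and 0, though that has not been verified here.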
Download python.zip
application/zip 2.8k


Here's a test case; add the `python.zip` file provided above as `t/data/python.zip`.

"PK" gets written to the file by this bit of code in _writeLocalFileHeader() in Member.pm:

    my $signatureData = pack( SIGNATURE_FORMAT, LOCAL_FILE_HEADER_SIGNATURE );
    $self->_print($fh, $signatureData)
        or return _ioError("writing local header signature");

Curiously, if the file is read back in by Archive::Zip, __init__.py is properly 0 K. It's only when it's unzipped by the system `unzip` that it's 2 bytes. So somehow the file header generated by Archive::Zip is not compatible with other zip implementations. Some sort of difference in the implementation of the spec?

Anyway, if someone can figure out a better test than using the system `unzip`, that would be a more robust test case for distribution.

Thanks,
David
Subject: 15_bug_68446.t
#!/usr/bin/perl

use strict;

BEGIN {
    $|  = 1;
    $^W = 1;
}

use Archive::Zip qw( :ERROR_CODES );
use Test::More tests => 8;

my $zip = Archive::Zip->new();
isa_ok( $zip, 'Archive::Zip' );
is( $zip->read('t/data/python.zip'), AZ_OK, 'Read file' );
is( $zip->extractTree( undef, 'extracted/python' ), AZ_OK, 'Extracted archive' );
ok( -d 'extracted/python/Foo-0.2.0', 'Checked directory' );
is( -s 'extracted/python/Foo-0.2.0/bar/__init__.py', 0, 'Checked empty file');

# Now rebuild it.
is( $zip->writeToFileNamed('extracted/new_python.zip'), AZ_OK, 'Write file');

# Use system unzip to unzip it. The test does not fail when we use
# Archive::Zip to unzip it.
system qw(unzip -qqod extracted/new_python extracted/new_python.zip);
# is( $zip->read('extracted/new_python.zip'), AZ_OK, 'Read new file' );
# is( $zip->extractTree( undef, 'extracted/new_python' ), AZ_OK, 'Extracted archive' );

# __init__.py should be 0 bytes.
ok( -d 'extracted/new_python/Foo-0.2.0', 'Checked directory' );
is( -s 'extracted/new_python/Foo-0.2.0/bar/__init__.py', 0, 'Checked empty file');
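The "PK" that ends up inside __init__.py matches the first two bytes of the local file header signature that the quoted Member.pm code writes. To see what a given entry's header actually records, the fixed 30-byte part of a local file header can be decoded by hand. This is a minimal sketch in Python, assuming the standard ZIP layout; read_local_header and the sample archive are hypothetical names and data for illustration, not anything from Archive::Zip.

```python
import io
import struct
import zipfile

# Fixed part of a ZIP local file header, little-endian: signature, version,
# flags, method, mod time, mod date, CRC-32, compressed size, uncompressed
# size, file name length, extra field length (30 bytes in total).
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")
LOCAL_SIGNATURE = b"PK\x03\x04"


def read_local_header(data, offset=0):
    """Decode the local file header found at `offset` in the raw archive bytes."""
    (sig, _version, _flags, method, _time, _date,
     crc32, csize, usize, name_len, extra_len) = LOCAL_HEADER.unpack_from(data, offset)
    if sig != LOCAL_SIGNATURE:
        raise ValueError("no local file header at offset %d" % offset)
    name_start = offset + LOCAL_HEADER.size
    name = data[name_start:name_start + name_len].decode("utf-8", "replace")
    return {
        "name": name,
        "method": method,  # 0 = stored, 8 = deflated
        "crc32": crc32,
        # Note: when general-purpose flag bit 3 is set, the sizes here may be
        # zero and the real values follow the data in a data descriptor.
        "compressed_size": csize,
        "uncompressed_size": usize,
        "data_offset": name_start + name_len + extra_len,
    }


if __name__ == "__main__":
    # Stand-in archive (assumption: not the ticket's python.zip).
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        zf.writestr("Foo-0.2.0/bar/__init__.py", b"")
        zf.writestr("Foo-0.2.0/bar/baz.py", b"print('baz')\n")
    raw = buf.getvalue()
    with zipfile.ZipFile(io.BytesIO(raw)) as zf:
        for info in zf.infolist():
            print(read_local_header(raw, info.header_offset))
```

Comparing compressed_size with the distance from data_offset to the next entry's signature is one way to spot a reader running past an entry's data and into the following header.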
Subject: Re: [rt.cpan.org #68446] AutoReply: Junk Added to Empty Files
Date: Sun, 12 Jun 2011 21:08:10 -0700
To: bug-Archive-Zip [...] rt.cpan.org
From: "David E. Wheeler" <dwheeler [...] cpan.org>
On May 24, 2011, at 10:18 AM, Bugs in Archive-Zip via RT wrote:
> So Archive::Zip is shoving that "PK" in there. It ought not to do that, I think. I'll see if I can figure out why it's doing this and submit a patch.
Man, I couldn't make heads or tails of it. I got as far as figuring out that "PK" seems to be some kind of header, but not why it appears in the body of the document rather than remaining a header. Anyone else have an idea? Best, David
Hello David,

Sorry to reply late. Saw the ticket today. Thank you for the bug report as well as for the tests.

Analysing the hexdump and Zip-Parser's (https://github.com/alanhaggai/Zip-Parser) output, it seems that the archive python.zip is corrupted. In python.zip's local file header and central directory file header for Foo-0.2.0/bar/__init__.py, compressed size is set to 2 (the compressed data being 0x03 and 0x00). Uncompressed size is set to 0!

In this case, on encountering a compressed size of 2, Archive::Zip reads two bytes off Foo-0.2.0/bar/baz.py's local file header (which are 'P' and 'K'; the first two bytes of any .ZIP entry's local file header signature).

In such cases, Archive::Zip should emulate Info-ZIP's behaviour (which is not the case now). Info-ZIP's unzip possibly does a check to see if uncompressed size is 0, and if so, it does not honour compressed size. Will add that functionality to Archive::Zip soon.

Zip-Parser's output for __init__.py has been attached. Furthermore, I was able to reproduce the problem in Gentoo Linux. It is most certainly a bug in Python's setuptools' .ZIP archive creation. This has been verified using Ark (KDE archiving tool). A screenshot has been attached with the anomaly highlighted in yellow.

Regards,
Alan Haggai Alavi.
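The two payload bytes quoted above (0x03 and 0x00) can be examined on their own. The sketch below, using Python's standard zlib module, treats them as a raw deflate stream and inflates them. It does not settle which tool is at fault; it only shows what those two bytes decode to if the entry's method is deflate. The helper name inflate_raw is an assumption made for illustration.

```python
import zlib


def inflate_raw(payload):
    """Inflate a raw deflate stream (no zlib or gzip wrapper).

    wbits=-15 selects raw deflate, the form used for method-8 entries
    inside a ZIP archive.
    """
    return zlib.decompress(payload, -15)


if __name__ == "__main__":
    # The two bytes reported as the compressed data of the empty __init__.py.
    payload = bytes([0x03, 0x00])
    inflated = inflate_raw(payload)
    print(len(inflated))                         # prints 0
    print(format(zlib.crc32(inflated), "08x"))   # prints 00000000
```

Read this way, a compressed size of 2 alongside an uncompressed size of 0 and a CRC of 00000000 is at least self-consistent for a deflated entry, which may be relevant to how a reader should treat zero-length members.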
Subject: ark.png
Download ark.png
image/png 66.7k
Subject: zip_parser.out
Download zip_parser.out
application/octet-stream 1.4k


CC: SMPETERS [...] cpan.org
Subject: Re: [rt.cpan.org #68446] Junk Added to Empty Files
Date: Thu, 13 Oct 2011 09:28:21 -0700
To: bug-Archive-Zip [...] rt.cpan.org
From: "David E. Wheeler" <dwheeler [...] cpan.org>
On Oct 13, 2011, at 8:43 AM, Alan Haggai Alavi via RT wrote:
> Zip-Parser's output for __init__.py has been attached. Furthermore, I
> was able to reproduce the problem in Gentoo Linux. It is most certainly
> a bug in Python's setuptools' .ZIP archive creation. This has been
> verified using Ark (KDE archiving tool). A screenshot has been attached
> with the anomaly highlighted in yellow.

1. Can you report this to them?
2. Can you work around it (the way the zip command-line tool somehow does)?

Thanks,
David
Hello David,

On Thu Oct 13 21:58:34 2011, DWHEELER wrote:
> 1. Can you report this to them?

Will do so.

> 2. Can you work around it (the way the zip command-line tool somehow
> does)?

The fix has been made. Latest code is available from:
http://svn.ali.as/cpan/trunk/Archive-Zip/.

Regards,
Alan Haggai Alavi.
--
The difference makes the difference
CC: SMPETERS [...] cpan.org
Subject: Re: [rt.cpan.org #68446] Junk Added to Empty Files
Date: Fri, 14 Oct 2011 09:02:50 -0700
To: bug-Archive-Zip [...] rt.cpan.org
From: "David E. Wheeler" <dwheeler [...] cpan.org>
On Oct 14, 2011, at 7:53 AM, Alan Haggai Alavi via RT wrote:
> On Thu Oct 13 21:58:34 2011, DWHEELER wrote:
>> 1. Can you report this to them?
>
> Will do so.

Alan++

>> 2. Can you work around it (the way the zip command-line tool somehow
>> does)?
>
> The fix has been made. Latest code is available from:
> http://svn.ali.as/cpan/trunk/Archive-Zip/.

Oh, awesome, thanks! When are we likely to see a new release?

Best,
David
Hello David,

On Fri Oct 14 21:33:05 2011, DWHEELER wrote:
> Alan++

:-) I have just reported the bug at The Distutils-Sig mailing list and have CC-ed the message to you.

> Oh, awesome, thanks! When are we likely to see a new release?

As Archive::Zip is released by Adam Kennedy, I am not sure when a new release will happen. He often releases soon. I have been trying to contact him but it seems like he is away.

Regards,
Alan Haggai Alavi.
--
The difference makes the difference
CC: SMPETERS [...] cpan.org
Subject: Re: [rt.cpan.org #68446] Junk Added to Empty Files
Date: Mon, 17 Oct 2011 10:31:31 -0700
To: bug-Archive-Zip [...] rt.cpan.org
From: "David E. Wheeler" <dwheeler [...] cpan.org>
On Oct 17, 2011, at 10:29 AM, Alan Haggai Alavi via RT wrote:
> I have just reported the bug at The Distutils-Sig mailing list and have
> CC-ed the message to you.

I saw, thank you.

> As Archive::Zip is released by Adam Kennedy, I am not sure when a new
> release will happen. He often releases soon. I have been trying to
> contact him but it seems like he is away.

Great, thanks. I'll bug him, too, if I see him on IRC.

D