
This queue is for tickets about the ExtUtils-MakeMaker CPAN distribution.

Report information
The Basics
Id: 53714
Status: rejected
Priority: 0/
Queue: ExtUtils-MakeMaker

People
Owner: Nobody in particular
Requestors: jjore [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: BSD tarballs unreadable by GNU tar 1.19 or less
Hi,

Discovered this on Mac OS X 10.6.2. It uses BSD tar of libarchive-2.6.2, which adds in some headers that are allowed per POSIX but that GNU tar 1.19 or earlier can't handle. At the moment, GNU tar 1.20+ isn't widespread. EU-MM should avoid using BSD tar.

Headers added:

LIBARCHIVE.creationtime
SCHILY.acl.access
SCHILY.acl.default
SCHILY.dev
SCHILY.devmajor
SCHILY.devminor
SCHILY.fflags
SCHILY.ino
SCHILY.nlink
SCHILY.nlinks
SCHILY.realsize

Suggested action: For the moment, use the executable gnutar instead of just plain tar when on the darwin platform. There ought to be a more thorough check of this to handle other BSD but non-darwin platforms.
On Fri Jan 15 00:25:14 2010, JJORE wrote:
> Discovered this on Mac OS X 10.6.2. It uses BSD tar of libarchive-2.6.2
> which adds in some headers that are allowed per POSIX but GNU tar 1.19
> or earlier can't handle. At the moment, GNU tar 1.20+ isn't widespread.
> EU-MM should avoid using BSD tar.
I can't reproduce this on OS X 10.6.2 with bsdtar 2.6.2 and gnutar 1.17. Is there something special about the files being archived?

$ mkdir test
$ cd test
$ touch foo
$ touch bar
$ touch baz
$ cd ..
$ bsdtar -czvf test.tgz test
a test
a test/bar
a test/baz
a test/foo
$ gnutar -tzvf test.tgz
drwxrwxr-x schwern/schwern 0 2010-01-18 14:50 test/
-rw-rw-r-- schwern/schwern 0 2010-01-18 14:50 test/bar
-rw-rw-r-- schwern/schwern 0 2010-01-18 14:50 test/baz
-rw-rw-r-- schwern/schwern 0 2010-01-18 14:50 test/foo
$ gnutar -xzvf test.tgz
test/
test/bar
test/baz
test/foo
$ gnutar --version
tar (GNU tar) 1.17
Copyright (C) 2007 Free Software Foundation, Inc.
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Modified to support extended attributes.
Written by John Gilmore and Jay Fenlason.
$ bsdtar --version
bsdtar 2.6.2 - libarchive 2.6.2
Found some info on it.
http://old.nabble.com/bug-in-BSD-tar--td10844273.html

* It would seem to only affect filenames with non-7-bit ASCII characters, so the scope of this bug is rather restricted.

* gnutar 1.17 can handle it, it just throws a warning

$ gnutar -xzvf test.tgz
test/
test/bar
test/baz
test/foo
gnutar: Ignoring unknown extended header keyword `SCHILY.dev'
gnutar: Ignoring unknown extended header keyword `SCHILY.ino'
gnutar: Ignoring unknown extended header keyword `SCHILY.nlink'
test/Føø

Have you seen this problem in the wild?
On Mon Jan 18 18:01:54 2010, MSCHWERN wrote:
> Found some info on it.
> http://old.nabble.com/bug-in-BSD-tar--td10844273.html
>
> * It would seem to only affect filenames with non-7-bit ASCII characters,
> so the scope of this bug is rather restricted.
>
> * gnutar 1.17 can handle it, it just throws a warning
No, standard GNU tar 1.19 and lower do not handle it. Andreas had to upgrade PAUSE servers recently from GNU tar 1.16 to 1.20 to be able to unpack uploaded tarballs. It throws the SCHILY and LIBARCHIVE warnings and then exits non-zero.

Perhaps your test used the Mac-provided gnutar? I used the following snippet to find that it is apparently just my Snow Leopard machine. When I create a tarball, it's using libarchive 2.6.2 which, when I check the source, definitely adds the "bad" headers. When I build the GNU tars, nothing earlier than 1.20 can handle the BSD tarballs.

$ find . \( -name '*.tar.gz' -o -name '*.tgz' \) -type f -print0 | xargs -0 zegrep '(SCHILY|LIBARCHIVE)' > bsdtar.txt 2>&1

./authors/id/J/JJ/JJORE/B-Utils-0.10.tar.gz:Binary file (standard input) matches
./authors/id/J/JJ/JJORE/Term-HiliteDiff-0.09.tar.gz:Binary file (standard input) matches
./authors/id/J/JJ/JJORE/B-Lint-StrictOO-0.03.tar.gz:Binary file (standard input) matches
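
The scan above can be narrowed to a single archive. The following is a minimal sketch, assuming gzip and grep are available; the function name has_bsdtar_keywords is hypothetical. Like the zegrep one-liner, it matches the keyword strings anywhere in the decompressed stream, so an archived file whose contents happen to mention SCHILY would also match.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the gzip-compressed file named by $1
# contains the strings SCHILY or LIBARCHIVE anywhere in its decompressed data.
has_bsdtar_keywords() {
    gzip -dc "$1" 2>/dev/null | LC_ALL=C grep -q -E 'SCHILY|LIBARCHIVE'
}

# Report on each archive named on the command line.
for archive in "$@"; do
    if has_bsdtar_keywords "$archive"; then
        echo "$archive: extended header keywords found"
    else
        echo "$archive: clean"
    fi
done
```

A match only says the keywords are present. Whether a given GNU tar version rejects the archive still has to be tested with that version.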
On Wed Jan 20 03:31:03 2010, JJORE wrote:
> On Mon Jan 18 18:01:54 2010, MSCHWERN wrote:
> > Found some info on it.
> > http://old.nabble.com/bug-in-BSD-tar--td10844273.html
> >
> > * It would seem to only affect filenames with non-7-bit ASCII characters,
> > so the scope of this bug is rather restricted.
> >
> > * gnutar 1.17 can handle it, it just throws a warning
>
> No, standard GNU tar 1.19 and lower do not handle it. Andreas had to
> upgrade PAUSE servers recently from GNU tar 1.16 to 1.20 to be able to
> unpack uploaded tarballs. It throws the SCHILY and LIBARCHIVE warnings
> and then exits non-zero.
>
> Perhaps your test used the Mac-provided gnutar? I used the following
> snippet to find that it is apparently just my Snow Leopard machine. When
> I create a tarball, it's using libarchive 2.6.2 which, when I check the
> source, definitely adds the "bad" headers. When I build the GNU tars,
> nothing earlier than 1.20 can handle the BSD tarballs.
>
> $ find . \( -name '*.tar.gz' -o -name '*.tgz' \) -type f -print0 | xargs -0 zegrep '(SCHILY|LIBARCHIVE)' > bsdtar.txt 2>&1
>
> ./authors/id/J/JJ/JJORE/B-Utils-0.10.tar.gz:Binary file (standard input) matches
> ./authors/id/J/JJ/JJORE/Term-HiliteDiff-0.09.tar.gz:Binary file (standard input) matches
> ./authors/id/J/JJ/JJORE/B-Lint-StrictOO-0.03.tar.gz:Binary file (standard input) matches
Aha! The ustar format seems to have some fixed width fields (http://en.wikipedia.org/wiki/Tar_%28file_format%29#Format_details). When a darwin user or group id is greater than 0777777 octal, /usr/bin/tar starts creating the extra headers. I reproduced both success and failure modes locally by creating a user with uid = 2**18 - 1 and uid = 2**18.

I'd noticed separately while still chatting to #toolchain that using the -o flag as recommended in https://rt.cpan.org/Ticket/Display.html?id=53403 produced the following warnings.

W3M211a:~ jbenjore$ tar -o -czf B-Utils-0.10.tar.gz B-Utils-0.10
tar: B-Utils-0.10/: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/build/: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/BUtils.h: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/BUtils_op.h: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/Changes: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/lib/: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/Makefile.PL: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/MANIFEST: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/META.yml: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/OP.xs: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/ppport.h: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/README: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/t/: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/typemap: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/Utils.xs: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/xt/: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/xt/pod.t: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/xt/signature.t: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/t/10use.t: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/t/11export.t: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/t/utils/: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/t/utils/20all_starts.t: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/t/utils/21all_roots.t: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/t/utils/22anon_subs.t: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/t/utils/30parent.t: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/t/utils/31oldname.t: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/t/utils/32kids.t: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/t/utils/33ancestors.t: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/t/utils/34descendants.t: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/t/utils/35siblings.t: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/t/utils/36previous.t: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/t/utils/37stringify.t: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/t/utils/40walk.t: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/t/utils/41walkfilt.t: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/t/utils/42all.t: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/t/utils/43allfilt.t: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/t/utils/44optrep.t: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/t/utils/50carp.t: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/t/utils/51croak.t: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/lib/B/: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/lib/B/Utils/: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/lib/B/Utils.pm: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/lib/B/Utils/OP.pm: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large
tar: B-Utils-0.10/build/IFiles.pm: Numeric user ID too large: Result too largeNumeric group ID too large: Result too large

Further, there were no files in the created tarball. It seems that any file which provokes this kind of warning is just omitted from the tarball.

W3M211a:~ jbenjore$ tar tzf B-Utils-0.10.tar.gz
W3M211a:~ jbenjore$
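
The threshold described above can be checked before building a distribution. The following is a minimal sketch, assuming the limit is the 0777777 octal value observed in this ticket (2**18 - 1 = 262143). The function name id_fits_ustar is hypothetical and not part of any tar implementation.

```shell
#!/bin/sh
# Largest numeric ID that did not trigger the extra headers in the test above:
# 0777777 octal = 2**18 - 1 = 262143 (an assumption taken from this ticket).
USTAR_ID_MAX=262143

# Hypothetical helper: succeeds if the numeric ID in $1 is within the limit.
id_fits_ustar() {
    [ "$1" -le "$USTAR_ID_MAX" ]
}

# Report whether the current user's uid and gid are within the limit.
for id in "$(id -u)" "$(id -g)"; do
    if id_fits_ustar "$id"; then
        echo "$id: fits"
    else
        echo "$id: too large, expect extended headers from bsdtar"
    fi
done
```

With the two test users from the message above, uid 262143 would be reported as fitting and uid 262144 as too large.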
On Mon Jan 18 18:01:54 2010, MSCHWERN wrote:
> Found some info on it.
> http://old.nabble.com/bug-in-BSD-tar--td10844273.html
>
> * It would seem to only affect filenames with non-7-bit ASCII characters,
> so the scope of this bug is rather restricted.
>
> * gnutar 1.17 can handle it, it just throws a warning
>
> $ gnutar -xzvf test.tgz
> test/
> test/bar
> test/baz
> test/foo
> gnutar: Ignoring unknown extended header keyword `SCHILY.dev'
> gnutar: Ignoring unknown extended header keyword `SCHILY.ino'
> gnutar: Ignoring unknown extended header keyword `SCHILY.nlink'
> test/Føø
>
> Have you seen this problem in the wild?
Yes. I uploaded several tarballs from the Macbook over Christmas and deleted the prior versions. I expected the newest versions to supersede them and assumed backpan would be enough for anyone who wanted an older version. Unfortunately, the PAUSE indexer was using GNU tar 1.16 and it blew up when it tried to index the modules. I received no email initially about either success or failure in indexing the module.

A month later, the #moose IRC channel summoned me to ask why B::Utils had disappeared from CPAN. I shrugged it off and asked PAUSE to reindex it. One week later, I noticed it still hadn't shown up. This is when I found out that the tarball was valid on my system but not on others. I used my Macbook's local /usr/bin/gnutar to create B-Utils-0.11.tar.gz and that worked great.

The summary is that I wandered into this tarpit entirely by accident. I might be one of the very, very few who have user ids that are numerically large. Mine is provisioned by Active Directory and I don't know why the number is what it is. It might just be auto-generated by a Microsoft tool.
Subject: Re: [rt.cpan.org #53714] BSD tarballs unreadable by GNU tar 1.19 or less
Date: Fri, 22 Jan 2010 13:11:54 -0800
To: bug-ExtUtils-MakeMaker [...] rt.cpan.org
From: Michael G Schwern <schwern [...] pobox.com>
BSD and GNU have a tiff and we're left cleaning up the mess. Yay. I've read some of the commentary on this on the BSD lists and their response has been "it's in the standard, GNU tar is broken!" Good for the standard.

Anyhow, my concern now is putting in GNU tar detection code. The simple patch would be to change MM_Darwin to use gnutar instead of tar. That assumes OS X has and will always have a program called gnutar.

That only fixes OS X, leaving all the BSDs still pumping out standard-compliant but broken tarballs. The more general problem is detecting if the user has GNU tar, poking through the PATH to find gnutar or gtar and hoping it's usable.

In any other piece of software that would be simple. In MakeMaker it could break all sorts of things. I'm hesitant.

--
I am somewhat preoccupied telling the laws of physics to shut up and sit down.
    -- Vaarsuvius, "Order of the Stick"
       http://www.giantitp.com/comics/oots0107.html
CC: jjore [...] cpan.org
Subject: Re: [rt.cpan.org #53714] BSD tarballs unreadable by GNU tar 1.19 or less
Date: Fri, 22 Jan 2010 14:43:42 -0800
To: bug-ExtUtils-MakeMaker [...] rt.cpan.org
From: Joshua ben Jore <twists [...] gmail.com>
On Fri, Jan 22, 2010 at 1:12 PM, Michael G Schwern via RT <bug-ExtUtils-MakeMaker@rt.cpan.org> wrote:
> <URL: http://rt.cpan.org/Ticket/Display.html?id=53714 >
>
> BSD and GNU have a tiff and we're left cleaning up the mess.  Yay.  I've read
> some of the commentary on this on the BSD lists and their response has been
> "it's in the standard, GNU tar is broken!"  Good for the standard.
>
> Anyhow, my concern now is putting in GNU tar detection code.  The simple patch
> would be to change MM_Darwin to use gnutar instead of tar.  That assumes OS X
> has and will always have a program called gnutar.
>
> That only fixes OS X, leaving all the BSDs still pumping out standard-compliant
> but broken tarballs.  The more general problem is detecting if the user has
> GNU tar, poking through the PATH to find gnutar or gtar and hoping it's usable.
>
> In any other piece of software that would be simple.  In MakeMaker it could
> break all sorts of things.  I'm hesitant.
What I've done in general is to ask each tar alternative whether `$tar --help` =~ /bsdtar/ matches. If it does, I don't use that tar, unless I have somehow rejected all the alternatives, in which case I use the first alternative anyway. MM_Darwin tries tar, then gnutar, then bsdtar. MM_Any only uses tar.

I am unsure whether tar on non-unixy platforms exists and responds to --help as a command line parameter, but on its face that doesn't seem to be much of a problem since we're already assuming `tar czf` is a valid command. I am, however, actually testing for bsdtar now, so it is possible that wherever any test at all for a tar would have failed, we will now begin failing.

Further, however, I am actually testing that Test::More::unlike( `tar --help`, qr/bsdtar/ ), so presumably if the platform had no tar I'd just get an undef value back and potentially a warning about the value. I wouldn't get a failure since Test::More::unlike( undef, qr/bsdtar/ ) succeeds.
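
The selection rule described above can be sketched in shell. This is an illustration only, not the MakeMaker patch itself: the function names help_mentions_bsdtar and pick_tar are hypothetical, and skipping candidates that are not on the PATH is an added assumption that the message does not state.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the help text in $1 mentions bsdtar.
help_mentions_bsdtar() {
    printf '%s\n' "$1" | grep -q 'bsdtar'
}

# Hypothetical helper: print the first candidate whose --help output does not
# mention bsdtar. If every candidate is rejected, print the first one anyway.
pick_tar() {
    first=$1
    for candidate in "$@"; do
        # Assumption: a candidate that is not installed is skipped, not chosen.
        command -v "$candidate" >/dev/null 2>&1 || continue
        help_text=$("$candidate" --help 2>&1 </dev/null)
        if ! help_mentions_bsdtar "$help_text"; then
            echo "$candidate"
            return 0
        fi
    done
    echo "$first"
}

# Candidate order given for MM_Darwin in the message above.
pick_tar tar gnutar bsdtar
```

The fallback to the first candidate mirrors the "use the first alternative anyway" rule, so the function always prints a name, even on a system where only bsdtar is installed.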
Not something EUMM can reasonably check for.