This queue is for tickets about the Archive-Tar CPAN distribution.

Report information
The Basics
Id: 18720
Status: stalled
Priority: 0/
Queue: Archive-Tar

People
Owner: Nobody in particular
Requestors: yves [...] cpan.org
Cc: module-build-general [...] lists.sourceforge.net
perl5-porters [...] perl.org
AdminCc:

Bug Information
Severity: Critical
Broken in:
  • 0.99_01
  • 1.29
Fixed in: (no value)



CC: "List - Module-Build" <module-build-general [...] lists.sourceforge.net>, "Perl Porters" <perl5-porters [...] perl.org>,
Subject: [PATCH] Archive::Tar creates POSIX style tar files unnecessarily and by default, leading to compatibility problems particularly with WinZip.
Archive::Tar since around version .99 has been producing files with a POSIX flavour header by default. This behaviour is undesirable as it means that the files are not readable with many Win32 based compression tools, most notably WinZip. This is especially bad as the POSIX longfile name support is used even when it needn't be, that is when the stored file name is shorter than 100 bytes.

The attached patch resolves this issue (it has been open since 2003 at least), by reverting to using the original single field naming scheme when it is possible, and then falling back to the GNU Extended Header long file name support when necessary. This removes the older $DO_NOT_USE_PREFIX flag and replaces it with a flag of the opposite meaning, and with a clearer name: $POSIX_LONGFILE.

IMO this is desirable because:

A) POSIX long filename support appears to be restricted to paths of a maximum of 156 chars and files of 100 chars. Older versions of A::T will produce mangled tar files when these restrictions are exceeded, which is in violation of the POSIX standards which mandate that a fatal error be thrown. (GNU tar itself only warns and refuses to add the file.)

B) GNU Extended Header longfile support does not have a file length restriction and is correctly handled by WinZip.

C) Using the single field mechanism appears to be supported by everything, so when the stored paths are short there is no need to confuse things with long file naming issues.

Additional changes are that I did a bit of refactoring and code clean up of dead or unnecessary code.

With the patch applied all tests pass, and WinZip can correctly read the files.

Open issues unresolved by this patch: Archive::Tar is much less picky about what filenames it will encode into the archive than something like GNU tar is. For instance tar will remove leading directory components that are left of or are '..', likewise normally it will convert absolute filenames to relative by removing volume and root slashes. Archive::Tar does none of this.

cheers,
Yves
Subject: Archive-Tar-1.29_01.patch

Message body is not shown because it is too large.

On Fri Apr 14 12:27:43 2006, YVES wrote:
> The attached patch resolves this issue (it has been open since 2003 at
> least), by reverting to using the original single field naming scheme
> when it is possible, and then falling back to the GNU Extended Header long
> file name support when necessary. This removes the older
> $DO_NOT_USE_PREFIX flag and replaces it with a flag of the opposite
> meaning, and with a clearer name: $POSIX_LONGFILE.

Hi Yves, thanks for the patch. Since it's a rather large and possibly far-reaching patch, I will have to take some extra time to review it. I'll get back to you as soon as possible.

-- Jos

On Fri Apr 14 12:27:43 2006, YVES wrote:
> Archive::Tar since around version .99 has been producing files with a
> POSIX flavour header by default. This behaviour is undesirable as it
> means that the files are not readable with many Win32 based compression
> tools, most notably WinZip.
[...]
> This is especially bad as the POSIX longfile name support is used even
> when it needn't be, that is when the stored file name is shorter than
> 100 bytes.

This is a good point you raise, and seems like something worth fixing. However, since the patch you attached addresses many more issues than just this one, it's hard to decipher which parts would need to be applied. I'd be interested in a patch for just this particular issue.

> The attached patch resolves this issue (it has been open since 2003 at
> least), by reverting to using the original single field naming scheme
> when it is possible, and then falling back to the GNU Extended Header long
> file name support when necessary. This removes the older
> $DO_NOT_USE_PREFIX flag and replaces it with a flag of the opposite
> meaning, and with a clearer name: $POSIX_LONGFILE.

I've done quite a bit of thinking about this, and have asked various others for their input on this, as well as testing the possible scenarios. In the end, I've decided that the patch as it is will not be applied, for the following reasons:

* POSIX-tar is the standard tar on many unix filesystems, including Solaris, IRIX and AIX. Applying this change will break all A::T generated tar files for those platforms. This is not only a backwards incompatible change, but will cripple their use of tools such as CPAN.pm and CPANPLUS.

* On Win32, installation tools such as CPAN.pm and CPANPLUS use Archive::Tar or cygwin's /bin/tar, which both support POSIX and GNU-tar file formats, so their use of the CPAN installers will not be hampered.

* Archive::Tar already supports a way to change the POSIX-compliant archives to GNU-style archives. Users worried about these issues have the choice to do the right thing for their Win32 users, without breaking backward compatibility of Archive::Tar. Since this behaviour may not be 100% apparent to the novice user, I have added a FAQ entry about this particular issue (see below).

* Changing of the global variable C<$DO_NOT_USE_PREFIX> is also a backwards incompatible change, which, unless unavoidable, should not be done as it will break people's existing code.

Here's the doc patch I've applied to inform users of the issues involving the use of A::T, and the intricacies of the tar format itself:

1432,1438c1432,1443
< By default, C<Archive::Tar> will try to put paths that are over
< 100 characters in the C<prefix> field of your tar header. However,
< some older tar programs do not implement this spec. To retain
< compatibility with these older versions, you can set the
< C<$DO_NOT_USE_PREFIX> variable to a true value, and C<Archive::Tar>
< will use an alternate way of dealing with paths over 100 characters
< by using the C<GNU Extended Header> feature.
---
> By default, C<Archive::Tar> will try to put paths that are over
> 100 characters in the C<prefix> field of your tar header, as
> defined per POSIX-standard. However, some (older) tar programs
> do not implement this spec. To retain compatibility with these older
> or non-POSIX compliant versions, you can set the C<$DO_NOT_USE_PREFIX>
> variable to a true value, and C<Archive::Tar> will use an alternate
> way of dealing with paths over 100 characters by using the
> C<GNU Extended Header> feature.
>
> Note that clients who do not support the C<GNU Extended Header>
> feature will not be able to read these archives. Such clients include
> tars on C<Solaris>, C<Irix> and C<AIX>.
1539a1545,1555
> =item I'm using WinZip, or some other non-POSIX client, and files are not being extracted properly!
>
> By default, C<Archive::Tar> is in a completely POSIX-compatible
> mode, which uses the POSIX-specification of C<tar> to store files.
> For paths greather than 100 characters, this is done using the
> C<POSIX header prefix>. Non-POSIX-compatible clients may not support
> this part of the specification, and may only support the C<GNU Extended
> Header> functionality. To facilitate those clients, you can set the
> C<$Archive::Tar::DO_NOT_USE_PREFIX> variable to C<true>. See the
> C<GLOBAL VARIABLES> section for details on this variable.
>
1621a1638,1662
> =head1 SEE ALSO
>
> =over 4
>
> =item The GNU tar specification
>
> C<http://www.gnu.org/software/tar/manual/tar.html>
>
> =item The PAX format specication
>
> The specifcation which tar derives from; C< http://www.opengroup.org/onlinepubs/007904975/utilities/pax.html>
>
> =item A comparison of GNU and POSIX tar standards; C<http://www.delorie.com/gnu/docs/tar/tar_114.html>
>
> =item GNU tar intends to switch to POSIX compatibility
>
> GNU Tar authors have expressed their intention to become completely
> POSIX-compatible; C<http://www.gnu.org/software/tar/manual/html_node/Formats.html>
>
> =item A Comparison between various tar implementations
>
> Lists known issues and incompatibilities; C<http://gd.tuwien.ac.at/utils/archivers/star/README.otherbugs>
>
> =back
>
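
For anyone landing on this ticket from the FAQ entry in the doc patch above, here is a minimal sketch of the workaround it describes: setting C<$Archive::Tar::DO_NOT_USE_PREFIX> before writing an archive, so that long paths go into the GNU Extended Header rather than the POSIX prefix field. The variable name comes from the doc patch; the archive name and the long path are made up for illustration, and I have not checked the result against WinZip itself.

```perl
use strict;
use warnings;
use Archive::Tar;

# Variable named in the doc patch: a true value asks Archive::Tar to store
# long paths via the GNU Extended Header instead of the POSIX prefix field.
$Archive::Tar::DO_NOT_USE_PREFIX = 1;

# Hypothetical archive name and path, for illustration only.
my $archive   = 'example.tar';
my $long_path = join '/', ('some-long-directory-name') x 5, 'file.txt';

my $tar = Archive::Tar->new;

# Add an in-memory file under a path longer than 100 characters.
$tar->add_data($long_path, "example contents\n");

# Write the archive to disk; error() reports what went wrong on failure.
$tar->write($archive) or die $tar->error;
```

As the doc patch notes, archives written this way may not be readable by tars that lack GNU Extended Header support (Solaris, Irix and AIX are the ones named), so this looks like a per-project choice rather than something to set globally.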