
This queue is for tickets about the Archive-Tar CPAN distribution.

Report information
The Basics
Id: 70999
Status: new
Priority: 0/
Queue: Archive-Tar

People
Owner: Nobody in particular
Requestors: tlhackque [...] yahoo.com
Cc:
AdminCc:

Bug Information
Severity: Critical
Broken in: (no value)
Fixed in: (no value)



Subject: Tar of FOLLOWed symlink fails
Fedora 15, Perl 5.12

Creating a tar file of a symlink with FOLLOW_SYMLINK fails with "Could not write data".

I expected it to de-reference the symlink and store the regular file's data in the archive under the regular file's name.

Reproducer:

# touch foo
# ln -s foo bar

# perl -MArchive::Tar -e'$Archive::Tar::FOLLOW_SYMLINK = 1; my $t = Archive::Tar->new; $t->add_files("bar"); $t->write( "baz.tgz", COMPRESS_GZIP );print $t->error(1); print $Archive::Tar::VERSION'
Could not write data for: bar at -e line 1
Could not write data for: bar at /usr/share/perl5/Archive/Tar.pm line 1293
Archive::Tar::write('Archive::Tar=HASH(0x9dcf90)', 'baz.tgz', 9) called at -e line 1
1.78

The same command adding "foo" instead of bar works.

Linux myhost 2.6.40-4.fc15.x86_64 #1 SMP Fri Jul 29 18:46:53 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

perl 5, version 12, subversion 4 (v5.12.4) built for x86_64-linux-thread-multi

Archive::Tar 1.78

Thanks in advance for your help.
Subject: Additional information: Tar of FOLLOWed symlink fails
From: tlhackque [...] yahoo.com
Additional information:

This is a cut-down reproducer from a real application.

Note that foo does not exist beforehand, so 'touch foo' creates it as a zero-byte file. If one does an 'echo "abcd" >foo' instead of 'touch foo', so that the file has a non-zero length, the error is not issued.

So this has something to do with FOLLOW_SYMLINK and a symlink target that is a zero-byte file.

Also, even when the write fails, the output file is still created by the reproducer. However, under circumstances that I haven't been able to narrow down, it is reported by tar as corrupt ("skipping header") with -tzf.

In the real application, where several files are added, the write stops when it encounters the first symlink pointing to a zero-byte target; the rest of the files aren't written to the archive.

A work-around is to open the file by hand, build an options hash from stat() and call add_data instead of add_files.

I hope this clarifies the issue.
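For reference, the work-around described in this message could look roughly like the sketch below. The helper name add_followed is hypothetical (it is not part of Archive::Tar), and the choice of which stat() fields to copy is an assumption; only add_data, write and error are existing Archive::Tar methods. The sketch only handles links that resolve to a regular file.

```perl
use strict;
use warnings;
use Archive::Tar;

# add_followed() is a hypothetical helper name, not part of Archive::Tar.
# It stores the data of whatever $path resolves to under the name $path,
# without relying on $Archive::Tar::FOLLOW_SYMLINK.
sub add_followed {
    my ($tar, $path) = @_;

    # stat() follows symlinks (lstat() would not), so these fields
    # describe the file at the end of the link chain.
    my @st = stat($path) or return;
    return unless -f _;    # this sketch only handles regular-file targets

    open(my $fh, '<', $path) or return;
    binmode $fh;
    my $data = do { local $/; <$fh> };
    close $fh;
    $data = '' unless defined $data;    # keep the data a defined scalar

    # Options hash built from stat(); add_data returns the new entry
    # (an Archive::Tar::File object) or a false value on failure.
    return $tar->add_data($path, $data, {
        mode  => $st[2] & 07777,
        uid   => $st[4],
        gid   => $st[5],
        mtime => $st[9],
    });
}

# Optional command-line use: perl follow.pl out.tar LINK [LINK ...]
if (@ARGV >= 2) {
    my ($out, @links) = @ARGV;
    my $tar = Archive::Tar->new;
    for my $link (@links) {
        add_followed($tar, $link) or warn "skipped $link\n";
    }
    $tar->write($out) or die $tar->error;
}
```

Because the entry is created from data that is already in memory, its size comes from the data actually read rather than from the symlink's own header fields.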
Subject: Analysis
From: tlhackque [...] yahoo.com
I looked into the source for Archive::Tar. It seems to me that FOLLOW_SYMLINK == 1 is fundamentally broken.

First, the documentation says:

---
Set this variable to C<1> to make C<Archive::Tar> effectively make a copy of the file when extracting. Default is C<0>, which means the symlink stays intact. Of course, you will have to pack the file linked to as well.

This option is checked when you write out the tarfile using C<write> or C<create_archive>.

This works just like C</bin/tar>'s C<-h> option.
---

This isn't terribly clear - "make a copy of the file when extracting"? Extracting means we're reading an archive - if all we have in the archive is a symlink, there's no "copy of the file" to make. This makes more sense as "write the data from the target of symlinks to the archive, instead of the symlink itself."

For extract, FOLLOW_SYMLINK is never checked. This is correct.

Writing archives with FOLLOW_SYMLINK == 1 is broken.

In Tar.pm, at line 1270, we decide to follow a symlink. At line 1275, we downgrade_to_plainfile. But all that does is change the header's type, mode and linkname. That's fine when called from _extract_special_file_as_plain_file.

But for write, nowhere do we actually follow the symlink and read the data from the target! Nor is the header updated with the target's size (the entry was originally read as a symlink, so its size was forced to zero), owner, etc.

So (at best) we write a header indicating zero size, without the target's data.

I think that all we want from the symlink is the filename.

It also seems to me that the target of a symlink needn't be a file. It could be a directory, a special file,... - or another symlink. We want the data from the end of the chain, so we need to call a variant of Archive::Tar::File::_new_from_file - a new stat is in order, WITHOUT the special symlink checks, but WITH the special file checks and WITH reading the file data.

This is rather involved. I'll leave the engineering to you.
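To illustrate the "new stat" point above, here is a minimal sketch of the difference between what lstat() and stat() report for a path. describe_path is a hypothetical helper written for this ticket, not an Archive::Tar function, and it does not reflect how Tar.pm is actually structured; it only shows which call yields the link's fields and which yields the fields of the target at the end of the chain.

```perl
use strict;
use warnings;

# describe_path() is a hypothetical helper, not an Archive::Tar function.
sub describe_path {
    my ($path) = @_;

    # lstat() describes the directory entry itself, i.e. the symlink.
    my @link = lstat($path) or return;
    my $is_symlink = -l _ ? 1 : 0;

    # stat() resolves the whole chain; it returns an empty list when the
    # chain cannot be resolved (for example a dangling link).
    my @target = stat($path);
    my $kind = !@target ? undef
             : -f _     ? 'file'
             : -d _     ? 'directory'
             :            'special';

    return {
        is_symlink  => $is_symlink,
        link_size   => $link[7],      # size field of the link itself
        target_size => @target ? $target[7] : undef,
        target_kind => $kind,
    };
}
```

For the reproducer's "bar -> foo", the size and kind that a followed entry would need are the target_* values, not the ones taken from the link.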
Finally, here's a variant of the reproducer that generates a corrupt tar file:

# echo "">foo
# ln -s foo bar
# perl -MArchive::Tar -e'$Archive::Tar::FOLLOW_SYMLINK=1; my $t=Archive::Tar->new;$t->add_files("bar");$t->write ("baz.tgz",COMPRESS_GZIP) or print $t->error(1);'
# Note, no error reported, BUT
# tar -tzf baz.tgz
bar
tar: Skipping to next header
tar: Exiting with failure status due to previous errors
# ls -l foo bar baz.tgz
lrwxrwxrwx. 1 root root 3 Sep 15 19:35 bar -> foo
-rw-r--r--. 1 root root 93 Sep 17 07:32 baz.tgz
-rw-r--r--. 1 root root 1 Sep 17 07:32 foo

Thanks again.
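Since write can return success while producing an archive that tar rejects, an application may want to read the archive back before trusting it. The sketch below is one possible check; round_trip_names is a hypothetical helper, and it is an assumption that Archive::Tar's own reader would notice the same damage that "tar -tzf" reports, so listing the file with the system tar remains the stronger test.

```perl
use strict;
use warnings;
use Archive::Tar;
use File::Temp qw(tempfile);

# round_trip_names() is a hypothetical helper, not part of Archive::Tar.
# It writes $tar to a temporary file, reads that file back with a fresh
# object, and returns an array ref of the names found. It returns
# nothing if the write or the read fails (or the archive reads as empty).
sub round_trip_names {
    my ($tar, @write_args) = @_;

    my ($fh, $file) = tempfile(SUFFIX => '.tar', UNLINK => 1);
    close $fh;

    # Extra arguments (for example COMPRESS_GZIP) are passed to write().
    $tar->write($file, @write_args) or return;

    my $back = Archive::Tar->new;
    $back->read($file) or return;
    return [ $back->list_files ];
}
```

A caller can compare the returned names against the list of files it added and treat any mismatch as a failed write.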