I looked into the source for Archive::Tar.
It seems to me that FOLLOW_SYMLINK == 1 is fundamentally broken.
First, the documentation says:
---
Set this variable to C<1> to make C<Archive::Tar> effectively make a
copy of the file when extracting. Default is C<0>, which
means the symlink stays intact. Of course, you will have to pack the
file linked to as well.
This option is checked when you write out the tarfile using C<write>
or C<create_archive>.
This works just like C</bin/tar>'s C<-h> option.
---
This isn't terribly clear. "Make a copy of the file when extracting"?
Extracting means we're reading an archive; if all we have in the
archive is a symlink, there's no "copy of the file" to make.
This makes more sense as "write the data from the target of symlinks to
the archive, instead of the symlink itself."
For extract, FOLLOW_SYMLINK is never checked. This is correct.
Writing archives with FOLLOW_SYMLINK == 1 is broken.
In Tar.pm, line 1270 we decide to follow a symlink.
At line 1275, we downgrade_to_plainfile. But all that does is to
change the header's type, mode and linkname. That's fine when called
from _extract_special_file_as_plain_file.
But for write, nowhere do we actually follow the symlink and read the
data from the target! Nor is the header updated with the target's size
(which was forced to zero because the entry was originally read as a
symlink), owner, etc. So (at best) we write a header indicating zero
size, without the target's data.
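To make the zero-size point concrete, here is a minimal sketch that inspects the in-memory entry created for a symlink, using only the documented Archive::Tar::File accessors (is_symlink, size, linkname). It runs with the default FOLLOW_SYMLINK == 0 and assumes a platform with symlink support; the temporary directory and file names are just for illustration. I have not asserted anything here about what the entry looks like with FOLLOW_SYMLINK == 1.

```perl
use strict;
use warnings;
use Archive::Tar;
use File::Temp qw(tempdir);

# Scratch directory so the sketch does not touch the current directory.
my $dir = tempdir(CLEANUP => 1);

# Same layout as the reproducer: a one-byte file and a symlink to it.
open my $fh, '>', "$dir/foo" or die "open: $!";
print {$fh} "\n";
close $fh or die "close: $!";
symlink 'foo', "$dir/bar" or die "symlink: $!";

my $tar = Archive::Tar->new;

# add_files returns the Archive::Tar::File objects it just created.
my ($entry) = $tar->add_files("$dir/bar");
die "add_files failed: " . $tar->error unless $entry;

# The entry describes the link itself, not the file it points to.
printf "is_symlink=%d size=%d linkname=%s\n",
    ($entry->is_symlink ? 1 : 0), $entry->size, $entry->linkname;

# For comparison: the size of the target on disk (-s follows the link).
printf "target size on disk=%d\n", -s "$dir/bar";
```

The entry's size is that of the link entry (zero), while the target on disk has one byte. That is the gap that a bare downgrade_to_plainfile leaves unfilled.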
I think that all we want from the symlink is the filename.
It also seems to me that the target of a symlink needn't be a file. It
could be a directory, a special file,... - or another symlink. We want
the data from the end of the chain, so we need to call a variant of
Archive::Tar::File::_new_from_file - a new stat is in order, WITHOUT
the special symlink checks. But WITH the special file checks and WITH
reading the file data.
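As a rough sketch of what I mean (not a patch): the helper below is hypothetical and not part of Archive::Tar. It keeps the symlink's own name, but takes the metadata and data from the end of the chain by using stat (which follows symlinks) rather than lstat. The hash layout and the names describe_followed, type and data are my own assumptions for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: collect what a header would need for $path,
# describing whatever sits at the END of any symlink chain.
sub describe_followed {
    my ($path) = @_;

    # stat() follows symlinks; lstat() would describe the link itself.
    # An empty list means a dangling link or a missing file.
    my @st = stat $path or return;

    # "_" reuses the buffer filled by the stat() call above.
    my $type = -f _ ? 'file' : -d _ ? 'dir' : 'special';

    my %info = (
        name  => $path,            # all we take from the symlink is its name
        type  => $type,
        mode  => $st[2] & 07777,
        uid   => $st[4],
        gid   => $st[5],
        mtime => $st[9],
        size  => 0,                # only plain files carry data
        data  => '',
    );

    if ($type eq 'file') {
        # open() follows the link chain as well, so this reads the target.
        open my $in, '<', $path or return;
        binmode $in;
        local $/;                  # slurp mode
        $info{data} = <$in>;
        $info{size} = length $info{data};
        close $in;
    }
    return \%info;
}

# Usage: a two-link chain, baz -> bar -> foo.
my $dir = tempdir(CLEANUP => 1);
open my $out, '>', "$dir/foo" or die "open: $!";
print {$out} "\n";
close $out or die "close: $!";
symlink 'foo', "$dir/bar" or die "symlink: $!";
symlink 'bar', "$dir/baz" or die "symlink: $!";

my $info = describe_followed("$dir/baz") or die "could not stat target";
printf "%s: type=%s size=%d\n", $info->{name}, $info->{type}, $info->{size};
```

A real fix would still have to apply the existing special-file handling to the 'special' case and decide what to do with directories; this only shows the stat-versus-lstat part.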
This is rather involved. I'll leave the engineering to you.
Finally, here's a variant of the reproducer that generates a corrupt
tar file:
# echo "">foo
# ln -s foo bar
# perl -MArchive::Tar -e'$Archive::Tar::FOLLOW_SYMLINK=1;
my $t=Archive::Tar->new; $t->add_files("bar");
$t->write("baz.tgz",COMPRESS_GZIP) or print $t->error(1);'
# Note, no error reported, BUT
# tar -tzf baz.tgz
bar
tar: Skipping to next header
tar: Exiting with failure status due to previous errors
# ls -l foo bar baz.tgz
lrwxrwxrwx. 1 root root 3 Sep 15 19:35 bar -> foo
-rw-r--r--. 1 root root 93 Sep 17 07:32 baz.tgz
-rw-r--r--. 1 root root 1 Sep 17 07:32 foo
Thanks again.
On Thu Sep 15 19:53:38 2011, tlhackque wrote:
> Fedora 15, Perl 5.12
>
> Creating a tar file of a symlink with FOLLOW_SYMLINK fails with "Could
> not write data".
>
> I expected it to de-reference the symlink and store the regular file's
> data in the archive under the regular file's name.
>
> Reproducer:
>
> # touch foo
> # ln -s foo bar
>
> # perl -MArchive::Tar -e'$Archive::Tar::FOLLOW_SYMLINK = 1; my $t =
> Archive::Tar->new; $t->add_files("bar"); $t->write( "baz.tgz",
> COMPRESS_GZIP );print $t->error(1); print $Archive::Tar::VERSION'
> Could not write data for: bar at -e line 1
> Could not write data for: bar at /usr/share/perl5/Archive/Tar.pm line
> 1293
> Archive::Tar::write('Archive::Tar=HASH(0x9dcf90)', 'baz.tgz', 9)
> called at -e line 1
> 1.78
>
> The same command adding "foo" instead of bar works.
>
> Linux myhost 2.6.40-4.fc15.x86_64 #1 SMP Fri Jul 29 18:46:53 UTC 2011
> x86_64 x86_64 x86_64 GNU/Linux
>
> perl 5, version 12, subversion 4 (v5.12.4) built for x86_64-linux-
> thread-multi
>
> Archive::Tar 1.78
>
> Thanks in advance for your help.