Subject: | Clobbering deliberately zero'd timestamps, etc. and deterministic archives |
Date: | Thu, 11 Jan 2018 15:44:32 -0700 |
To: | bug-Archive-Ar [...] rt.cpan.org |
From: | Philip Prindeville <philipp [...] cpan.org> |
Hi.
I had a been trying to use Archive::Ar to fake the behavior of “ar D” on really old binutils toolchains which didn’t support it (basically, ‘ar D’ tells ‘ar’ to build “deterministic” archives by setting the uid, guid, and date all to the empty string, and the mode to ‘644’). Per the ar(1) man page:
D Operate in deterministic mode. When adding files and the archive
index use zero for UIDs, GIDs, timestamps, and use consistent file
modes for all files. When this option is used, if ar is used with
identical options and identical input files, multiple runs will
create identical output files regardless of the input files'
owners, groups, file modes, or modification times.
I tried to write a trivial program to run through a library and fix it up, but turns out I wasn’t able to, mostly because of this one line:
224 $params->{date} ||= timelocal(localtime());
which I fixed by changing it to:
224 $params->{date} = timelocal(localtime()) unless (exists $params->{date});
so that’s my suggested fix.
Some opportunistic comments/suggestions/improvements:
The mangling of the filename and overwriting what’s in the hash:
217 (undef, undef, $filename) = File::Spec->splitpath($filename);
218
219 $params->{name} = $filename;
seems like overkill. Especially since the ar spec requires that the filename be a basename only with slash appended to it, e.g. “myfile.o/“. If you pass this in, splitpath() will return “” as the filename. Safer to do:
217 (undef, undef, $params->{name}) = File::Spec->splitpath($filename)
218 unless (exists $params->{name});
or otherwise:
217 (undef, undef, $filename) = File::Spec->splitpath($filename);
218
219 $params->{name} = $filename . ‘/‘;
per the Wiki article on AR: https://en.wikipedia.org/wiki/Ar_(Unix)
System V ar uses a '/' character (0x2F) to mark the end of the filename; this allows for the use of spaces without the use of an extended filename.
Not sure about this line:
226 $params->{mode} ||= "100644”;
as “644” is the canonical value in binutils/bfd/archive.c. I think “100644” is a Debian-ism.
And the pod here:
560 =item * C<add_data(I<"filename">, I<$filedata>)>
561
562 Takes an filename and a set of data to represent it. Unlike C<add_files>, C<add_data>
563 is a virtual add, and does not require data on disk to be present. The
564 data is a hash that looks like:
is incorrect, as it omits the 2nd argument, $data:
205 sub add_data
206 {
207 my($this, $filename, $data, $params) = @_;
and as a general comment, it’s an arbitrary restriction that you can’t add_data() over an existing file. Sometimes you want to rewrite a file in place, preserving its order in the archive… otherwise you have to remove it AND ALL THE FILES WHICH FOLLOW IT and add then back in the original order. That’s a little tedious.
Looking at:
390 if(exists($this->{_filehash}->{$file->{name}}))
391 {
392 $this->_dowarn("Can't _addFile because virtual file already exists with that name in the archive");
393 return;
394 }
395
396 push @{$this->{_files}}, $file->{name};
it’s not clear to me why this can’t just be:
390 push @{$this->{_files}}, $file->{name} unless (exists($this->{_filehash}->{$file->{name}}));
instead.
Attaching diffs for the patched version I’m using.
Message body is not shown because sender requested not to inline it.