Subject: File::MimeInfo::Magic and range-length
Date: Thu, 02 Aug 2007 10:53:01 -0400
To: bug-File-MimeInfo [...] rt.cpan.org
From: Chapman Flack <jflack [...] math.purdue.edu>

The shared-mime-info-spec is not perfectly clear about the
semantics of 'range-length' in the 'magic' files, but looking
at the way update-mime-database.c computes it vs. the way
Magic.pm uses it, there is a slight discrepancy:
update-mime-database.c computes range-length from the
offset="start:end" attribute of the match element, and on
that the specification is clear: start and end give the
inclusive range of file offsets at which the candidate
match is allowed to begin, where an offset of n:n can
be abbreviated to simply n, a range that contains exactly
one acceptable offset.
update-mime-database.c computes range-length as
1 + end - start, that is, the number of contiguous
offsets to check. range-length is 1 in the usual
case of offset="n" or offset="n:n", and it is
omitted from the magic file if it is 1.
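
A minimal sketch of that computation, written in Perl rather than the C
of update-mime-database.c. The helper name range_length is hypothetical
and does not come from either code base; it only illustrates that the
inclusive range start:end contains 1 + end - start offsets.

```perl
use strict;
use warnings;

# Hypothetical helper: derive range-length from an offset attribute
# of the form "n" or "start:end" (both bounds inclusive).
sub range_length {
    my ($attr) = @_;
    my ($start, $end) = $attr =~ /^(\d+)(?::(\d+))?$/
        or die "bad offset attribute: $attr\n";
    $end = $start unless defined $end;   # "n" is shorthand for "n:n"
    return 1 + $end - $start;            # number of candidate offsets
}

printf "%-5s -> %d\n", $_, range_length($_) for qw(4 4:4 1:16);
```

Both "4" and "4:4" give 1, which is the value that is left out of the
magic file; "1:16" gives 16.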
Magic.pm treats range-length as 0 if it is omitted
from the magic file, so the default is off by one.
It later treats range-length as the maximum length
of a string to ignore between the start offset and
the match, so the use is also off by one.
The result is that the default case actually works
right, but in all non-default cases the match is
applied at one more position than it should be.
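
The two readings can be compared by listing the offsets each one allows.
This is only a model of the behaviour described above, not code taken
from Magic.pm; both function names are made up for the illustration.

```perl
use strict;
use warnings;

# Offsets the specification allows: range-length counts candidate offsets.
sub offsets_per_spec {
    my ($start, $range_length) = @_;
    return ($start .. $start + $range_length - 1);
}

# Offsets tried when an omitted field defaults to 0 and the value is then
# used as a maximum number of bytes to skip (model of the reported behaviour).
sub offsets_as_described {
    my ($start, $field) = @_;    # $field: value from the magic file, or undef
    my $r = $field || 0;
    return ($start .. $start + $r);
}

# Default case: the field is omitted, the true range-length is 1.
my @spec_default = offsets_per_spec(4, 1);
my @seen_default = offsets_as_described(4, undef);

# Non-default case: offset="1:16", so range-length is 16.
my @spec = offsets_per_spec(1, 16);
my @seen = offsets_as_described(1, 16);

printf "default:  spec %d offset(s), tried %d\n",
    scalar @spec_default, scalar @seen_default;
printf "1:16:     spec %d..%d, tried %d..%d\n",
    $spec[0], $spec[-1], $seen[0], $seen[-1];
```

In the default case the two errors cancel and a single offset is tried;
for offset="1:16" the model tries offsets 1 through 17 instead of 1
through 16.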
The problem can be confirmed by making a dummy file
with the first line:
    #123456789abcdef?/bin/perl
and seeing that it matches application/x-perl,
even though the match rule requires /bin/perl
to start at offsets 1:16 inclusive and in the
dummy file it starts at offset 17.
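
The offset claimed for the dummy line can be checked without the module
at all. The short script below only locates /bin/perl in the line with
Perl's built-in index; it does not call File::MimeInfo::Magic.

```perl
use strict;
use warnings;

# First line of the dummy file from the report.
my $line = "#123456789abcdef?/bin/perl\n";

# index() returns the 0-based position of the first occurrence.
my $pos = index($line, "/bin/perl");

# The rule allows the value to begin at offsets 1 through 16 inclusive.
my $in_range = ($pos >= 1 && $pos <= 16) ? "yes" : "no";

print "/bin/perl starts at offset $pos\n";
print "within 1:16? $in_range\n";
```

The position found is 17, one past the last offset the rule permits.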
The fix requires correcting both the default and the
later use. In _hash_magic, the line
    my ($m, $w, $r) = ($1, $2, $3 || 0); # mask, word size, range
should be
    my ($m, $w, $r) = ($1, $2, $3 || 1); # mask, word size, range
and then a new line
    $r--;
should be added just before the line
    my $end = $o + $l + $r;
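
The effect of the two changes can be sketched with a small stand-alone
model. The function rule_matches and its $fixed flag are hypothetical,
and the anchored regex is an assumption about how the skip length is
applied; only the variable names $o, $l, $r and $end follow the lines
quoted above.

```perl
use strict;
use warnings;

# Model of a single string rule. $field is the range value from the magic
# file (undef if omitted); $fixed selects the proposed default and decrement.
sub rule_matches {
    my ($data, $o, $value, $field, $fixed) = @_;
    my $l = length $value;
    my $r = $field || ($fixed ? 1 : 0);   # default: 0 before, 1 after the fix
    $r-- if $fixed;                       # range-length -> max bytes to skip
    my $end = $o + $l + $r;               # one past the last byte examined
    my $window = substr $data, $o, $end - $o;
    return $window =~ /^.{0,$r}\Q$value\E/s ? 1 : 0;
}

my $dummy = "#123456789abcdef?/bin/perl\n";   # value begins at offset 17
my $legal = "#123456789abcdef/bin/perl\n";    # value begins at offset 16

printf "offset 17, unfixed: %d\n", rule_matches($dummy, 1, "/bin/perl", 16, 0);
printf "offset 17, fixed:   %d\n", rule_matches($dummy, 1, "/bin/perl", 16, 1);
printf "offset 16, fixed:   %d\n", rule_matches($legal, 1, "/bin/perl", 16, 1);
```

Under this model the dummy line matches only without the fix, a value
starting at offset 16 still matches with it, and a rule with the range
field omitted behaves the same either way.
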
Chapman Flack
mathematics
Purdue