Subject: failing to open PDF where %%EOF isn't very near end
Perl is 5.8.8, running on Linux, though I don't think that's important.
I recently got an error when opening a PDF
file, at File.pm:313.
On investigation, it was caused by a PDF file
with a load of null padding on the end.
Initially I thought this was invalid, but reading the
PDF 1.5 spec, I found
QUOTE
3.4.4, “File Trailer”
17. Acrobat viewers require only that the %%EOF marker appear somewhere
within the last 1024 bytes of the file.
(END QUOTE)
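To make that rule concrete, here is a rough sketch of a trailer check that tolerates padding after %%EOF. It is only an illustration: find_startxref is a hypothetical name, not anything in PDF::API2, and the pattern is my own simplification rather than the one File.pm uses. It assumes the caller has already read the tail of the file (up to the last 1024 bytes) into a string.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDF::API2): given the tail of a PDF,
# return the startxref offset, or undef if no trailer is found.
sub find_startxref {
    my ($tail) = @_;
    # The leading greedy .* (with /s so it crosses newlines) selects the
    # last startxref in the buffer. Nothing anchors the match to the end
    # of the string, so padding after %%EOF is tolerated.
    return $tail =~ /.*startxref\s+([0-9]+)\s+%%EOF/s ? $1 : undef;
}
```

A file ending in a few hundred null bytes after %%EOF still yields its offset here, as long as the marker lies inside the buffer that was read.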
Going back to the File.pm code, I found this:
# $fh->seek($end - 1024, 0);
# $fh->read($buf, 1024);
$fh->seek($end - 64, 0);
$fh->read($buf, 64);
if ($buf !~ m/startxref$cr\s*([0-9]+)$cr\%\%eof.*?/oi)
{ die "Malformed PDF file $fname"; }
I'm guessing that for VERY small (hand crafted?) PDF files,
the attempted read of 1024 bytes was failing, and so it was changed
to 64, which "usually works".
I have made the following trivial change, which appears
to work in all cases, and invite your comment.
my $sz = $end < 1024 ? $end : 1024;
# $fh->seek($end - 1024, 0);
# $fh->read($buf, 1024);
$fh->seek($end - $sz, 0);
$fh->read($buf, $sz);
($end was set earlier, and is the offset to the
end of the file AKA file size...)
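For anyone who wants to try the idea outside File.pm, here is a minimal standalone sketch of the same clamped read. read_tail is a hypothetical helper name, and it uses the built-in seek/tell/read on a plain seekable filehandle instead of the method calls above, so treat it as an illustration and not as the module's code.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDF::API2): return up to $max bytes
# from the end of an already-open, seekable filehandle.
sub read_tail {
    my ($fh, $max) = @_;
    seek($fh, 0, 2) or die "seek to end failed: $!";    # 2 = SEEK_END
    my $end = tell($fh);                                 # file size in bytes
    my $sz  = $end < $max ? $end : $max;                 # clamp for small files
    seek($fh, $end - $sz, 0) or die "seek failed: $!";   # 0 = SEEK_SET
    my $buf = '';
    my $got = read($fh, $buf, $sz);
    die "read failed: $!" unless defined $got;
    return $buf;
}
```

Calling read_tail($fh, 1024) on a 3-byte file returns all 3 bytes, and on a larger file returns exactly the last 1024, so the seek offset never goes negative.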
Here's a diff:
diff File.pm /usr/lib/perl5/site_perl/5.8.8/PDF/API2/Basic/PDF/File.pm
308d307
< my $sz = $end < 1024 ? $end : 1024;
311,312c310,311
< $fh->seek($end - $sz, 0);
< $fh->read($buf, $sz);
---
> $fh->seek($end - 64, 0);
> $fh->read($buf, 64);
BugBear (who once wrote a commercial PostScript RIP at Hyphen)