Subject: Problem with stream length from Text::PDF::TTFont
Hi
The text below comes from a bug report I emailed to Martin some
time ago, but apparently I failed to log it in RT. I've also
raised a pull request with the same patch here:
https://github.com/silnrsi/text-pdf/pull/1
I have recently encountered a problem using Font::TTF via Text::PDF.
The problem I've seen is that a number of PDF readers (including
Ghostscript, Evince and Adobe Reader) complain that some files we
produce contain an invalid stream 'Length' value. This patch to the
newline handling is the change we put in place to work around the
problem:
=======================================================
--- lib/Text/PDF/Dict.pm 2006-03-17 22:39:17.000000000 +1300
+++ lib/Text/PDF/Dict.pm 2015-05-30 08:36:15.324799534 +1200
@@ -164,7 +164,7 @@
$pdf->out_obj($self->{'Length'}) if ($self->{'Length'}->is_obj($pdf));
}
}
- $fh->print("\n") unless ($str =~ m/$cr$/o);
+ $fh->print("\n");
$fh->print("endstream");
# $self->{'Length'}->outobjdeep($fh);
} elsif (defined $self->{' streamfile'})
=======================================================
The problem seems to occur when the data of a stream object ends with a
0x0D byte (and possibly a 0x0A byte also). When the stream has been
passed through the deflate filter, its bytes are for all practical
purposes random binary data, and CR and LF are no more or less likely to
occur than any other byte. The 'endstream' delimiter should appear at
the start of a new line, but when the final byte of the stream is 0x0D
the $cr regex matches that data byte and so the "\n" is not inserted.
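To illustrate the case described above, here is a minimal standalone sketch. It does not use Text::PDF at all: the $cr pattern is my assumption of what the module matches (CR, LF or CRLF), and stream_trailer() is a hypothetical helper that only mimics the old and patched print logic. It relies on the fact that zlib output ends with the Adler-32 checksum of the input, so a one-byte input of 0x0C should give deflated data whose last byte is 0x0D.

```perl
use strict;
use warnings;
use Compress::Zlib;    # core module; provides compress()

# Assumption: a stand-in for the module's $cr pattern (CR, LF or CRLF).
my $cr = '(?:\015|\012|(?:\015\012))';

# Hypothetical helper: what gets written straight after the stream data,
# under the old behaviour (0) and the patched behaviour (1).
sub stream_trailer {
    my ($str, $always_newline) = @_;
    my $eol = ($always_newline || $str !~ m/$cr$/) ? "\n" : '';
    return $eol . "endstream";
}

# Zlib output ends with the Adler-32 checksum of the input, so this
# one-byte input yields compressed data whose final byte is 0x0D.
my $deflated = compress("\x0C");
my $last     = sprintf '0x%02X', ord substr($deflated, -1);

print "final byte of deflated data: $last\n";
print "old trailer starts with newline: ",
    (stream_trailer($deflated, 0) =~ /^\n/ ? "yes" : "no"), "\n";
print "patched trailer starts with newline: ",
    (stream_trailer($deflated, 1) =~ /^\n/ ? "yes" : "no"), "\n";
```

With the old logic the trailing 0x0D of the compressed data is mistaken for an end-of-line, so "endstream" follows the data directly; with the patch a "\n" is always written first.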
My understanding of the PDF file format is very limited, so I'm not sure
whether always adding the "\n" before "endstream" is likely to have any
adverse effects. It might be better to always add the "\n" after the
output from any filter that can produce binary bytes. Alternatively, the
right answer might be to adjust the calculated 'Length' in the case where
the newline character was not added. We've been running with the above
code change for over 18 months and have not identified any adverse
effects.
I've attached a tar file that includes:
* the above patch
* a test script that uses Text::PDF to recreate the problem (on my system)
* a PDF file produced from that script
Regards
Grant McLean
Subject: | endstream-files.tar.gz |