
This queue is for tickets about the Text-PDF CPAN distribution.

Report information
The Basics
Id: 120400
Status: new
Priority: 0/
Queue: Text-PDF

People
Owner: Nobody in particular
Requestors: 'spro^^*%*^6ut# [...] &$%*c
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Can’t handle newlines in references
In PDF syntax, an indirect reference consists of three distinct tokens that can be separated by any PDF whitespace, and even by comments. For example, this is a syntactically valid indirect reference:

    1 %eieio
    0 R

Text::PDF does not allow comments at all (based on my reading of the code; that is not a problem for my PDFs). But it does choke on newlines if the object is long enough that not all of it has been read into the buffer yet. This happens with:

    1895 0 obj<</Count 253/Kids[1896 0 R 1 0 R 7 0 R 13 0 R ...

etc., with 253 entries.

Text::PDF::File::readval needs to read more data if it finds what could be a partial reference.
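To illustrate the "read more data if it could be a partial reference" idea, here is a minimal sketch in Python rather than Perl. It is not Text::PDF's code or API: the names `could_be_partial_reference` and `read_reference`, the `chunk_size` parameter, and the regexes are all hypothetical. It assumes the buffer starts at the value being parsed, and, like Text::PDF, it does not handle comments between tokens.

```python
import io
import re

# The six PDF whitespace characters: NUL, tab, LF, FF, CR, space.
WS = rb"[\x00\t\n\x0c\r ]"

# Matches a buffer that is a strict prefix of "<int> <ws> <int> <ws> R":
# the object number, optionally followed by whitespace, optionally followed
# by the generation number and more whitespace, and then end of buffer.
PARTIAL_REF = re.compile(rb"\d+(?:" + WS + rb"+(?:\d+(?:" + WS + rb"+)?)?)?\Z")

# Matches a complete reference at the start of the buffer.
COMPLETE_REF = re.compile(rb"(\d+)" + WS + rb"+(\d+)" + WS + rb"+R")


def could_be_partial_reference(buf):
    """Return True if buf ends before we can tell whether it is a reference."""
    return PARTIAL_REF.match(buf) is not None


def read_reference(stream, buf, chunk_size=4):
    """Hypothetical helper: top up buf from stream until the value is decidable.

    Returns ((object_number, generation), remaining_buffer) for a reference,
    or (None, buffer) if the value turns out to be something else.
    """
    while could_be_partial_reference(buf):
        more = stream.read(chunk_size)
        if not more:  # end of file: nothing more to read
            break
        buf += more
    match = COMPLETE_REF.match(buf)
    if match is None:
        return None, buf
    return (int(match.group(1)), int(match.group(2))), buf[match.end():]


if __name__ == "__main__":
    # The buffer ends right after the object number; the rest, starting
    # with a newline, is still unread in the stream.
    stream = io.BytesIO(b"\n0 R 7 0 R")
    print(read_reference(stream, b"1896"))
```

A buffer that ends in a bare integer is ambiguous: it may be a plain number or the start of a reference. Reading more is the safe choice in that case, unless the file has ended.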
Subject: open_7avjz48f.txt
--- /Users/sprout/.cpan/build/Text-PDF-0.31-rH_fyS/lib/Text/PDF/File.pm 2016-08-16 08:01:48.000000000 -0700
+++ lib/Text/PDF/File.pm 2017-02-26 14:54:42.000000000 -0800
@@ -1080,10 +1080,10 @@
             { $xlist->{$xmin++} = [$1, $2, $3]; }
         }
 
-        if ($buf !~ /^trailer$cr/oi)
+        if ($buf !~ /^trailer$ws_char*/oi)
         { die "Malformed trailer in PDF file $self->{' fname'} at " . ($fh->tell - length($buf)); }
 
-        $buf =~ s/^trailer$cr//oi;
+        $buf =~ s/^trailer$ws_char*//oi;
 
         ($tdict, $buf) = $self->readval($buf);
         $tdict->{' loc'} = $xpos;