It looks like I'm failing to handle PDF comments in several places.
This quick workaround would probably work. It replaces comments with
an equal number of spaces, so the change won't disrupt the internal
byte indexing of the document.
perl -i -pe 's/^([ \t]*%(?!PDF-|%EOF)[^\r\n]*)/" " x length($1)/egm' result.pdf
(but I have not tested it against a real PDF document)
In the longer term, I need to update my parser to allow comments.
Chris
On Sep 19, 2009, at 11:17 AM, noreply via RT wrote:
> Sat Sep 19 12:17:40 2009: Request 49839 was acted upon.
> Transaction: Ticket created by noreply
> Queue: CAM-PDF
> Subject: Unrecognized type in parseAny: errors with %PDF-1.3 ReportLab Generated PDF document
> Broken in: 1.52
> Severity: Normal
> Owner: Nobody
> Requestors:
> Status: new
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=49839 >
>
>
> result.pdf was generated by clicking the button at http://www.xhtml2pdf.com/demo
>
>
> Then try
>
> $ readpdf -v result.pdf >2
> Unrecognized type in parseAny:
> 10 % ReportLab generated PDF document -- di...
>
> That is caused by the comment after /ID
>
> $ diff -ruN result-orig.pdf result.pdf
> --- result-orig.pdf 2009-09-19 09:02:44.484375000 -0700
> +++ result.pdf 2009-09-19 09:02:52.859375000 -0700
> @@ -322,10 +322,7 @@
> 0000015178 00000 n
> 0000016930 00000 n
> trailer
> -<< /ID
> - % ReportLab generated PDF document -- digest (http://www.reportlab.com)
> - [(1\251Bl% T\255\365\021\367O\201\314\2163) (1\251Bl% T\255\365\021\367O\201\314\2163)]
> -
> +<< /ID [(1\251Bl% T\255\365\021\367O\201\314\2163) (1\251Bl% T\255\365\021\367O\201\314\2163)]
> /Info 14 0 R
> /Root 13 0 R
> /Size 27 >>
>
>
> Now trying again
>
> $ readpdf -v result.pdf >2
> Unrecognized type in parseAny:
> 0 % Document Root^M
> << /Outlines 15 0 R^M
> /...
>
>
> The relevant part of the file I think is
>
> % 'R13': class PDFCatalog
> 13 0 obj
> % Document Root
> << /Outlines 15 0 R
> /PageMode /UseNone
> /Pages 24 0 R
> /Type /Catalog >>
> endobj
>
> I don't know how to fix that; my naive attempts result in errors like
>
> substr outside of string at ...lib/CAM/PDF.pm line 716.
> Use of uninitialized value $content[1] in join or string at ...lib/CAM/PDF.pm line 718.
> Expected object open tag
> 0 ref^M
> 0 27^M
> 0000000000 65535 f^M
> 000000011...
>
>
> Thanks
> Also, it would be nice to have a named destination example
>
> <result-orig.pdf><result.pdf>