
This queue is for tickets about the CAM-PDF CPAN distribution.

Report information
The Basics
Id: 86863
Status: open
Priority: 0/
Queue: CAM-PDF

People
Owner: Nobody in particular
Requestors: p-fbsd-bugs [...] ziemba.us
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: CAM::PDF 1.58 "Expected object open tag"
Date: Thu, 11 Jul 2013 12:09:54 -0700
To: bug-CAM-PDF [...] rt.cpan.org
From: "G. Paul Ziemba" <p-fbsd-bugs [...] ziemba.us>
Thanks in advance for any suggestions:

code:
    use CAM::PDF; # 1.58
    $pdf = CAM::PDF->new($fn, undef, undef, {fault_tolerant => 1});

output:
    Expected object open tag
    0 ^M
    xref^M2 15 ^M0000000016 00000 n^M
    0000000...

PDF file is attached. It was produced on Win2K using ABBYY FineReader
(see meta-info in file). I'm running CAM::PDF 1.58 on FreeBSD 9.X.
Acroread 8.1.7 and xpdf 3.03 display the file OK.

For what it's worth, I tried fiddling with PDF.pm per item 3 of bug
78727 and found that if I added two blocks of testing for \n or \r,
I stopped getting the "Expected object open tag" error message, but
that the resulting value from $pdf->getPageText had extra spaces
embedded. I'm clearly missing something...

    sub _buildxref
    {
    .
    .
    .
       my $trailer;
       # BEGIN gpz - added per https://rt.cpan.org/Public/Bug/Display.html?id=78727
       my $stuff=substr $self->{content}, $startxref, 1;
       if($stuff eq "\n" || $stuff eq "\r")
       {
          $startxref++;
          print __LINE__," $self->{pdfversion} - position was off a little\n";
       }
       $stuff=substr $self->{content}, $startxref, 1;
       if($stuff eq "\n" || $stuff eq "\r")
       {
          $startxref++;
          print __LINE__," $self->{pdfversion} - position was off a little\n";
       }
       # END - added per https://rt.cpan.org/Public/Bug/Display.html?id=78727

--
G. Paul Ziemba
FreeBSD unix:
11:51AM up 85 days, 23:09, 1 user, load averages: 1.00, 0.81, 0.81
Download 2013_07_11_11_29_25_OCR.pdf
application/pdf 30.5k

Message body not shown because it is not plain text.

RT-Send-CC: cdolan [...] cpan.org
On Thu Jul 11 15:10:12 2013, p-fbsd-bugs@ziemba.us wrote:
> Thanks in advance for any suggestions:
>
> code:
>     use CAM::PDF; # 1.58
>     $pdf = CAM::PDF->new($fn, undef, undef, {fault_tolerant => 1});
>
> output:
>     Expected object open tag
>     0 ^M
>     xref^M2 15 ^M0000000016 00000 n^M
>     0000000...
>
> PDF file is attached. It was produced on Win2K using ABBYY FineReader
> (see meta-info in file). I'm running CAM::PDF 1.58 on FreeBSD 9.X.
> Acroread 8.1.7 and xpdf 3.03 display the file OK.
>
> For what it's worth, I tried fiddling with PDF.pm per item 3 of bug
> 78727 and found that if I added two blocks of testing for \n or \r,
> I stopped getting the "Expected object open tag" error message, but
> that the resulting value from $pdf->getPageText text had extra spaces
> embedded. I'm clearly missing something...
>
>     sub _buildxref
>     {
>     .
>     .
>     .
>        my $trailer;
>        # BEGIN gpz - added per https://rt.cpan.org/Public/Bug/Display.html?id=78727
>        my $stuff=substr $self->{content}, $startxref, 1;
>        if($stuff eq "\n" || $stuff eq "\r")
>        {
>           $startxref++;
>           print __LINE__," $self->{pdfversion} - position was off a little\n";
>        }
>        $stuff=substr $self->{content}, $startxref, 1;
>        if($stuff eq "\n" || $stuff eq "\r")
>        {
>           $startxref++;
>           print __LINE__," $self->{pdfversion} - position was off a little\n";
>        }
>        # END - added per https://rt.cpan.org/Public/Bug/Display.html?id=78727
I don’t know why that did not work for you, but the attached simpler patch did work for me. I am also using ABBYY FineReader.
Subject: open_M4pDttYw.txt
--- /Library/Perl/5.12/CAM/PDF.pm	2017-02-09 18:01:04.000000000 -0800
+++ CAM/PDF.pm	2017-02-21 07:51:25.000000000 -0800
@@ -572,9 +572,10 @@
    my $objstreamrefs = shift;
 
    my $trailer;
-   if ('xref' eq substr $self->{content}, $startxref, 4)
+   if ((substr $self->{content}, $startxref, 50) =~ /^(\s*)xref/)
    {
-      $trailer = $self->_buildxref_pdf14($startxref, $index, $versions);
+      $trailer = $self->_buildxref_pdf14($startxref + length $1, $index,
+                                         $versions);
       if ($trailer && exists $trailer->{XRefStm})
       {
          if (!$self->_buildxref_pdf15($trailer->{XRefStm}->{value}, $index, $versions, $objstreamrefs))
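
For anyone who wants to try the idea behind this patch outside of CAM::PDF, here is a minimal standalone sketch. It assumes the only problem is a startxref offset that lands on whitespace just before the "xref" keyword. The helper name find_xref_offset and the sample string are made up for illustration and are not part of CAM::PDF.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of CAM::PDF: look at up to 50 bytes starting
# at $startxref, skip any leading whitespace, and return the offset of the
# "xref" keyword. Returns undef if "xref" is not found there.
sub find_xref_offset {
    my ($content, $startxref) = @_;
    my $window = substr $content, $startxref, 50;
    if ($window =~ /^(\s*)xref/) {
        return $startxref + length($1);
    }
    return undef;
}

# Made-up sample data: the recorded offset points at a stray CR before "xref",
# similar to the "0 ^M xref^M2 15 ^M..." output in the report.
my $content   = "%PDF-1.4\r0 \r\rxref\r2 15 \r0000000016 00000 n\r";
my $startxref = index($content, "\rxref");
my $fixed     = find_xref_offset($content, $startxref);

print "recorded offset: $startxref\n";
print "adjusted offset: ", (defined $fixed ? $fixed : 'not found'), "\n";
```

Unlike testing one character at a time for \n or \r, the regex skips any run of whitespace in one step and leaves the offset alone when "xref" is already in the right place.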
On Thu Jul 11 15:10:12 2013, p-fbsd-bugs@ziemba.us wrote:
> Thanks in advance for any suggestions:
>
> code:
>     use CAM::PDF; # 1.58
>     $pdf = CAM::PDF->new($fn, undef, undef, {fault_tolerant => 1});
>
> output:
>     Expected object open tag
>     0 ^M
>     xref^M2 15 ^M0000000016 00000 n^M
>     0000000...
>
> PDF file is attached. It was produced on Win2K using ABBYY FineReader
> (see meta-info in file). I'm running CAM::PDF 1.58 on FreeBSD 9.X.
> Acroread 8.1.7 and xpdf 3.03 display the file OK.
At the risk of self-promotion, I will point out that reading and dumping the PDF file with PDF::Tiny makes it compatible with CAM::PDF:

    $ perl -MPDF::Tiny -e 'PDF::Tiny->new("old.pdf")->print(filename => "new.pdf")'