
This queue is for tickets about the PDF-API2 CPAN distribution.

Report information
The Basics
Id: 131223
Status: stalled
Priority: 0/
Queue: PDF-API2

People
Owner: Nobody in particular
Requestors: welleozean [...] googlemail.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: corrupted PDF generated
Date: Mon, 23 Dec 2019 20:23:27 +0100
To: bug-PDF-API2 [...] rt.cpan.org
From: welle ozean <welleozean [...] googlemail.com>
On Windows 10 running the latest PDF::API2 generates corrupted files:

    use strict;
    use warnings;
    use PDF::API2;
    use PDF::API2::Annotation;
    use PDF::API2::Basic::PDF::Utils;

    my $pdf = PDF::API2->open('C:\\Users\\WC\\Desktop\\original.pdf');
    my $page = $pdf->openpage(1);
    my $sticky = $page->annotation;
    $sticky->text(
        'Text in pop-up window',
        -rect => [ 100, 500, 100, 500 ],
        -open => 1
    );
    $sticky->{C} = PDFArray( map PDFNum( $_ ), 1, 0.65, 0 );
    $pdf->saveas('C:\\Users\\WC\\Desktop\\target.pdf');

For what it matters, also simply opening the file and saveas without any operation in between generates a corrupted file.

With corrupt I mean the latest Adobe reader is not able to open it (Error 14)
I just tried your code example, and it worked fine for me. The only change was to switch original.pdf to a local known-good PDF that I had lying around. By current PDF::API2, do you mean 2.036? Your original.pdf is known to be good (load into reader with no error messages, no offer to save it when quitting the reader)? I'm using Adobe Acrobat Reader DC (I think it lives in the Cloud) 19.021.20061, which I just updated yesterday, on Windows 10.

Anyway, do you still get this corruption with a variety of other PDFs?
Subject: Re: [rt.cpan.org #131223] corrupted PDF generated
Date: Tue, 24 Dec 2019 15:42:45 +0100
To: bug-PDF-API2 [...] rt.cpan.org
From: welle ozean <welleozean [...] googlemail.com>
These are my specs:

Windows 10
Perl 5.28.1
PDF::API2 2.036
Adobe Acrobat Reader DC 19.021.20061

All my PDFs can be easily opened in Adobe with no error message. I extended my tests. All my files have been edited, probably using FoxyReader. All the files present the same issue after running my script (the original file, as said, can be opened with no issue). Other files downloaded from the Web for test purposes can be opened fine also after running the script.

At this link, you can find a file that fails: https://filebin.net/2rp3p3xua17twwe1/making_sense_of_NMT.pdf?t=ureuhq16
Two problems:

1. Your PDF is version 1.5, which is likely to cause problems with PDF::API2. It may have structures or data that PDF::API2 has no idea how to handle.

2. It starts at page 291 and runs to 309 (19 pages). I can't get to any page before 291. It /looks/ like a complete article, but I've never seen this kind of behavior before.

I tried the same code and PDF file with PDF::Builder, and it seems to work (didn't blow up, at least). PDF::Builder is a /little/ more forgiving of post-1.4 items, but not knowing what PDF::API2 is choking on, I can't guarantee that PDF::Builder is working properly. Anyway, you might want to try PDF::Builder (it can be installed alongside PDF::API2) and see if it works for you.
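For anyone wanting to check which PDF version a file declares before running it through PDF::API2, here is a minimal sketch in core Perl. The helper name pdf_header_version is hypothetical and not part of either module. It only reads the "%PDF-x.y" header near the start of the file. A document catalog can declare a different version, so treat the result as a hint rather than the final word.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDF::API2 or PDF::Builder):
# return the version declared in the "%PDF-x.y" header, or undef if absent.
sub pdf_header_version {
    my ($file) = @_;

    # Read raw bytes so that no newline or encoding translation is applied.
    open(my $fh, '<:raw', $file) or die "Cannot open $file: $!";
    my $head = '';
    read($fh, $head, 1024);    # the header sits at or near the start of the file
    close($fh);

    return $head =~ /%PDF-(\d+\.\d+)/ ? $1 : undef;
}
```

A file reporting 1.5 or later may contain features such as cross-reference streams or object streams, which were introduced in PDF 1.5.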
Subject: Re: [rt.cpan.org #131223] corrupted PDF generated
Date: Fri, 3 Jan 2020 14:18:33 +0100
To: bug-PDF-API2 [...] rt.cpan.org
From: welle ozean <welleozean [...] googlemail.com>
Thank you for your feedback. I was able to annotate the same PDF with PDF::Builder, so for this task on similar PDFs, I will use the suggested module.
It's good to hear that you have a way forward to do your work. It still would be nice to figure out what's going wrong with PDF::API2 so it could be fixed. Something I didn't mention before is that PDF::Builder also had extensive rewrites of the Annotation functionality, so it's possible that the difference is in the Annotation code rather than in PDF 1.5+ handling.
Not having a test case (the filebin link no longer works), I'm going to guess from your description that the original PDF has a cross-reference stream in it. PDF::API2 can read those as of 2.026, but can't yet write them. See RT #117184. You can work around the issue by creating a new PDF and importing the pages from the original file into the new one.
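The workaround described above can be sketched roughly as follows. This is an illustration, not a tested fix for this ticket: the helper name copy_pages_to_new_pdf and the file names in the usage note are hypothetical, and it uses the pages() and importpage() method names as found in the 2.036-era PDF::API2 (newer releases reportedly document the same operations as page_count and import_page).

```perl
use strict;
use warnings;
use PDF::API2;

# Hypothetical helper: copy every page of $source_file into a freshly
# created PDF and save it as $target_file. Returns the number of pages copied.
sub copy_pages_to_new_pdf {
    my ($source_file, $target_file) = @_;

    my $source = PDF::API2->open($source_file);
    my $target = PDF::API2->new();

    my $count = $source->pages();
    for my $number (1 .. $count) {
        # With no target page number given, the imported page is appended.
        $target->importpage($source, $number);
    }

    # The new file gets a structure written from scratch by PDF::API2
    # instead of an update to the original file's structure.
    $target->saveas($target_file);
    return $count;
}
```

Usage would be along the lines of copy_pages_to_new_pdf('original.pdf', 'target.pdf'), after which annotations can be added to the pages of the new file. Document-level items such as outlines or metadata may not be carried over by a page import, so check the result against the original.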