
This queue is for tickets about the PDF-Burst CPAN distribution.

Report information
The Basics
Id: 46351
Status: resolved
Worked: 3 hours (180 min)
Priority: 0/
Queue: PDF-Burst

People
Owner: leocharre [...] cpan.org
Requestors: ahernit [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Critical
Broken in: 1.16
Fixed in: 1.19



Subject: Wrong count of pages
Hello,

the number of pages is counted wrongly if the folder already contains burst pages from an earlier burst, e.g.:

orgcp_page_0001.pdf
orgcp_page_0002.pdf
orgcp_page_0003.pdf
orgcp_page_0004.pdf

But the last PDF I split had only 2 pages.
Subject: Re: [rt.cpan.org #46351] Wrong count of pages
Date: Wed, 27 May 2009 09:03:21 -0400
To: bug-PDF-Burst [...] rt.cpan.org
From: leo charre <leocharre [...] gmail.com>
I need a little more information.

First off, PDF::Burst does not count pages for you. There are various burst methods, and some of them make a page-count type of call. For example:

a) The CAM_PDF burst method asks a CAM::PDF object for a page count.
b) The PDF_API2 burst method asks the PDF::API2 object for a page count.
c) The pdftk burst method reads the directory for files that look like they came from the original document in question.

I suspect the problem is that you are using method c), that is, you are calling PDF::Burst::pdf_burst_pdftk() either directly or by setting $PDF::Burst::BURST_METHOD to 'pdftk'.

pdftk does not tell you how many pages there are in a PDF when you call a burst method. We could manually open the doc_data.txt file generated by pdftk, which lists how many pages there are in the document, and return only that many files. That is, if doc_data.txt says there are 2 pages, we return only orgcp_page_0001.pdf and orgcp_page_0002.pdf.

However, this would be a HACK. I suggest you are doing something wrong on your end. Why are you expanding more than one file with the same name into the same directory? It's like saying:

  cp doc1.pdf /tmp/doc.pdf
  pdfburst /tmp/doc.pdf
  cp doc2.pdf /tmp/doc.pdf
  pdfburst /tmp/doc.pdf

Don't do that. Clean up after yourself. Delete the files, call them something else, set up a temporary staging area, or provide a randomly generated groupname. See:

http://search.cpan.org/~leocharre/PDF-Burst-1.17/lib/PDF/Burst.pod#pdf_burst()

"...ged. Optional arguments are the 'groupname', and the abs location (dir) you want to output the files t..."

Am I understanding the problem correctly? Does this help? If not, I want to consider implementing the HACK stated above. Let me know.

(Thank you for notifying me of the problem you are having. I'll do whatever I can to make this work properly.)

On 5/25/09, Andreas Hernitscheck via RT <bug-PDF-Burst@rt.cpan.org> wrote:
> Mon May 25 09:59:25 2009: Request 46351 was acted upon.
> Transaction: Ticket created by AHERNIT
> Queue: PDF-Burst
> Subject: Wrong count of pages
> Broken in: 1.16
> Severity: Critical
> Owner: Nobody
> Requestors: ahernit@cpan.org
> Status: new
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=46351 >
>
> Hello,
>
> here it counts wrong the number of pages if in the folder were already
> old burst pages from an older burst.
>
> e.g.:
>
> orgcp_page_0001.pdf
> orgcp_page_0002.pdf
> orgcp_page_0003.pdf
> orgcp_page_0004.pdf
>
> But the last splited pdf only had 2 pages.
-- Leo Charre
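The "temporary staging area" suggestion above could look roughly like the following. This is only a sketch: `stage_pdf` is a hypothetical helper name, not part of PDF::Burst, and it assumes a `mktemp` that supports `-d`. Each PDF is copied into its own fresh directory, so pages left over from an earlier burst can never sit next to the new ones.

```shell
# Hypothetical helper: copy one PDF into a freshly created directory
# and print the path of the copy, so every burst starts from a clean directory.
stage_pdf() {
    src=$1
    stage=$(mktemp -d) || return 1      # new, empty directory on every call
    cp "$src" "$stage/" || return 1
    printf '%s\n' "$stage/$(basename "$src")"
}

# Usage, following the commands quoted above (requires pdfburst to be installed):
#   staged=$(stage_pdf doc1.pdf) && pdfburst "$staged"
#   staged=$(stage_pdf doc2.pdf) && pdfburst "$staged"
```

Because every call gets its own directory, two documents with the same file name no longer share output files. Removing the staging directory once the pages have been processed is left to the caller.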
Hi,

yes, it's true, I am using pdftk, and the problem is caused by exactly the reason you describe.

But IMHO your lib should return a correct value independent of my script, simply because that is what one expects from this method.

I split several PDFs in a loop, which is the reason for this effect. Of course, I now clean up the folder beforehand, but as I said, the method should return a valid value. At the least, please consider adding this information to your documentation.

Maybe a trick like copying the file to a new name with a random number would do it. Or something that compares file attributes?
Subject: Re: [rt.cpan.org #46351] Wrong count of pages
Date: Wed, 27 May 2009 14:43:44 -0400
To: bug-PDF-Burst [...] rt.cpan.org
From: leo charre <leocharre [...] gmail.com>
I want to make sure to clear up what pdf_burst() does: it returns an array of abs paths to files that each represent a page of your original document. It does not return a value that represents how many pages were burst from your original PDFs.

That said, the error happens under these conditions: you have PDF::Burst configured to burst with pdftk; you call burst for /tmp/doc1.pdf with 20 pages, then overwrite /tmp/doc1.pdf with another document that has 10 pages and call pdf burst again. You get 20 pages in the array.

I think it's horrid practice to do this in the first place. Why are you bursting a PDF into documents and not doing something *with* the files, or deleting them afterwards, or giving them another 'group name'? Beats me.

*That* said, yes, this is a bug. A fix has been introduced.

The fix will attempt to read the doc_data.txt file that pdftk creates when you burst (it is written into the cwd, which is altered and then set back). It should contain a page count. If a page count exists, we compare and warn accordingly; furthermore, if we get extra pages, we prune the end to match. This should fix the OP's concern.

The new release is PDF-Burst-1.19:
http://cpansearch.perl.org/src/LEOCHARRE/PDF-Burst-1.19/Changes

It should appear on CPAN after 3pm Eastern time on 2009/05/27.

On 5/27/09, Andreas Hernitscheck via RT <bug-PDF-Burst@rt.cpan.org> wrote:
> Queue: PDF-Burst
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=46351 >
>
> Hi,
>
> yes its true, I am using pdftk.
>
> And it it exactly caused by the reason you describe.
>
> But IMHO your lib should return a correct value independent from my
> script, because by the simple reason you expect it from this method.
>
> I split several PDFs in a loop, that is the reason for this effect. Of
> course, now I clean up the folder before, but like I said, this script
> should return a valid value. So far consider to add this info to your doc.
>
> maybe doing a trick to copy the file to a new name with a random number
> would do it. Or something by comparing file attributes?
-- Leo Charre
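The idea behind the fix described above can be sketched outside the module as follows. This is not the PDF::Burst 1.19 code, only an illustration under stated assumptions: that doc_data.txt contains a line of the form `NumberOfPages: N`, and that burst pages are named `<group>_page_NNNN.pdf` as in the original report. The function names `page_count` and `prune_pages` are hypothetical.

```shell
# Print the page count recorded in a pdftk doc_data.txt file.
# Assumption: the file has a line of the form "NumberOfPages: N".
page_count() {
    awk '$1 == "NumberOfPages:" { print $2; exit }' "$1"
}

# Print at most COUNT page files for GROUP found in DIR, in name order.
# Stale pages beyond COUNT are only left out of the list, not deleted.
prune_pages() {
    dir=$1; group=$2; count=$3; n=0
    for f in "$dir/${group}_page_"[0-9]*.pdf; do
        [ -e "$f" ] || continue          # the glob matched nothing
        n=$((n + 1))
        [ "$n" -le "$count" ] || break   # everything past COUNT is stale
        printf '%s\n' "$f"
    done
}

# Usage (paths are illustrative):
#   prune_pages /tmp orgcp "$(page_count /tmp/doc_data.txt)"
```

With the four files from the report and a doc_data.txt that records two pages, only orgcp_page_0001.pdf and orgcp_page_0002.pdf are listed. The sketch relies on the zero-padded page numbers so that glob order matches page order.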
Resolved in version PDF::Burst 1.19