
This queue is for tickets about the CAM-PDF CPAN distribution.

Report information
The Basics
Id: 69021
Status: resolved
Priority: 0/
Queue: CAM-PDF

People
Owner: Nobody in particular
Requestors: david [...] audacitas.com
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 1.54
Fixed in: (no value)



Subject: save() very slow for large files with many changes
CAM::PDF version 1.54
Perl 5.10.1
Ubuntu 10.04 LTS

I have a script which makes multiple additions to (guessing) every third page in a document with nearly 700 pages, totalling 17MB on disk. Sorry, I can't provide a test file as this is confidential customer information.

Making the changes was quick (as I've come to expect from CAM::PDF) but saving the document using preserveOrder() and cleanoutput() puzzlingly took over two minutes.

Analysis with NYTProf revealed that most of the time was spent on line 5003 in PDF.pm.

    $newxref{$key} = length $self->{content};

In my case, that was 139 seconds. This gets executed each time a changed object is written out.

I have no idea about Perl internals but it seems to me that something odd is lurking behind length(). Either it incurs a large-ish static penalty or, even more oddly, takes longer to find the length of longer strings.

I patched the loop to keep a separate offset counter and only use length() to find the size of the new objects. That works fine here; length() now only consumes 3.51ms and the complete save() 1.83s, which is just fine, so I didn't make any further attempts at optimisation.

I've attached my patch; hope this helps.

BTW: Thanks for CAM::PDF. I've tried Text::PDF, PDF::API2, PDF::Reuse and even a demo version of PDFlib, but I've settled on this module because it represents a good combination of speed, ability to get the job done, and openness. It hasn't let me down yet and it looks like it's being actively supported, too. Well done.

-- David
Subject: CAM-PDF_save_patch.diff
4999a5000
> my $offset = length $self->{content};
5003c5004
< $newxref{$key} = length $self->{content};
---
> $newxref{$key} = $offset;
5006c5007,5009
< $self->{content} .= $self->writeObject($key);
---
> my $obj = $self->writeObject($key);
> $self->{content} .= $obj;
> $offset += length $obj;
Thanks! Patch applied and uploaded to CPAN via CAM::PDF 1.55
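
For readers who want to see the idea behind the patch in isolation, here is a minimal standalone sketch. It is not CAM::PDF code: write_object() and append_objects() are hypothetical names invented for illustration, and the object body is a placeholder. It only shows the pattern the patch uses: measure the large buffer once, then advance a running offset by the length of each newly appended piece.

```perl
use strict;
use warnings;

# Hypothetical stand-in for CAM::PDF's writeObject(); it just returns
# some placeholder text for the given object number.
sub write_object {
    my ($key) = @_;
    return "$key 0 obj\n<< /Example true >>\nendobj\n";
}

# Append each object to $content and record where it starts, tracking the
# position in a counter instead of re-measuring the whole buffer each time.
# (Hypothetical helper, not part of CAM::PDF.)
sub append_objects {
    my ($content, @keys) = @_;
    my %xref;
    my $offset = length $content;    # measure the large buffer once
    for my $key (@keys) {
        $xref{$key} = $offset;       # offset at which this object starts
        my $obj = write_object($key);
        $content .= $obj;
        $offset += length $obj;      # only the small new piece is measured
    }
    return ($content, \%xref);
}

my ($content, $xref) = append_objects("%PDF-1.4\n", 1 .. 3);
for my $key (sort { $a <=> $b } keys %$xref) {
    print "object $key starts at offset $xref->{$key}\n";
}
```

The recorded offsets are the same ones that calling length on the full buffer before each append would give, so the change affects only how the position is computed, not the resulting cross-reference entries.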