Hi Steve, thanks for the quick and comprehensive reply. I've spent a
while trying your suggestions (comments below), but am unfortunately no
further forward.
At this point I should say that this is more of a nice-to-have than an
essential requirement, so if there are no quick wins for either of us
then I will be happy for you to close the ticket. Have a look at the
below if you get the time anyway, and let me know what you think.
> Take a look at my comments on ticket 113516. Currently, when
> PDF::API2 opens a file, it reads the whole thing into memory, but
> that wasn't always the case, and the code that PDF::API2 is built on
> top of doesn't require that everything be loaded in memory either.
Thanks. I don't *think* this particular information helps, as I am
writing out, not reading.
> It's theoretically possible for you to create a number of pages,
> write those out to disk, free up the memory, and repeat, without
> closing and reopening the file. If you want to start down that
> trail, look at PDF::API2->finishobjects() and follow the path for
> details about writing out a file in chunks.
>
> Freeing the memory without closing the file may be trickier (I
> haven't looked into that yet). I'm guessing it'll involve the
> release_obj() call in PDF::API2::Basic::PDF::File -- if I'm reading
> the code correctly, that will remove it from the various caches,
> but without actually removing it from the PDF. The release() call
> will almost definitely free the memory, but I think that's only
> supposed to be called when you're done with the file.
I've spent a while playing around with the above. I seem to be able to
write out a PDF in chunks, but whenever I try to do so along with calls
to free the memory, I run into problems. Calling finishobjects() on its
own doesn't seem to make any difference to memory use, and whenever I
combine it with something like save() or release_obj(), I get:
Can't call method "new_obj" on an undefined value
at /usr/share/perl5/PDF/API2/Basic/PDF/Pages.pm line 92
> If you get to a point where you can call finishobjects() more than
> once and get a working file, but are still running out of memory,
> let me know (preferably with sample code) and we can dive into that
> problem more deeply.
I should have said before that I am using PDF::TextBlock. I don't think
this affects the principle though, as I run into similar problems if I
remove it and write lots of text using raw calls.
Anyway, FWIW, here is a MWE:
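use strict;
use warnings;
use PDF::API2;
use PDF::TextBlock;

# Create the PDF with an output file set, so it can be written in chunks.
my $pdf = PDF::API2->new(-file => 'mypdf.pdf');

for my $count (1..100) {
    my $page = $pdf->page;
    my $tb = PDF::TextBlock->new({
        pdf  => $pdf,
        page => $page,
        x    => 100,
        y    => 100,
    });
    for my $count2 (1..20) {
        $tb->text("Text $count2");
        $tb->apply;
    }
    # Try to write out the objects created so far after each page.
    $pdf->finishobjects;
}

$pdf->save;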
> If that ends up being too complicated and you'd rather keep trying
> to speed up the ttfont calls, it should be possible to reuse the
> time-consuming part of that object's creation. It may be as simple
> as calling $new_pdf->{'pdf'}->new_obj($font_object_from_old_pdf)
> instead of $new_pdf->ttfont(...). That definitely wouldn't qualify
> as intended/supported behavior, but it might work.
Given the relatively modest potential gains, I've decided this is
probably best avoided!
Thanks again, and please do feel free to close this ticket if it all
looks like too much hassle.
Andy