Your results are strange, because the patches work for me, both for PDF::API2 and PDF::Builder. Either you did something wrong (please check again) or, perhaps, you are working with some internally modified version.
As to the questions:
1) The issue with "$result->{' outto'} = [ $self ];" was described in the OP. A file is either created anew or read from disk. In the former case the ' outto' property is initialized, i.e. assigned. In the latter case it is not (or maybe that is what should be fixed instead?). All nodes in the page tree seem to require this property in order to output themselves.
2) The "$result->{' realised'} = 0;" is useless, because in all conditionals where this key is tested it is never compared to zero, only checked for truth. So it's OK for the key to be non-existent. If you don't like that, then move the line a few lines higher, into the "unless" block (as the original author did in the "copy" subroutine, please have a look).
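A minimal sketch of the truth-test point, assuming the flag is only ever used in boolean context. The hashes and the needs_realising helper are hypothetical stand-ins, only the ' realised' key name comes from the module:

```perl
use strict;
use warnings;

# Hypothetical stand-ins for object hashes; not real PDF::API2 objects.
my %absent = ();                   # key never assigned
my %zeroed = (' realised' => 0);   # key explicitly set to 0
my %loaded = (' realised' => 1);   # key set after loading

# Mirrors a plain truth test on the flag (no comparison to zero).
sub needs_realising {
    my ($obj) = @_;
    return $obj->{' realised'} ? 0 : 1;
}

print "absent: ", needs_realising(\%absent), "\n";   # absent: 1
print "zeroed: ", needs_realising(\%zeroed), "\n";   # zeroed: 1
print "loaded: ", needs_realising(\%loaded), "\n";   # loaded: 0
```

A missing key and an explicit 0 behave the same under such a test, and reading the key does not create it.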
What I think is going on:
Suppose some object is referenced in 2 different parts of the PDF, e.g. as "6 0 R". When the 1st fragment is parsed, an Objind is created, but object #6 itself is not "realised", i.e. not yet found through the xref table and parsed from source. Then my program does something (call it X) that _does_ require #6 to be "realised". OK, so it gets realised. Later, for whatever reason, the 2nd fragment containing "6 0 R" is parsed. A new Objind isn't created, but the ' realised' flag is reset!!! Oops. Now, what if, between realising #6 and reading the 2nd fragment, I made changes to #6's representation in memory? So far, so good. But then I do something similar (or not similar, doesn't matter) to X, which checks whether #6 is "realised" and, since the flag has been reset, realises it again. The old version is _now_ parsed from source again, and my previous changes are lost.
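The scenario can be reproduced with a toy model. This is a sketch only, not the PDF::API2 code: %source plays the file on disk, %cache plays the table of already-seen objects, and parse_ref, realise and scenario are hypothetical names.

```perl
use strict;
use warnings;

my %source = (6 => 'original');   # what is stored "on disk"
my %cache;                        # objects already seen by the "parser"

# Called each time a reference such as "6 0 R" is parsed.
sub parse_ref {
    my ($num, $reset_flag) = @_;
    my $obj = $cache{$num} //= { num => $num };   # reuse existing object
    $obj->{' realised'} = 0 if $reset_flag;       # the line in question
    return $obj;
}

# Loads the object from "disk" unless it is flagged as already loaded.
sub realise {
    my ($obj) = @_;
    unless ($obj->{' realised'}) {
        $obj->{value} = $source{ $obj->{num} };
        $obj->{' realised'} = 1;
    }
    return $obj;
}

sub scenario {
    my ($reset_flag) = @_;
    %cache = ();
    my $obj = parse_ref(6, $reset_flag);   # 1st fragment is parsed
    realise($obj)->{value} = 'edited';     # X: realise, then change in memory
    parse_ref(6, $reset_flag);             # 2nd fragment mentions "6 0 R"
    return realise($obj)->{value};         # any later access
}

print "flag reset on every parse: ", scenario(1), "\n";   # original
print "flag left alone:           ", scenario(0), "\n";   # edited
```

With the reset in place the in-memory edit is silently replaced by the version from the source; without it the edit survives.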
3) The "sub realise" -- I think you misread it: the invocant of "read_obj" is not $self, so nothing strange is happening. An explicit return is the right thing to do, but (though that's not your fault) it can lead to ugly code that cannot possibly be correct, e.g. the subroutine "val" in the same module. It looks suspicious in the original, too. (As you see, I can't help but digress.)
4) Pages.pm -- About this one I was absolutely wrong, sorry if I have misled anyone. My patch was invalid; a new patch is attached. Thank you for making me try to understand that code yet again.
The two lines of code in question (both were modified in the invalid patch) look deceptively similar -- the same, in fact! But they do very different things. As you see, the 1st one is now left alone, because it does what the POD describes: if the position at which to insert a new page is too high, the page is inserted last.
OTOH, in the 2nd fragment (the only one touched by the suggested new patch) the loop directly above is guaranteed to trigger its "last", and the fragment is only reachable if (1) we are adding to an intermediate (not root) node, (2) that node is being over-filled (so a new container is added to the parent), (3) the new container is, strictly, either prepended ($index == 0) or appended ($index == -1). So the only thing I need to do here is check $index and increase the position in order to append (otherwise it would be "prepending").
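A small sketch of that position logic on a plain array, assuming a non-negative position means "insert before this slot". The sibling_position helper is hypothetical and is not part of Pages.pm; splice stands in for add_page_recurse:

```perl
use strict;
use warnings;

# Hypothetical helper: where the new container goes among the parent's kids.
sub sibling_position {
    my ($kids, $self, $index) = @_;
    my $pindex = 0;
    # $self is always one of the parent's kids, so this loop always finds it.
    $pindex++ while $pindex < @$kids && $kids->[$pindex] ne $self;
    $pindex++ if $index == -1;   # append: go after $self, not before it
    return $pindex;
}

my @kids = qw(A B C);

my @prepended = @kids;
splice @prepended, sibling_position(\@kids, 'B', 0), 0, 'new';

my @appended = @kids;
splice @appended, sibling_position(\@kids, 'B', -1), 0, 'new';

print "@prepended\n";   # A new B C
print "@appended\n";    # A B new C
```

Without the increment both cases would land in front of 'B', which is the "prepending" behaviour mentioned above.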
5) As to the issues with circular refs -- can't say anything for now. My "in-production" long-running script (it uses a modified CAM::PDF, which I suspect also leaks) asynchronously starts a new perl process for each PDF file, because I gave up on tracking memory; this way the memory is reclaimed by the OS, and with just thousands of PDF files per day this setup is OK.
6) The issue with LZWDecode/FlateDecode I'd rather withdraw, its probability in the real world being almost zero.
Can't agree on "curses on authors" :)
--- PDF\API2\Basic\PDF\Pages.old Fri Jul 7 04:53:59 2017
+++ PDF\API2\Basic\PDF\Pages.pm Sat Mar 30 01:14:48 2019
@@ -216,7 +216,7 @@
$ppnum = scalar $ppages->{'Kids'}->realise->elementsof;
for ($pindex = 0; $pindex < $ppnum; $pindex++)
{ last if ($ppages->{'Kids'}{' val'}[$pindex] eq $self); }
- $pindex = -1 if ($pindex == $ppnum);
+ $pindex ++ if $index == -1;
$ppages->add_page_recurse($newpages, $pindex);
}
}