Subject: | recursive buffer snafu |
Date: | Thu, 6 Mar 2014 11:35:47 +0000 |
To: | bug-Sereal-Encoder [...] rt.cpan.org |
From: | Zefram <zefram [...] fysh.org> |
Sereal::Encoder->encode isn't reentrant, for calls on a single encoder
object. This can be seen by recursively encoding using the same encoder
object from a FREEZE callback:
use Sereal::Encoder;
use Sereal::Decoder;
my $enc = Sereal::Encoder->new({freeze_callbacks=>1});
package Foo {
sub FREEZE { $enc->encode({%{$_[0]}}) }
sub THAW { bless(Sereal::Decoder->new->decode($_[2]), $_[0]) }
}
my $a = $enc->encode(bless({a=>1},"Foo"));
my $b = Sereal::Decoder->new->decode($a);
(Never mind that this is a somewhat silly thing to do in FREEZE
because it's allowed to return a data structure that will be implicitly
Serealised. This is just an example.) With the encoder thus shared,
the encoded data is corrupted, and decoding signals an error. If the
two ->encode calls use separate encoder objects, by duplicating the
"my $enc" line just above "my $a", it all works.
The underlying bug is that the srl_encoder_t structure doesn't distinguish
between data items that are logically a permanent part of the encoder
object (the configuration) and those that are ephemeral, being part of a
particular encoding operation (buffered output, current recursion depth,
seen-object table). The clean solution is for the encoding object to
contain only the configuration, and to instantiate the per-operation
variables once per operation, in a separate structure allocated on the
C stack.
I appreciate that you're trying to reuse the buffer for performance.
If you insist on that kind of reuse, you need to detect reentrancy
yourself and allocate separate variables specifically in the reentrant
case. If you do this, there's still no real reason to tie the variables
to the encoder structure: you can allocate the base variables in a totally
static way (or per Perl thread), and detect reentrancy regardless of
which encoding objects are being used.
I believe Sereal::Decoder has a similar reentrancy bug, but it doesn't
show up so readily. Whereas the encoder bug showed up with pretty much
the first thing I tried (after forming the suspicion that there might
be a fault of this nature), I don't have an example showing the decoder
misbehaving.
-zefram