Skip Menu |

Preferred bug tracker

Please visit the preferred bug tracker to report your issue.

This queue is for tickets about the Sereal-Decoder CPAN distribution.

Report information
The Basics
Id: 93563
Status: resolved
Priority: 0/
Queue: Sereal-Decoder

People
Owner: Nobody in particular
Requestors: zefram [...] fysh.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: recursive state snafu
Date: Thu, 6 Mar 2014 13:09:51 +0000
To: bug-Sereal-Decoder [...] rt.cpan.org
From: Zefram <zefram [...] fysh.org>
Sereal::Decoder->decode isn't reentrant, for calls on a single decoder object. This can be seen by recursively decoding using the same decoder object from a THAW callback, with data involving a backreference: use Sereal::Encoder; use Sereal::Decoder; my $dec = Sereal::Decoder->new; package Foo { sub FREEZE { Sereal::Encoder->new->encode({%{$_[0]}}) } sub THAW { bless($dec->decode($_[2]), $_[0]) } } my $z = [ bless({a=>1},"Foo") ]; push @$z, $z; my $a = Sereal::Encoder->new({freeze_callbacks=>1})->encode($z); my $b = $dec->decode($a); (Never mind that this FREEZE/THAW pair is somewhat silly. This is just an example.) With the decoder thus shared, the backreference tracking data gets clobbered, and a SEGV results. If the two ->decode calls use separate decoder objects, by duplicating the "my $dec" line just above "my $b", it all works. The underlying bug is that the srl_decoder_t structure doesn't distinguish between data items that are logically a permanent part of the decoder object (the configuration) and those that are ephemeral, being part of a particular decoding operation (buffered input, current recursion depth, seen-object table). The clean solution is for the decoding object to contain only the configuration, and to instantiate the per-operation variables once per operation, in a separate structure allocated on the C stack. As for Sereal::Encoder, you might possibly have some advantage in avoiding reallocating the variables for every call. Probably much less advantage than S:E gets, though, because the decoder doesn't allocate its own buffer. If there really is advantage in keeping the variables between decoding operations, you need to detect reentrancy yourself and allocate separate variables specifically in the reentrant case. If you do this, there's still no real reason to tie the variables to the decoder structure: you can allocate the base variables in a totally static way (or per Perl thread), and detect reentrancy regardless of which decoding objects are being used. -zefram
Confirmed. The decoder doesn't have the same reentrance-detection & lazy-cloning logic as the encoder. Having it jump through those hoops would be one way to address this problem (as far as I can tell so far). Won't get much further with this today, though.
I just implemented cloning behaviour for the decoder symmetric to the one in the encoder. This appears to fix this bug. The main change is only in a branch right now until it's received review from at least Yves: smueller/reentrant-decoder I'll put off releasing encoder and decoder distributions until this branch is merged (or another/better fix is found) since releasing one of them is 80% of the work of releasing the both of them. Thanks again for reporting the issue and sorry for causing work for you.
Just a release is pending. Will do as I churn through the other outstanding discussions in related RT tickets.