Subject: | Reference tracking does not handle temporaries causing random corruption |
Dist: Data-Dump-1.23
Perl v5.18.2 built for x86_64-linux-thread-multi
Uname: Linux 2.6.32-431.el6.x86_64 #1 SMP Sun Nov 10 22:19:54 EST 2013 x86_64 GNU/Linux
There is a feature in this distribution which attempts to track references but it does not play nicely with the filter feature which allows the user to replace visited objects with other objects that often have a temporary lifetime.
Also, the dump function itself forces creation of temporary scalar references for non-ref input which only has scope for the current function.
sub _dump
{
my $ref = ref $_[0];
my $rval = $ref ? $_[0] : \$_[0];
The reason this all causes problems is that the reference tracking logic stashes away the address of the reference in a map which later is used to identify cases where references are encountered multiple times.
Perl's garbage collection logic it seems is fairly efficient and tries to re-use objects which have already been allocated previously but no longer in use for later created objects.
So, what happens I think is that the 'seen' logic captures addresses of temporaries which are then de-allocated and resused for subsequent altogether different object. But the 'seen' logic thinks they are reference matches.
I've attached a script that will reproduce the problem. In this script I'm basically splitting a string into tokens and returning an array ref with the tokens as items. These are all 'temporaries'.
Here is the input data structure:
my $test = {
'one' => 'bg ha cc de af eg fa gw ik',
'two' => 'cl jq nw tk jm ro qd rr apg',
'three' => 'fw rs jy iu wt vv xx wz',
};
If you run the example on current versions without the fix you should see something like the following:
do {
my $a = {
one => ["af", "bg", "cc", "de", "eg", "fa", "gw", "ha", "ik"],
three => ['fix', 'fix', 'fix', 'fix', 'fix', 'fix', "wz", "xx"],
two => ['fix', 'fix', 'fix', 'fix', 'fix', "qd", 'fix', "rr", "tk"],
};
$a->{three}[0] = $a->{one}[3];
$a->{three}[1] = $a->{one}[4];
$a->{three}[2] = $a->{one}[5];
$a->{three}[3] = $a->{one}[6];
$a->{three}[4] = $a->{one}[7];
$a->{three}[5] = $a->{one}[8];
$a->{two}[0] = $a->{one}[6];
$a->{two}[1] = $a->{one}[7];
$a->{two}[2] = $a->{one}[8];
$a->{two}[3] = $a->{three}[6];
$a->{two}[4] = $a->{three}[7];
$a->{two}[6] = $a->{one}[0];
$a;
}
The correct output with the proposed fix is:
{
one => ["af", "bg", "cc", "de", "eg", "fa", "gw", "ha", "ik"],
three => ["fw", "iu", "jy", "rs", "vv", "wt", "wz", "xx"],
two => ["apg", "cl", "jm", "jq", "nw", "qd", "ro", "rr", "tk"],
}
The fix for this is really pretty simple. We just need to keep the temporary references alive so the objects are not destroyed and re-used. The easiest way to do that is to store the object references (which may be temporaries) in the seen map as one of the values.
Patch is attached, quick summary as follows for Dump.pm:181:
- $seen{$id} = [$name, $idx];
+ $seen{$id} = [$name, $idx, $rval];
Subject: | data-dump-bug.pl |
#!/usr/bin/env perl
use warnings;
use strict;
use lib 'tt/Data-Dump-1.23/lib';
use Data::Dump::Filtered qw(add_dump_filter);
use Data::Dump;
my $test = {
'one' => 'bg ha cc de af eg fa gw ik',
'two' => 'cl jq nw tk jm ro qd rr apg',
'three' => 'fw rs jy iu wt vv xx wz',
};
add_dump_filter( \&filter_callback);
dd($test);
sub filter_callback {
my($ctx, $object_ref) = @_;
if ($ctx->is_scalar) {
my $line = ${$object_ref};
my @lines = split(/\s+/, $line);
if (@lines > 1) {
return {object => [sort @lines]};
} else {
return {object => $lines[0]};
}
}
return undef;
}
Subject: | dump.patch |
--- Dump.pm 2016-05-26 11:58:12.680638000 -0400
+++ Dump.pm.new 2016-05-26 11:58:05.308304000 -0400
@@ -178,7 +178,7 @@
return "do{my \$fix}" if @$idx && $idx->[-1] eq '$';
return "'fix'";
}
- $seen{$id} = [$name, $idx];
+ $seen{$id} = [$name, $idx, $rval];
}
if ($class) {