Subject: | Unbounded recursion in bdecode |
The bdecode function in Bencode-1.31 will recurse as deeply as required
by the bencoded input string. This can cause "deep recursion" warnings
for trivially small bencoded input and can cause memory starvation with
longer inputs. If the bdecode function is to be used with untrusted
input (such as a user provided torrent file), this has to be resolved.
The attached patch adds a third parameter to bdecode called $max_depth.
The parameter is optional and follows the existing optional
$do_lenient_decode parameter in order to preserve the old (unsafe)
behaviour for unmodified client code. If this parameter is set to an
integer, bdecode will refuse to parse dictionaries nested deeper than
this level by croaking with the message "nested too deep at %s".
Patch includes documentation changes and additional tests.
To demonstrate the problem:
$ perl -We 'use Bencode qw(bdecode); my $depth = 99; bdecode(("d1:a" x
$depth)."0:".("e" x $depth));'; echo $?
Deep recursion on subroutine "Bencode::_bdecode_chunk" at Bencode.pm
line 97.
0
$ free -m
total used free shared buffers cached
Mem: 8000 5898 2102 0 56 3531
-/+ buffers/cache: 2309 5691
Swap: 19072 31 19041
$ perl -We 'use Bencode qw(bdecode); my $depth = 25000; bdecode(("d1:a"
x $depth)."0:".("e" x $depth));'; echo $?
Deep recursion on subroutine "Bencode::_bdecode_chunk" at Bencode.pm
line 97.
Segmentation fault
139
Using the patched Bencode.pm, and modifying to use the new $max_depth
parameter we instead get:
$ perl -We 'use Bencode qw(bdecode); my $depth = 99; bdecode(("d1:a" x
$depth)."0:".("e" x $depth), 0, 98);'; echo $?
nested too deep at 393 at -e line 1
255
System info:
$ uname -a
Linux vulcan 2.6.32-23-generic #37-Ubuntu SMP Fri Jun 11 08:03:28 UTC
2010 x86_64 GNU/Linux
$ perl -v
This is perl, v5.10.1 (*) built for x86_64-linux-gnu-thread-multi
Thanks for your work on this module.
Regards,
Day Barr
Subject: | Bencode.patch |
diff -r -u Bencode-1.31/lib/Bencode.pm Bencode-1.31-new/lib/Bencode.pm
--- Bencode-1.31/lib/Bencode.pm 2007-11-05 04:19:10.000000000 +0000
+++ Bencode-1.31-new/lib/Bencode.pm 2010-07-06 13:19:54.006405362 +0100
@@ -3,7 +3,7 @@
use Carp;
use Exporter;
-use vars qw( $VERSION @ISA @EXPORT_OK $DEBUG $do_lenient_decode );
+use vars qw( $VERSION @ISA @EXPORT_OK $DEBUG $do_lenient_decode $max_depth );
$VERSION = '1.31';
@@ -38,6 +38,8 @@
}
sub _bdecode_chunk {
+ my $depth = shift;
+ $depth++;
warn _msg 'decoding at %s' if $DEBUG;
if ( defined( my $str = _bdecode_string() ) ) {
@@ -54,15 +56,23 @@
}
elsif ( m/ \G l /xgc ) {
warn _msg 'LIST' if $DEBUG;
+
+ croak _msg 'nested too deep at %s'
+ if defined $max_depth and $depth > $max_depth;
+
my @list;
until ( m/ \G e /xgc ) {
warn _msg 'list not terminated at %s, looking for another element' if $DEBUG;
- push @list, _bdecode_chunk();
+ push @list, _bdecode_chunk($depth);
}
return \@list;
}
elsif ( m/ \G d /xgc ) {
warn _msg 'DICT' if $DEBUG;
+
+ croak _msg 'nested too deep at %s'
+ if defined $max_depth and $depth > $max_depth;
+
my $last_key;
my %hash;
until ( m/ \G e /xgc ) {
@@ -84,7 +94,7 @@
if m/ \G e /xgc;
$last_key = $key;
- $hash{ $key } = _bdecode_chunk();
+ $hash{ $key } = _bdecode_chunk($depth);
}
return \%hash;
}
@@ -96,7 +106,8 @@
sub bdecode {
local $_ = shift;
local $do_lenient_decode = shift;
- my $deserialised_data = _bdecode_chunk();
+ local $max_depth = shift;
+ my $deserialised_data = _bdecode_chunk(0);
croak _msg 'trailing garbage at %s' if $_ !~ m/ \G \z /xgc;
return $deserialised_data;
}
@@ -162,12 +173,14 @@
Croaks on unhandled data types.
-=head2 C<bdecode( $string [, $do_lenient_decode ] )>
+=head2 C<bdecode( $string [, $do_lenient_decode [, $max_depth ] ] )>
Takes a string and returns the corresponding deserialised data structure.
If you pass a true value for the second option, it will disregard the sort order of dict keys. This violation of the I<becode> format is somewhat common.
+If you pass an integer for the third option, it will refuse to parse dictionaries nested deeper than this level, by croaking with an appropriate message. Avoids "Deep recursion" warnings and potential out of memory situations resulting from maliciously crafted input data.
+
Croaks on malformed data.
=head1 DIAGNOSTICS
@@ -216,6 +229,10 @@
Your data contains a dictionary with an odd number of elements.
+=item C<nested too deep at %s>
+
+Your data contains dicts or lists that are nested deeper than the $max_depth passed to C<bdecode()>.
+
=item C<unhandled data type>
You are trying to serialise a data structure that consists of data types other than
Only in Bencode-1.31-new/lib: .Bencode.pm.swp
diff -r -u Bencode-1.31/t/01.bdecode.t Bencode-1.31-new/t/01.bdecode.t
--- Bencode-1.31/t/01.bdecode.t 2007-11-05 04:19:10.000000000 +0000
+++ Bencode-1.31-new/t/01.bdecode.t 2010-07-06 13:45:35.295230229 +0100
@@ -50,16 +50,40 @@
'l-3:e' => \[ qr/\Amalformed string length at 1/, 'list with negative-length string' ],
'i-03e' => \[ qr/\Amalformed integer data at 1/, 'negative integer with leading zero' ],
"2:\x0A\x0D" => "\x0A\x0D",
+ ['d1:a0:e', 0, 1] => { a => '' }, # Accept single dict when max_depth is 1
+ ['d1:a0:e', 0, 0] => \[ qr/\Anested too deep at 1/, 'single dict when max_depth is 0' ],
+ ['d1:ad1:a0:ee', 0, 2] => { a => { a => '' } }, # Accept a nested dict when max_depth is 2
+ ['d1:ad1:a0:ee', 0, 1] => \[ qr/\Anested too deep at 5/, 'nested dict when max_depth is 1' ],
+ ['l0:e', 0, 1] => [ '' ], # Accept single list when max_depth is 1
+ ['l0:e', 0, 0] => \[ qr/\Anested too deep at 1/, 'single list when max_depth is 0' ],
+ ['ll0:ee', 0, 2] => [ [ '' ] ], # Accept a nested list when max_depth is 2
+ ['ll0:ee', 0, 1] => \[ qr/\Anested too deep at 2/, 'nested list when max_depth is 1' ],
+ ['d1:al0:ee', 0, 2] => { a => [ '' ] }, # Accept dict containing list when max_depth is 2
+ ['d1:al0:ee', 0, 1] => \[ qr/\Anested too deep at 5/, 'list in dict when max_depth is 1' ],
+ ['ld1:a0:ee', 0, 2] => [ { 'a' => '' } ], # Accept list containing dict when max_depth is 2
+ ['ld1:a0:ee', 0, 1] => \[ qr/\Anested too deep at 2/, 'dict in list when max_depth is 1' ],
+ ['d1:a0:1:bl0:ee', 0, 2] => { a => '', b => [ '' ] }, # Accept dict containing list when max_depth is 2
+ ['d1:a0:1:bl0:ee', 0, 1] => \[ qr/\Anested too deep at 10/, 'list in dict when max_depth is 1' ],
);
plan tests => 1 + @test / 2;
while ( my ( $frozen, $thawed ) = splice @test, 0, 2 ) {
my $result;
- my $lived = eval { $result = bdecode( $frozen ); 1 };
+ my $testname;
+ my $lived = eval {
+ if ( ref $frozen eq 'ARRAY' ) {
+ $testname = "decode [".join(', ', @$frozen)."]";
+ $result = bdecode( @$frozen );
+ } else {
+ $testname = "decode '$frozen'";
+ $result = bdecode( $frozen );
+ }
+ 1
+ };
if( ref $thawed ne 'REF' ) {
- is_deeply( $result, $thawed, "decode '$frozen'" )
+ is_deeply( $result, $thawed, $testname );
}
else {
my ( $error_rx, $kind_of_brokenness ) = @$$thawed;