Bug #30466 for Bencode: bug in Bencode.pm

Sun Nov 04 10:18:53 2007 DJosifovich [...] gmail.com - Ticket created

CC:	DJosifovich [...] gmail.com
Subject:	bug in Bencode.pm
Date:	Sun, 04 Nov 2007 08:18:30 -0700
To:	bug-bencode [...] rt.cpan.org
From:	DJosifovich [...] gmail.com

Hello;

I downloaded your Bencode perl module this morning.

    package Bencode;
    =head1 NAME
    Bencode - BitTorrent serialisation format
    =head1 VERSION
    This document describes Bencode version 1.0

I found two bugs.

First. If a string has a newline (aka 0x0a aka ^J aka nl aka \n), then $str_rx does not match, resulting in "garbage at message". I'm finding newlines in the peers values in the response from the tracker. They are part of an IP address and/or port.

Second. If the string length is overly large, then the "garbage at message" is also seen. Example: "d4:test20:fooe". I would have expected a different or better error message. I found this bug while experimenting to find out what the problem was in the first bug.

Is there an updated version? 1.0 seems unlikely to be the latest, but it was what I found on some website. I'll go look around more.

Best Regards,
Dennis

Sun Nov 04 11:17:01 2007 ARISTOTLE [...] cpan.org - Correspondence added

From:	ARISTOTLE [...] cpan.org

Hi Dennis,

first of all, thanks for your report.

* <DJosifovich@gmail.com> [2007-11-04 10:18]:

> If a string has a newline (aka 0x0a aka ^J aka nl aka \n),
> then $str_rx does not match resulting in "garbage at message".
> I'm finding newlines in the peers values in the response from
> the tracker. They are part of an ipaddress and-or port.

Yes, I forgot to add an /s modifier to the $str_rx definition.

For that matter, string-decoding should work differently. Now that I look at the code again after a long time away from it, it’s pretty clear to me that only the string-length match should be done using a regex; collecting the string and advancing the pointer should be done with substr() and pos().

I’ll implement this and push a new version to the CPAN.
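To illustrate the approach described above, here is a minimal sketch in Python rather than in the module's Perl. It is not the code from Bencode.pm: the names `LENGTH_RX` and `read_string` are hypothetical, and the six-byte payload merely stands in for a compact peer entry that happens to contain 0x0a. A regex matches only the length prefix; the payload is taken by slicing, which plays the role of substr() and pos(), so newlines in the payload never pass through the pattern.

```python
import re

# Hypothetical name: matches only the "<length>:" prefix of a bencoded string.
LENGTH_RX = re.compile(rb"(0|[1-9][0-9]*):")


def read_string(data: bytes, pos: int = 0):
    """Return (payload, new_pos) for the bencoded string starting at pos.

    Only the length prefix goes through the regex. The payload is sliced
    out by offset, so its bytes (newlines included) are never matched
    against a pattern. No bounds check is done in this sketch.
    """
    m = LENGTH_RX.match(data, pos)
    if m is None:
        raise ValueError(f"expected a string length at offset {pos}")
    start = m.end()
    end = start + int(m.group(1))
    return data[start:end], end


# Assumed sample: six bytes where the first byte is 0x0a (newline).
payload = b"\x0a\x00\x00\x01\x1a\xe1"
encoded = b"6:" + payload

# Without a dot-matches-newline flag, "." refuses the 0x0a byte.
dot_match = re.match(rb"6:(.{6})", encoded)

# The slice-based reader is indifferent to the payload's content.
decoded, next_pos = read_string(encoded)
```

Here `dot_match` is `None`, while `read_string` returns the full six-byte payload and the offset just past it.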

> If the string length is overly large, then the "garbage at
> message" is also seen. Example, "d4:test20:fooe". I would have
> expected a different or better error message.

It cannot be a whole lot better in this case, because I cannot think of any way in which the decoder could surmise that the `e` at the end of your example is meant to terminate the string. In this particular case, I can have a check to see if the remainder of the serialised chunk is shorter than the string.

But if there was more bencode data following a string with a bad length, there is nothing better that the code can tell you than “trailing garbage.” In fact, if the string length is just right, you might get corrupt decoded data and no error message at all. There is nothing I can do about that.
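The three outcomes described above can be reproduced with a small decoder. This is a sketch in Python under stated assumptions, not the module's implementation: `decode`, `_decode`, and the error texts are hypothetical, and dictionary keys are not validated.

```python
import re

# Hypothetical names throughout; this is not Bencode.pm's code.
LENGTH_RX = re.compile(rb"(0|[1-9][0-9]*):")
INT_RX = re.compile(rb"i(0|-?[1-9][0-9]*)e")


def _decode(data: bytes, pos: int):
    """Decode one value starting at pos; return (value, new_pos)."""
    if pos >= len(data):
        raise ValueError("unexpected end of data")
    lead = data[pos:pos + 1]
    if lead == b"i":
        m = INT_RX.match(data, pos)
        if m is None:
            raise ValueError(f"malformed integer at offset {pos}")
        return int(m.group(1)), m.end()
    if lead in (b"l", b"d"):
        items = []
        pos += 1
        while data[pos:pos + 1] != b"e":
            item, pos = _decode(data, pos)
            items.append(item)
        pos += 1  # skip the closing "e"
        if lead == b"l":
            return items, pos
        if len(items) % 2:
            raise ValueError("dictionary has a key without a value")
        return dict(zip(items[0::2], items[1::2])), pos
    m = LENGTH_RX.match(data, pos)
    if m is None:
        raise ValueError(f"garbage at offset {pos}")
    length, start = int(m.group(1)), m.end()
    # The one check that is possible: the declared length exceeds what is left.
    if length > len(data) - start:
        raise ValueError(f"insufficient data for string of length {length}")
    return data[start:start + length], start + length


def decode(data: bytes):
    value, pos = _decode(data, 0)
    if pos != len(data):
        raise ValueError(f"trailing garbage at offset {pos}")
    return value


def outcome(data: bytes):
    """Return the decoded value, or the error text if decoding fails."""
    try:
        return decode(data)
    except ValueError as exc:
        return str(exc)


too_long = outcome(b"d4:test20:fooe")   # declared length exceeds the remainder
mid_stream = outcome(b"l4:ab1:ce")      # bad length, more data follows
silent = outcome(b"l5:ab1:ce")          # bad length that happens to fit
intended = outcome(b"l2:ab1:ce")        # what the sender presumably meant
```

With these inputs, `too_long` reports the specific length problem, `mid_stream` can only report generic garbage at the point where parsing derails, and `silent` decodes without any error to `[b"ab1:c"]` instead of the intended `[b"ab", b"c"]`.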

> Is there an updated version? 1.0 seems unlikely to be the
> latest

Why? 1.0 *is* the latest version. I don’t expect this module to ever have a lot more releases either.

Sun Nov 04 11:17:02 2007 The RT System itself - Status changed from 'new' to 'open'

Sun Nov 04 11:17:09 2007 ARISTOTLE [...] cpan.org - Taken

Sun Nov 04 11:32:25 2007 DJosifovich [...] gmail.com - Correspondence added

Subject:	Re: [rt.cpan.org #30466] bug in Bencode.pm
Date:	Sun, 04 Nov 2007 09:32:04 -0700
To:	bug-Bencode [...] rt.cpan.org
From:	djosifovich [...] gmail.com

Hello;

[] first of all, thanks for your report.

No problem.

[] * <DJosifovich@gmail.com> [2007-11-04 10:18]:
[] > If a string has a newline (aka 0x0a aka ^J aka nl aka \n),
[] > then $str_rx does not match resulting in "garbage at message".
[] > I'm finding newlines in the peers values in the response from
[] > the tracker. They are part of an ipaddress and-or port.
[]
[] Yes, I forgot to add an /s modifier to the $str_rx definition.

I'll give that a try to keep me going.

[] For that matter, string-decoding should work differently. Now
[] that I look at the code again after a long time away from it,
[] it’s pretty clear to me that only the string-length match should
[] be done using a regex; collecting the string and advancing the
[] pointer should be done with substr() and pos().
[]
[] I’ll implement this and push a new version to the CPAN.

That sounds good.

[] > If the string length is overly large, then the "garbage at
[] > message" is also seen. Example, "d4:test20:fooe". I would have
[] > expected a different or better error message.
[]
[] It cannot be a whole lot better in this case, because I cannot
[] think of any way in which the decoder could surmise that the `e`
[] at the end of your example is meant to terminate the string. In
[] this particular case, I can have a check to see if the remainder
[] of the serialised chunk is shorter than the string.

No, I didn't mean to imply you should try to use the 'e' to infer anything. I just used an example that I had tried and had a problem with. I should have shown you one without the 'e'. If you are using substr() and pos(), then you might be able to say something like "insufficient data for string of length %d".

[] But if there was more bencode data following a string with a bad
[] length, there is nothing better that the code can tell you than
[] “trailing garbage.” In fact, if the string length is just right,
[] you might get corrupt decoded data and no error message at all.
[] There is nothing I can do about that.

Yes, I can see how a cleverly crafted bencoded string could cause issues. I was more focused on my issue with the numeric length value being larger than the "data available" (in my case before the \n).

Thanks,
Dennis

Sun Nov 04 11:59:44 2007 DJosifovich [...] gmail.com - Correspondence added

Subject:	Re: [rt.cpan.org #30466] bug in Bencode.pm
Date:	Sun, 04 Nov 2007 09:59:19 -0700
To:	bug-Bencode [...] rt.cpan.org
From:	djosifovich [...] gmail.com

Hello Again;

[] * <DJosifovich@gmail.com> [2007-11-04 10:18]:
[] > If a string has a newline (aka 0x0a aka ^J aka nl aka \n),
[] > then $str_rx does not match resulting in "garbage at message".
[] > I'm finding newlines in the peers values in the response from
[] > the tracker. They are part of an ipaddress and-or port.
[]
[] Yes, I forgot to add an /s modifier to the $str_rx definition.

I just have to ask. Doesn't the $str_rx contain a substring like (?s).{300} if the bencoded string length is expected to be 300 octets? And the (?s) portion should enable multiline matching, right? So now I'm confused as to why it doesn't match newlines.

Best Regards,
Dennis
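As a side note on the flags in question: in Perl, as in Python, the `s` flag is "single-line" (dot matches newline) mode, whereas the `m` "multi-line" flag only changes where `^` and `$` anchor. The sketch below uses Python's `re` module as a stand-in for Perl's regex engine; the sample data is an assumption. Whether the 1.0 pattern actually carried an inline `(?s)` is not shown in this ticket, so this only demonstrates what each flag does to `.`.

```python
import re

# Assumed sample: three bytes with a newline in the middle.
data = b"a\nb"

# No flags: "." stops at the newline, so three "any" bytes cannot match.
plain = re.match(rb".{3}", data)

# The multi-line flag affects ^ and $ only; "." still rejects the newline.
multiline = re.match(rb".{3}", data, re.MULTILINE)

# Inline (?s) at the start of the pattern turns on dot-matches-newline.
inline_s = re.match(rb"(?s).{3}", data)

# The scoped form limits the flag to the enclosed group.
scoped_s = re.match(rb"(?s:.{3})", data)
```

Only the last two patterns match all three bytes; the first two return `None`.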

Sun Nov 04 13:43:28 2007 ARISTOTLE [...] cpan.org - Correspondence added

1.2 is on its way to the CPAN. I’ve made all the changes mentioned, I fixed that bug, and also improved the error reporting in a variety of other circumstances. Please test it and see if it helps. If not, please reply; until further notice, I consider this ticket resolved.

Sun Nov 04 13:43:30 2007 ARISTOTLE [...] cpan.org - Status changed from 'open' to 'resolved'
