Skip Menu |

This queue is for tickets about the Mail-Box CPAN distribution.

Report information
The Basics
Id: 20656
Status: resolved
Priority: 0/
Queue: Mail-Box

People
Owner: MARKOV [...] cpan.org
Requestors: JGMYERS [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in:
  • 2.060
  • 2.065
Fixed in: (no value)



Subject: overflowing C stack parsing long mime parameter values
When parsing a message with an extremely long (10,000s of characters) multipart boundary value, perl overflows the C stack when evaluating the regex in Mail::Message::Field::attribute. The regex causes perl to recurse for each character in the value. Attached is a test program which demonstrates the problem. It also includes my proposed fixed in a commented out section. My proposed fix also fixes a bug in the original parser. If the value had a quoted string which ends with a backslash-quoted backslash, the original code could parse past the final quote. For example: foo="val1\\"; bar="quux" Some other aspects of the parsing are nonconforming to RFC 2045, but appear as if they are deliberate: * The handling of single-quote quoted values is contrary to RFC 2045. Single quotes (') are not tspecials, so there is no basis for removing them from the parsed value. The entire part of the regex for handling single-quote-quoted values should be removed. * The function does not remove the backslash quotes from the parsed value. Instead of returning $+, the function should run s/(\\.)/$1/g on the parsed-out value before returning.
Subject: testdecode.pl
use threads; my $str = ' multipart/alternative; boundary="' . ('-' x 943) . ( ( " " . ('-' x 989)) x 139) . ( ( " " . ('\"' x 389)) x 139) . ( ( " " . ('\\' x 389)) x 139) . ( ( " " . ('-\"' x 389)) x 139) . ( ( " " . ('-\\' x 389)) x 139) . " " . ('-' x 302) . '====1153339803===="'; sub func { my $attr = 'boundary'; my $rv = $str =~ m/\b$attr\s*=\s* ( "( (?: [^"]|\\" )* )" | '( (?: [^']|\\' )* )' | ([^;\s]*) ) /xi ? $+ : undef; # my $rv = $str =~ m/\b$attr\s*=\s* # ( "( (?> [^\\"]*|\\. )* )" # | '( (?> [^\\']*|\\. )* )' # | ([^";\s]*) # ) # /xi ? $+ : undef; } my $thread = threads->create('func'); $thread->join(); warn "done";
On Mon Jul 24 16:51:58 2006, JGMYERS wrote: Show quoted text
> Attached is a test program which demonstrates the problem. It also > includes my proposed fixed in a commented out section.
Show quoted text
> * The handling of single-quote quoted values is contrary to RFC 2045. > Single quotes (') are not tspecials, so there is no basis for removing > them from the parsed value.
Show quoted text
> * The function does not remove the backslash quotes from the parsed > value. Instead of returning $+, the function should run s/(\\.)/$1/g on > the parsed-out value before returning.
So... if both are true, the correct parsing would need a more serious rewrite to the whole method. Can you confirm that this is correct: sub attribute($;$) { my ($self, $attr) = (shift, shift); my $body = $self->unfoldedBody; unless(@_) { if($body =~ m/\b$attr\s*\=\s* ( "( (?> [^\\"]*|\\. )* )" | ([^";\s]*) )/xi) { (my $val = $+) =~ s/\\"/"/g; return $val; } return undef; } my $value = shift; unless(defined $value) # remove attribute { for($body) { s/\b$attr\s*=\s*"(?:[^"]|\\")*"//i or s/\b$attr\s*=\s*[;\s]*//i; } $self->unfoldedBody($body); return undef; } (my $quoted = $value) =~ s/"/\\"/g; for($body) { s/\b$attr\s*=\s*"([^"]|\\")*"/$attr="$quoted"/i or s/\b$attr\s*=\s*[^;\s]*/$attr="$quoted"/i or do { $_ .= qq(; $attr="$quoted") } } $self->unfoldedBody($body); $value; }
From: JGMYERS [...] cpan.org
The "remove attribute" and "replace attribute" regexes need to be corrected like the "retrieve attribute" regex have been. $quoted needs to instead have a s/(["\\])/\\$1/g in order to also quote backslashes. The code will misparse boundary in: foo=" boundary=bad; "; boundary="good" but I don't know if that is worth fixing.
From: MARKOV [...] cpan.org
On Fri Jul 28 20:22:04 2006, JGMYERS wrote: Show quoted text
> The "remove attribute" and "replace attribute" regexes need to be > corrected like the "retrieve attribute" regex have been. $quoted needs > to instead have a s/(["\\])/\\$1/g in order to also quote backslashes.
ok, improved that... Show quoted text
> The code will misparse boundary in: > > foo=" boundary=bad; "; boundary="good" > > but I don't know if that is worth fixing.
If you use 'study' on fields, you will get the correct behavior. However, that has major performance implications...