Subject: | overflowing C stack parsing long mime parameter values |
When parsing a message with an extremely long (tens of thousands of characters)
multipart boundary value, perl overflows the C stack when evaluating the
regex in Mail::Message::Field::attribute. The regex causes perl to
recurse for each character in the value.
Attached is a test program which demonstrates the problem. It also
includes my proposed fix in a commented-out section.
My proposed fix also fixes a bug in the original parser. If the value
is a quoted string that ends with a backslash-quoted backslash, the
original code can parse past the closing quote. For example:
foo="val1\\"; bar="quux"
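As a minimal sketch of how that example should parse, the snippet below extracts a parameter from a header string, treating each backslash pair as a unit. The helper name get_param is hypothetical and is not part of Mail::Message::Field. The pattern uses `[^\\"]+` inside the atomic group so that the group cannot match the empty string, and it omits the single-quote branch.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration; not the module's actual API.
sub get_param {
    my ($header, $attr) = @_;
    return $header =~ m/\b\Q$attr\E\s*=\s*
        (?: "( (?> [^\\"]+ | \\. )* )"   # quoted string: plain runs or backslash pairs
          | ( [^";\s]* )                 # bare token
        )
    /xi ? $+ : undef;
}

# Single-quoted Perl string holding: foo="val1\\"; bar="quux"
my $header = 'foo="val1\\\\"; bar="quux"';

print get_param($header, 'foo'), "\n";   # prints val1\\
print get_param($header, 'bar'), "\n";   # prints quux
```

Because the two backslashes are consumed together as one pair, the quote that follows them is recognized as the closing quote, and the foo value does not run on into the bar parameter.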
Some other aspects of the parsing do not conform to RFC 2045, but
appear to be deliberate:
* The handling of single-quote quoted values is contrary to RFC 2045.
Single quotes (') are not tspecials, so there is no basis for removing
them from the parsed value. The entire part of the regex for handling
single-quote-quoted values should be removed.
* The function does not remove the backslash quotes from the parsed
value. Instead of returning $+, the function should run s/\\(.)/$1/g on
the parsed-out value before returning.
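The unquoting step from the last point could look roughly like this. The helper name unquote_pairs is hypothetical, and the /s modifier is an assumption added so that a backslash followed by a newline is also handled.

```perl
use strict;
use warnings;

# Hypothetical helper: drop the backslash from every backslash-quoted
# character in an already extracted parameter value.
sub unquote_pairs {
    my ($value) = @_;
    $value =~ s/\\(.)/$1/gs;   # keep the escaped character, drop the backslash
    return $value;
}

# Single-quoted Perl string holding: a\"b\\c
print unquote_pairs('a\\"b\\\\c'), "\n";   # prints a"b\c
```

The capture group wraps only the character after the backslash. A pattern that captures the backslash as well would substitute each pair with itself and leave the value unchanged.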
Subject: | testdecode.pl |
use threads;
my $str = ' multipart/alternative; boundary="' .
('-' x 943) .
( ( " " . ('-' x 989)) x 139) .
( ( " " . ('\"' x 389)) x 139) .
( ( " " . ('\\' x 389)) x 139) .
( ( " " . ('-\"' x 389)) x 139) .
( ( " " . ('-\\' x 389)) x 139) .
" " . ('-' x 302) . '====1153339803===="';
sub func {
my $attr = 'boundary';
my $rv = $str =~ m/\b$attr\s*=\s*
( "( (?: [^"]|\\" )* )"
| '( (?: [^']|\\' )* )'
| ([^;\s]*)
)
/xi ? $+ : undef;
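# Proposed fix. The character classes use + rather than *: inside an
# atomic group, an alternative that can match the empty string would
# win at every backslash, so the \\. branch would never be tried.
# my $rv = $str =~ m/\b$attr\s*=\s*
# ( "( (?> [^\\"]+|\\. )* )"
# | '( (?> [^\\']+|\\. )* )'
# | ([^";\s]*)
# )
# /xi ? $+ : undef;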
}
my $thread = threads->create('func');
$thread->join();
warn "done";