Subject: Issues around multipart boundaries.
Date: Fri, 24 Aug 2007 11:48:41 +0100
To: bug-libwww-perl [...] rt.cpan.org
From: Graeme Thompson <Graeme.Thompson [...] mobilecohesion.com>
Hi - I've run into a few problems in HTTP::Daemon::ClientConn::get_request
which may be candidates for bugs.
libwww-perl-5.808
perl, v5.8.7 built for i686-linux-thread-multi
Linux 2.6.9-34.EL #1 Fri Feb 24 16:44:51 EST 2006 i686 i686 i386 GNU/Linux
The patch below makes four changes:
1 - When extracting the boundary from the Content-Type header, preserve the
original case. It was previously being converted to lower case, which
may then not match the boundary lines in the body of the message.
2 - Allow for the boundary value in the Content-Type header to be optionally
surrounded by double quotes.
3 - Do not demand that the closing "--boundary--" marker be followed by a
CRLF sequence. RFC 2046 defines both that CRLF and the trailing epilogue
as optional.
4 - Prefer to read the message based on Content-Length, rather than looking
for the closing multipart boundary. The reasoning here is that there may
be an epilogue of any length after the closing boundary which might not
be read if we stop at the "--boundary--". This would presumably be a
problem if we try reading another request from the same connection. So
if we know the entire message length in advance, make use of it.
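
To illustrate the four points above outside of HTTP::Daemon, here is a small standalone sketch. It is not part of the patch: extract_boundary, the sample "AaB03x" boundary, the multipart/mixed content type and the sample buffers are all made up for the example. The regex is the one from the patch, so it shares its limits (it only accepts \w characters in the subtype and in the boundary).

```perl
use strict;
use warnings;

my $CRLF = "\015\012";

# Hypothetical helper, not an HTTP::Daemon function: returns the boundary
# from a Content-Type value with its original case, accepting optional
# double quotes around the value (changes 1 and 2).
sub extract_boundary {
    my ($ct) = @_;
    if (defined $ct
        && $ct =~ m{^multipart/\w+\s*;.*boundary\s*=\s*("?)(\w+)\1}i) {
        return $2;
    }
    return;
}

my $boundary = extract_boundary('multipart/mixed; boundary="AaB03x"');

# A body whose closing marker is NOT followed by CRLF (change 3).
my $body = join $CRLF,
    '--AaB03x',
    'Content-Type: text/plain',
    '',
    'hello',
    '--AaB03x--';

# Old behaviour: lower-cased boundary plus a mandatory trailing CRLF.
my $old_marker = "$CRLF--" . lc($boundary) . "--$CRLF";
# Patched behaviour: case preserved, no trailing CRLF required.
my $new_marker = "$CRLF--$boundary--";

my $old_index = index($body, $old_marker);   # marker never found
my $new_index = index($body, $new_marker);   # marker found

# Change 4: an epilogue follows the closing marker, then a second request
# arrives on the same connection.
my $message = $body . "${CRLF}ignored epilogue text";
my $buf     = $message . "GET /next HTTP/1.1$CRLF$CRLF";

# Stopping at the closing marker leaves the epilogue in the buffer.
my $end          = index($buf, $new_marker) + length($new_marker);
my $left_by_scan = substr($buf, $end);

# Consuming Content-Length bytes leaves only the next request.
my $content_length = length($message);
my $left_by_length = substr($buf, $content_length);

print "old marker index: $old_index\n";
print "new marker index: $new_index\n";
print "left after boundary scan starts with epilogue: ",
    ($left_by_scan =~ /^\Q$CRLF\Eignored/ ? "yes" : "no"), "\n";
print "left after Content-Length read starts with GET: ",
    ($left_by_length =~ /^GET / ? "yes" : "no"), "\n";
```

In the second half, whatever is left in the buffer is what the next get_request call would see, which is why leftover epilogue bytes would presumably be misread as the start of a new request.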
Patch is included below for consideration.
diff -ru libwww-perl-5.808-dist/lib/HTTP/Daemon.pm libwww-perl-5.808/lib/HTTP/Daemon.pm
--- libwww-perl-5.808-dist/lib/HTTP/Daemon.pm 2007-07-19 22:24:31.000000000 +0100
+++ libwww-perl-5.808/lib/HTTP/Daemon.pm 2007-08-24 09:17:44.000000000 +0100
@@ -279,21 +279,6 @@
return;
}
- elsif ($ct && lc($ct) =~ m/^multipart\/\w+\s*;.*boundary\s*=\s*(\w+)/) {
- # Handle multipart content type
- my $boundary = "$CRLF--$1--$CRLF";
- my $index;
- while (1) {
- $index = index($buf, $boundary);
- last if $index >= 0;
- # end marker not yet found
- return unless $self->_need_more($buf, $timeout, $fdset);
- }
- $index += length($boundary);
- $r->content(substr($buf, 0, $index));
- substr($buf, 0, $index) = '';
-
- }
elsif ($len) {
# Plain body specified by "Content-Length"
my $missing = $len - length($buf);
@@ -312,6 +297,21 @@
$buf='';
}
}
+ elsif ($ct && $ct =~ m/^multipart\/\w+\s*;.*boundary\s*=\s*("?)(\w+)\1/i) {
+ # Handle multipart content type
+ my $boundary = "$CRLF--$2--";
+ my $index;
+ while (1) {
+ $index = index($buf, $boundary);
+ last if $index >= 0;
+ # end marker not yet found
+ return unless $self->_need_more($buf, $timeout, $fdset);
+ }
+ $index += length($boundary);
+ $r->content(substr($buf, 0, $index));
+ substr($buf, 0, $index) = '';
+
+ }
${*$self}{'httpd_rbuf'} = $buf;
$r;