Subject: | HTTP::Message is not handling malformed HTTP headers |
in libwww-perl-5.810
http://www.lochsidelodge.com/
http://www.meubelmakerijhetganker.nl/
Both of these sites return headers that do not conform to the standard
(RFC 2616) and break the parsing in HTTP::Message->parse(). When the
pages are fetched with LWP, the headers are parsed by
Net::HTTP::read_response_headers instead, which is more forgiving.
I'm attaching the header data for http://www.lochsidelodge.com/ and a
simple test script to trigger the bug.
Here is a suggested fix for HTTP::Message->parse():
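sub parse
{
    my($class, $str) = @_;
    my $valid_prev_key = 0;   # true while the previous line was a valid header
    my @hdr;
    while (1) {
        # Well-formed "Key: value" line.
        if ($str =~ s/^([^\x00-\x20\x7f()<>@,;:\\\"\/\[\]?={}]+)\s*:\s+(.*?)\n//) {
            push(@hdr, $1, $2);
            $hdr[-1] =~ s/\r\z//;
            $valid_prev_key = 1;
        }
        # Continuation line; only accepted directly after a valid header.
        elsif ($valid_prev_key && $str =~ s/^([ \t].*?)\n//) {
            $hdr[-1] .= "\n$1";
            $hdr[-1] =~ s/\r\z//;
        }
        # Malformed line: skip it. The substitution is part of the condition
        # so that input without a trailing newline cannot loop forever.
        elsif ($str !~ /^\r?\n/ && $str =~ s/^.+?\n//) {
            # warn("malformed http header line, skipping.");
            $valid_prev_key = 0;
        }
        # Blank line (or nothing left to consume): end of headers.
        else {
            $str =~ s/^\r?\n//;
            last;
        }
    }
    new($class, \@hdr, $str);
}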
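To make the effect of the loop easier to see without installing a patched HTTP::Message, here is a minimal standalone sketch of the same logic. The function name parse_headers_tolerant and the sample input are hypothetical (the sample is not the real lochsidelodge.com header data), and the function returns the header list and the remaining body instead of building an object. The skip branch only fires when a line can actually be removed, so input without a final newline still terminates.

```perl
use strict;
use warnings;

# Hypothetical standalone variant of the tolerant loop: returns
# (\@headers, $body) rather than constructing an HTTP::Message object.
sub parse_headers_tolerant {
    my ($str) = @_;
    my $valid_prev_key = 0;
    my @hdr;
    while (1) {
        # Well-formed "Key: value" line.
        if ($str =~ s/^([^\x00-\x20\x7f()<>@,;:\\\"\/\[\]?={}]+)\s*:\s+(.*?)\n//) {
            push(@hdr, $1, $2);
            $hdr[-1] =~ s/\r\z//;
            $valid_prev_key = 1;
        }
        # Continuation line, accepted only right after a valid header.
        elsif ($valid_prev_key && $str =~ s/^([ \t].*?)\n//) {
            $hdr[-1] .= "\n$1";
            $hdr[-1] =~ s/\r\z//;
        }
        # Malformed line: drop it and ignore any continuation that follows.
        elsif ($str !~ /^\r?\n/ && $str =~ s/^.+?\n//) {
            $valid_prev_key = 0;
        }
        # Blank line or nothing left to consume: headers are done.
        else {
            $str =~ s/^\r?\n//;
            last;
        }
    }
    return (\@hdr, $str);
}

# Made-up sample: one line without a colon, followed by a stray continuation.
my $raw = "Server: Apache\r\n"
        . "this line has no colon\r\n"
        . " stray continuation\r\n"
        . "Content-Type: text/html\r\n"
        . "\tcharset=utf-8\r\n"
        . "\r\n"
        . "<html></html>\n";

my ($hdr, $body) = parse_headers_tolerant($raw);
for (my $i = 0; $i < @$hdr; $i += 2) {
    print "$hdr->[$i] => $hdr->[$i + 1]\n";
}
```

In this sketch the colon-less line and the continuation after it are both dropped, while the continuation after Content-Type is folded into that header's value.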
Subject: | lochsidelodge.com.headers |
Message body not shown because it is not plain text.
Subject: | http_parse_test.pl |
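#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Message;
use Data::Dumper qw(Dumper);

# Read the raw response (status line plus headers) from STDIN or named files.
my $h = '';
while (<>) { $h .= $_; }
# Drop the status line; parse() expects the header lines only.
$h =~ s/^HTTP.+?\n//;
my $header = HTTP::Message->parse($h);
print Dumper($header);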