Subject: | HTTP::Response->parse is too keen to preserve whitespace |
RFC 2616 says that leading/trailing whitespace in header values ought to be ignored. I notice that currently it isn't:
$ perl -Mblib -MHTTP::Response -E 'printf ">%s<\n", HTTP::Response->parse("200 HTTP/1.1 OK\r\nName: value \r\n\r\n")->header("Name")'
Show quoted text
> value <
If I apply this patch (also attached)
--- lib/HTTP/Message.pm 2012-02-22 19:10:02 +0000
+++ lib/HTTP/Message.pm 2012-02-22 20:02:37 +0000
@@ -55,13 +55,13 @@
my @hdr;
while (1) {
- if ($str =~ s/^([^\s:]+)[ \t]*: ?(.*)\n?//) {
+ if ($str =~ s/^([^\s:]+)[ \t]*:[ \t]*(.*)\n?//) {
push(@hdr, $1, $2);
- $hdr[-1] =~ s/\r\z//;
+ $hdr[-1] =~ s/[ \t]*\r\z//;
}
elsif (@hdr && $str =~ s/^([ \t].*)\n?//) {
$hdr[-1] .= "\n$1";
- $hdr[-1] =~ s/\r\z//;
+ $hdr[-1] =~ s/[ \t]*\r\z//;
}
else {
$str =~ s/^\r?\n//;
Then it gives the output as I expected:
$ perl -Mblib -MHTTP::Response -E 'printf ">%s<\n", HTTP::Response->parse("200 HTTP/1.1 OK\r\nName: value \r\n\r\n")->header("Name")'
Show quoted text>value<
However, this breaks a few unit tests that appear to be testing precisely that this whitespace is preserved, not trimmed. I'm not quite sure
what the intended use or behaviour of ->parse is, but it appears to be inconsistent with RFC 2616.
This is currently causing a few failures of Net::Async::HTTP, which uses HTTP::Response->parse to parse the HTTP response header from the
server. Some less-well-behaved servers are observed to include extra whitespace, thus breaking such logic as:
if( $response->header("Connection") eq "close" ) { ... }
Is HTTP::Response->parse the correct method to use in this case? If so I think it ought to trim whitespace as 2616 suggests. If not, can you
provide a new method that does, or some way to handle HTTP response parsing?
Thanks
--
Paul Evans