This queue is for tickets about the Net-Async-HTTP CPAN distribution.

Report information
The Basics
Id: 72843
Status: resolved
Priority: 0/
Queue: Net-Async-HTTP

People
Owner: Nobody in particular
Requestors: TEAM [...] cpan.org
Cc: kiyoshi.aman [...] gmail.com
AdminCc:

Bug Information
Severity: Normal
Broken in: 0.13
Fixed in: 0.16



Subject: GET.pl example fails for reddit.com
Seems that the GET.pl example is unsuccessful with some URLs, notably www.reddit.com. Suspect it may be the chunked transfer-encoding:

GET http://www.reddit.com
User-Agent: lwp-request/6.00 libwww-perl/6.02

200 OK
Connection: close
Connection: Transfer-Encoding
Date: Wed, 30 Nov 2011 13:04:33 GMT
Server: '; DROP TABLE servertypes; --
Content-Type: text/html; charset=UTF-8
Client-Date: Wed, 30 Nov 2011 13:04:36 GMT
Client-Peer: 84.53.132.34:80
Client-Response-Num: 1
Client-Transfer-Encoding: chunked
Link: <http://www.redditstatic.com/reddit.2ayUrwceAhU.css>; rel="stylesheet"; type="text/css"
Link: <http://www.redditstatic.com/favicon.ico>; rel="shortcut icon"; type="image/x-icon"
Link: <http://www.reddit.com/.rss>; rel="alternate"; title="RSS"; type="application/rss+xml"
Set-Cookie: reddit_first=%7B%22organic_pos%22%3A%201%2C%20%22firsttime%22%3A%20%22first%22%7D; Domain=reddit.com; expires=Thu, 31 Dec 2037 23:59:59 GMT; Path=/
Title: reddit: the front page of the internet
X-Meta-Description: reddit: the front page of the internet
X-Meta-Keywords: reddit, reddit.com, vote, comment, submit
X-Meta-Title: reddit: the front page of the internet
X-Meta-Viewport: width=800, initial-scale=1

Doesn't call on_header or on_response, just sits there. Identical HTML content served from an apache instance works fine - if I get a chance later I'll update the ticket with more details on where it gets to.

cheers,

Tom

What that LWP output *doesn't* show is the original Transfer-Encoding line:

Transfer-Encoding: chunked

and that leading whitespace before 'chunked' is throwing off the eq check. GET.pl now works with the attached patch (against current bzr head), but I guess whitespace normalisation is probably needed for most if not all headers?

cheers,

Tom
Subject: 2011-11-30-chunked-encoding-strip-whitespace.patch
=== modified file 'lib/Net/Async/HTTP/Protocol.pm'
--- lib/Net/Async/HTTP/Protocol.pm 2011-10-19 21:20:35 +0000
+++ lib/Net/Async/HTTP/Protocol.pm 2011-11-30 14:06:23 +0000
@@ -144,7 +144,7 @@
          return undef; # Finished
       }
 
-      my $transfer_encoding = $header->header( "Transfer-Encoding" );
+      (my $transfer_encoding = $header->header( "Transfer-Encoding" )) =~ s/\s*//g;
       my $content_length = $header->content_length;
 
       if( defined $transfer_encoding and $transfer_encoding eq "chunked" ) {
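For readers comparing approaches, here is a minimal sketch (not part of the ticket) of how the substitution in the patch above differs from trimming only the ends of a value. The sample value and the helper name trim_ends are hypothetical and chosen purely for illustration. With s/\s*//g every whitespace character is deleted, interior ones included, whereas an anchored trim removes only the leading and trailing runs.

```perl
use strict;
use warnings;

# Hypothetical header value with leading, trailing and interior whitespace.
my $raw = "  gzip, chunked ";

# Substitution as used in the patch: deletes whitespace wherever it occurs.
( my $stripped = $raw ) =~ s/\s*//g;

# Alternative (hypothetical helper): trim only the two ends of the value.
sub trim_ends {
    my ($value) = @_;
    return $value unless defined $value;    # a missing header stays undef
    $value =~ s/^\s+//;
    $value =~ s/\s+$//;
    return $value;
}

my $trimmed = trim_ends($raw);

print "stripped: [$stripped]\n";
print "trimmed:  [$trimmed]\n";
```

For a single-token value such as "chunked" the two approaches give the same result; they only diverge when the value legitimately contains interior spaces.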
On Wed Nov 30 09:09:24 2011, TEAM wrote:
> What that LWP output *doesn't* show is the original Transfer-Encoding line:
>
> Transfer-Encoding: chunked
>
> and that leading whitespace before 'chunked' is throwing off the eq
> check. GET.pl now works with the attached patch (against current bzr
> head), but I guess whitespace normalisation is probably needed for most
> if not all headers?
This change looks unsatisfactory. RFC 2616 has this to say:

   The field-content does not include any leading or trailing LWS:
   linear white space occurring before the first non-whitespace
   character of the field-value or after the last non-whitespace
   character of the field-value. Such leading or trailing LWS MAY be
   removed without changing the semantics of the field value.

From the wording of that, I'd expect that HTTP::Message would handle the whitespace trimming for us. As a temporary workaround we probably ought to trim space around all the headers read from the request/response, but ideally we should also report this upstream to HTTP so that they can fix it there.

--
Paul Evans
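As an illustration of the "trim space around all the headers" idea, the following sketch applies leading/trailing trimming to every value in a flat list of name/value pairs. It uses only core Perl; the helper name trim_header_values and the sample data are assumptions made for this example and are not taken from Net::Async::HTTP or HTTP::Message.

```perl
use strict;
use warnings;

# Hypothetical helper: trim leading and trailing whitespace from every value
# in a flat ( name => value, ... ) list. A list is used rather than a hash so
# that repeated fields keep their order and are not merged.
sub trim_header_values {
    my @pairs = @_;
    my @clean;
    while ( my ( $name, $value ) = splice @pairs, 0, 2 ) {
        s/^\s+//, s/\s+$// for $value;    # trim both ends, keep the interior
        push @clean, $name => $value;
    }
    return @clean;
}

# Made-up sample data with stray whitespace around some values.
my @headers = trim_header_values(
    "Transfer-Encoding" => "  chunked",
    "Content-Type"      => "text/html; charset=UTF-8  ",
    "Connection"        => "close",
);

# After trimming, a plain string comparison behaves as expected.
print "chunked\n" if $headers[1] eq "chunked";
```

Interior whitespace, such as the space after the semicolon in the Content-Type value, is preserved; only the ends are touched, which matches the RFC wording quoted above.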
> From the wording of that, I'd expect that HTTP::Message would handle the
> whitespace trimming for us. As a temporary workaround we probably ought
> to trim space around all the headers read from the request/response, but
> ideally we should also report this upstream to HTTP so that they can fix
> it there.

Now reported: https://rt.cpan.org/Ticket/Display.html?id=75224

--
Paul Evans
On Tue Jan 17 11:50:48 2012, PEVANS wrote:
> From the wording of that, I'd expect that HTTP::Message would handle the
> whitespace trimming for us. As a temporary workaround we probably ought
> to trim space around all the headers read from the request/response

Now worked around; see attached patch. Will be in next version (0.16).

--
Paul Evans
Subject: rt72843.patch
=== modified file 'lib/Net/Async/HTTP/Protocol.pm'
--- lib/Net/Async/HTTP/Protocol.pm 2012-02-06 21:58:23 +0000
+++ lib/Net/Async/HTTP/Protocol.pm 2012-02-23 13:01:24 +0000
@@ -1,7 +1,7 @@
 # You may distribute under the terms of either the GNU General Public License
 # or the Artistic License (the same terms as Perl itself)
 #
-# (C) Paul Evans, 2008-2011 -- leonerd@leonerd.org.uk
+# (C) Paul Evans, 2008-2012 -- leonerd@leonerd.org.uk
 
 package Net::Async::HTTP::Protocol;
 
@@ -22,6 +22,11 @@
 use constant ON_READ => 0;
 use constant ON_ERROR => 1;
 
+# Detect whether HTTP::Message properly trims whitespace in header values. If
+# it doesn't, we have to deploy a workaround to fix them up.
+# https://rt.cpan.org/Ticket/Display.html?id=75224
+use constant HTTP_MESSAGE_TRIMS_LWS => HTTP::Message->parse( "Name: value " )->header("Name") eq "value";
+
 =head1 NAME
 
 C<Net::Async::HTTP::Protocol> - HTTP client protocol handler
@@ -128,6 +133,17 @@
       }
 
       my $header = HTTP::Response->parse( $1 );
+
+      unless( HTTP_MESSAGE_TRIMS_LWS ) {
+         my @headers;
+         $header->scan( sub {
+            my ( $name, $value ) = @_;
+            s/^\s+//, s/\s+$// for $value;
+            push @headers, $name => $value;
+         } );
+         $header->header( @headers );
+      }
+
       $header->request( $req );
       $header->previous( $args{previous_response} ) if $args{previous_response};
 

Now released as 0.16.

--
Paul Evans