Skip Menu |

This queue is for tickets about the libwww-perl CPAN distribution.

Report information
The Basics
Id: 13279
Status: resolved
Priority: 0/
Queue: libwww-perl

People
Owner: Nobody in particular
Requestors: talby [...] trap.mtview.ca.us
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: 5.803
Fixed in: (no value)



Subject: HTTP Response parse bug in LWP::Protocol
This is an odd case, but I think the fix is pretty trivial. Here's what I'm hoping for: On line 110 of LWP::Protocol, in collect(), the line: if (!defined($arg) || (!$response->is_success) { might be better as: if (!defined($arg) || (!$response->is_success and !$response->is_redirect)) { Now, here's the long story of why: I recently ran across a malformed response from a public HTTP server. When I send: GET /talbycodes?on_url=http://online.com/&off_url=http://offline.com/ HTTP/1.1 Host: big.oscar.aol.com User-Agent: unimportant Keep-Alive: 300 to port 80 on big.oscar.aol.com It responds with: HTTP/1.1 302 Redirection Location: http://online.com/ IMG SRC=http://online.com/ but because I requested Keep-Alive it keeps the connection open waiting for another request. The stupid thing doesn't give me a content length, or chunk it's content, so I have no idea this request is finished until I wait for the socket to close. When this request is placed through LWP::UserAgent with say: my $ua = LWP::UserAgent->new( keep_alive => 5, timeout => 10, agent => 'unimportant', ); my($user, $on, $off) = qw(talbycodes http://site.com/on http://site.com/off); my $q = HTTP::Request->new( GET => "http://big.oscar.aol.com/$user?on_url=$on&off_url=$off", ); print $ua->send_request($q)->as_string; I get this response from the script: 500 (Internal Server Error) read timeout Content-Type: text/plain Client-Date: Thu, 16 Jun 2005 18:50:50 GMT Client-Warning: Internal response Which makes sense, but if I instead use: my $i = 0; $ua->send_request($q, sub { my($chunk, $res, $proto) = @_; print $res->as_string unless $i++; print "$chunk\n" }); It would be nice to be able to see the content-so-far because for a 302, it doesn't matter what the content is and upon seeing the headers I could have closed the connection and moved on (as browsers appear to do). My problem is that the content_cb doesn't get called in this case because $response->is_success is not true for this response, so maybe $response->is_redirect should also count for content streaming in LWP::Protocol::collect(). Alternatively, what I'd really like is a callback soon as the headers have been read. Most of the time I add a content_cb it's to collect the headers early.
On Thu Jun 16 15:19:15 2005, guest wrote: Show quoted text
> This is an odd case, but I think the fix is pretty trivial. Here's > what I'm hoping for: On line 110 of LWP::Protocol, in collect(), > the line: > > if (!defined($arg) || (!$response->is_success) { > > might be better as: > > if (!defined($arg) || > (!$response->is_success and !$response->is_redirect)) { > > Now, here's the long story of why: > > I recently ran across a malformed response from a public HTTP server. > When I send: > > GET /talbycodes?on_url=http://online.com/&off_url=http://offline.com/ > HTTP/1.1 > Host: big.oscar.aol.com > User-Agent: unimportant > Keep-Alive: 300 > > to port 80 on big.oscar.aol.com > > It responds with: > > HTTP/1.1 302 Redirection > Location: http://online.com/ > > IMG SRC=http://online.com/ > > but because I requested Keep-Alive it keeps the connection open > waiting for another request. The stupid thing doesn't give me a > content length, or chunk it's content, so I have no idea this > request is finished until I wait for the socket to close. > > When this request is placed through LWP::UserAgent with say: > > my $ua = LWP::UserAgent->new( > keep_alive => 5, > timeout => 10, > agent => 'unimportant', > ); > my($user, $on, $off) = qw(talbycodes http://site.com/on > http://site.com/off); > my $q = HTTP::Request->new( > GET => "http://big.oscar.aol.com/$user?
on_url=$on&off_url=$off", Show quoted text
> ); > print $ua->send_request($q)->as_string; > > I get this response from the script: > > 500 (Internal Server Error) read timeout > Content-Type: text/plain > Client-Date: Thu, 16 Jun 2005 18:50:50 GMT > Client-Warning: Internal response > > Which makes sense, but if I instead use: > > my $i = 0; > $ua->send_request($q, sub { > my($chunk, $res, $proto) = @_; > print $res->as_string unless $i++; > print "$chunk\n" > }); > > It would be nice to be able to see the content-so-far because for a > 302, it doesn't matter what the content is and upon seeing the > headers I could have closed the connection and moved on (as > browsers appear to do). > > My problem is that the content_cb doesn't get called in this case > because $response->is_success is not true for this response, so > maybe $response->is_redirect should also count for content > streaming in LWP::Protocol::collect(). Alternatively, what I'd > really like is a callback soon as the headers have been read.
Most Show quoted text
> of the time I add a content_cb it's to collect the headers early.
From: talby [...] trap.mtview.ca.us
Upon reviewing this in light of having reread RFC 2616, Sec 4.3 under the interpretation I think is valid in light of bug #17907, I think possibly, the LWP::Protocol::collect() code might not need to treat ->is_success responses differently from other response types at all. Those that have content should have a Content-Length or a Transfer-Encoding header, those that do not, should not have the header unless the request method was HEAD. I suspect the response code doesn't need to factor into how the response is processed here. This is just a hypothesis, but I will try to find some counter-examples. Anyhow, just food for thought. Thanks for your time.
On Thu Jun 16 15:19:15 2005, guest wrote: Show quoted text
> This is an odd case, but I think the fix is pretty trivial. Here's > what I'm hoping for: On line 110 of LWP::Protocol, in collect(), > the line: > > if (!defined($arg) || (!$response->is_success) { > > might be better as: > > if (!defined($arg) || > (!$response->is_success and !$response->is_redirect)) { > > Now, here's the long story of why: > > I recently ran across a malformed response from a public HTTP server. > When I send: > > GET /talbycodes?on_url=http://online.com/&off_url=http://offline.com/ > HTTP/1.1 > Host: big.oscar.aol.com > User-Agent: unimportant > Keep-Alive: 300 > > to port 80 on big.oscar.aol.com > > It responds with: > > HTTP/1.1 302 Redirection > Location: http://online.com/ > > IMG SRC=http://online.com/ > > but because I requested Keep-Alive it keeps the connection open > waiting for another request. The stupid thing doesn't give me a > content length, or chunk it's content, so I have no idea this > request is finished until I wait for the socket to close. > > When this request is placed through LWP::UserAgent with say: > > my $ua = LWP::UserAgent->new( > keep_alive => 5, > timeout => 10, > agent => 'unimportant', > ); > my($user, $on, $off) = qw(talbycodes http://site.com/on > http://site.com/off); > my $q = HTTP::Request->new( > GET => "http://big.oscar.aol.com/$user?
on_url=$on&off_url=$off", Show quoted text
> ); > print $ua->send_request($q)->as_string; > > I get this response from the script: > > 500 (Internal Server Error) read timeout > Content-Type: text/plain > Client-Date: Thu, 16 Jun 2005 18:50:50 GMT > Client-Warning: Internal response > > Which makes sense, but if I instead use: > > my $i = 0; > $ua->send_request($q, sub { > my($chunk, $res, $proto) = @_; > print $res->as_string unless $i++; > print "$chunk\n" > }); > > It would be nice to be able to see the content-so-far because for a > 302, it doesn't matter what the content is and upon seeing the > headers I could have closed the connection and moved on (as > browsers appear to do). > > My problem is that the content_cb doesn't get called in this case > because $response->is_success is not true for this response, so > maybe $response->is_redirect should also count for content > streaming in LWP::Protocol::collect(). Alternatively, what I'd > really like is a callback soon as the headers have been read.
Most Show quoted text
> of the time I add a content_cb it's to collect the headers early.
On Thu Jun 16 15:19:15 2005, guest wrote: Show quoted text
> This is an odd case, but I think the fix is pretty trivial. Here's > what I'm hoping for: On line 110 of LWP::Protocol, in collect(), > the line: > > if (!defined($arg) || (!$response->is_success) { > > might be better as: > > if (!defined($arg) || > (!$response->is_success and !$response->is_redirect)) { > > Now, here's the long story of why: > > I recently ran across a malformed response from a public HTTP server. > When I send: > > GET /talbycodes?on_url=http://online.com/&off_url=http://offline.com/ > HTTP/1.1 > Host: big.oscar.aol.com > User-Agent: unimportant > Keep-Alive: 300 > > to port 80 on big.oscar.aol.com > > It responds with: > > HTTP/1.1 302 Redirection > Location: http://online.com/ > > IMG SRC=http://online.com/ > > but because I requested Keep-Alive it keeps the connection open > waiting for another request. The stupid thing doesn't give me a > content length, or chunk it's content, so I have no idea this > request is finished until I wait for the socket to close. > > When this request is placed through LWP::UserAgent with say: > > my $ua = LWP::UserAgent->new( > keep_alive => 5, > timeout => 10, > agent => 'unimportant', > ); > my($user, $on, $off) = qw(talbycodes http://site.com/on > http://site.com/off); > my $q = HTTP::Request->new( > GET => "http://big.oscar.aol.com/$user?
on_url=$on&off_url=$off", Show quoted text
> ); > print $ua->send_request($q)->as_string; > > I get this response from the script: > > 500 (Internal Server Error) read timeout > Content-Type: text/plain > Client-Date: Thu, 16 Jun 2005 18:50:50 GMT > Client-Warning: Internal response > > Which makes sense, but if I instead use: > > my $i = 0; > $ua->send_request($q, sub { > my($chunk, $res, $proto) = @_; > print $res->as_string unless $i++; > print "$chunk\n" > }); > > It would be nice to be able to see the content-so-far because for a > 302, it doesn't matter what the content is and upon seeing the > headers I could have closed the connection and moved on (as > browsers appear to do). > > My problem is that the content_cb doesn't get called in this case > because $response->is_success is not true for this response, so > maybe $response->is_redirect should also count for content > streaming in LWP::Protocol::collect(). Alternatively, what I'd > really like is a callback soon as the headers have been read.
Most Show quoted text
> of the time I add a content_cb it's to collect the headers early.