Subject: Bug with SSL against keep-alive servers with long enough timeouts
Date: Mon, 25 Jan 2016 23:52:30 +0000
To: bug-POE-Component-Client-HTTP [...] rt.cpan.org
From: Athanasius <perl [...] miggy.org>
Using latest POE from CPAN:
POE v1.367
POE::Component::Client::HTTP v0.949
This is perl 5, version 14, subversion 2 (v5.14.2) built for x86_64-linux-gnu-thread-multi
Perl itself is the Debian 'wheezy' (version 7) package, but note that the
POE modules are all up to date, installed with cpanp into /usr/local/.
I first noticed the issue while implementing some URL parsing in an IRC
bot: it took around a minute to retrieve
https://community.elitedangerous.com/galnet/uid/56a60d089657ba197a730a88
Further testing showed some weird behaviour: the perl side of things had
not yet read all the data and select() was saying "nothing ready to
read", yet strace on the process showed read(2) had already been called
for the entire amount. After much head-scratching I can only conclude
that, the way POE employs SSL, select() is on the raw file descriptor
and quite correctly returns "nothing to read" there, while perl's
sysread() has not actually yet read all of the decrypted data.
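
To illustrate the general effect without needing an SSL server, here is
a minimal sketch by analogy. It is not POE or SSL code: a pipe read
with perl's buffered readline stands in for the SSL layer, and the
variable names are made up for the example. Once the buffered layer has
pulled everything off the descriptor, select() sees nothing to read
even though a complete line is still waiting in user space.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Handle;
use IO::Select;

# Analogy only: perl's buffered readline plays the part of the SSL layer.
pipe(my $reader, my $writer) or die "pipe: $!";
$writer->autoflush(1);
print {$writer} "first\nsecond\n";    # both lines arrive in one write

my $sel = IO::Select->new($reader);

# Before any read, the kernel has bytes waiting on the descriptor.
my @before       = $sel->can_read(0);
my $ready_before = scalar @before;

# One buffered read pulls BOTH lines off the descriptor into user space,
# but hands back only the first.
my $line1 = <$reader>;

# The descriptor is now empty (and the writer is still open, so no EOF),
# so select() reports nothing ready...
my @after       = $sel->can_read(0);
my $ready_after = scalar @after;

# ...even though the second line is sitting in the buffer.
my $line2 = <$reader>;

print "ready before read: $ready_before\n";
print "ready after read:  $ready_after\n";
print "still buffered:    $line2";
```

An event loop that only reads when select() fires would not deliver the
second line here until something else happened on the descriptor, which
matches the delay described above.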
Note that the "about a minute" seems to be this particular web server's
keep-alive timeout. As such, the issue can be worked around by sending
a "Connection: close" header to disable keep-alive. The connection
close event then makes select() report the descriptor as readable,
which triggers more sysread() calls until the entire data has been
read. I'm not sure whether a large enough plaintext/encrypted byte
count mismatch could cause some data never to be read at all, or
whether the connection having been closed will keep select() tickling
sysread() into action until it's all read.
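
For reference, a minimal sketch of building a request with that
workaround header, using the array-reference form of headers that
HTTP::Request->new() accepts. The variable names are only illustrative.

```perl
use strict;
use warnings;
use HTTP::Request;

my $url = 'https://community.elitedangerous.com/galnet/uid/56a60d089657ba197a730a88';

# The third argument is a reference to a list of header name/value pairs.
my $req = HTTP::Request->new('GET', $url, [ 'Connection' => 'close' ]);

print $req->as_string;
```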
The example code is very simple. All it takes is that SSL
encrypted/plaintext mismatch combined with a keep-alive server that has
a long timeout:
---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<
#!/usr/bin/perl -w
# vim: textwidth=0 wrapmargin=0 shiftwidth=2 tabstop=2 expandtab
use strict;
use POE;
use POE::Component::Client::HTTP;
use HTTP::Request;
use POSIX qw/strftime/;
POE::Component::Client::HTTP->spawn(
  Alias => 'bugtest',
  Agent => 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.71 Safari/537.36',
);

POE::Session->create(inline_states => {
  _start => sub { $_[KERNEL]->yield('request_galnet') },
  request_galnet => \&request_galnet,
  response => \&response_handler }
);

# Called by the HTTP component with the request and response packets.
sub response_handler {
  my ($req, $response) = @_[ARG0, ARG1];
  my $res = $response->[0];

  if (! $res->is_success) {
    my $error = "Failed to retrieve URL - ";
    # Connection-level failures are reported in the X-PCCH-Errmsg header.
    if (defined($res->header('X-PCCH-Errmsg')) and $res->header('X-PCCH-Errmsg') =~ /Connection to .* failed: [^\s]+ error (?<errornum>\?\?|[0-9]+): (?<errorstr>.*)$/) {
      $error .= $+{'errornum'} . ": " . $+{'errorstr'};
    } else {
      $error .= $res->status_line;
    }
    warn(strftime("%Y-%m-%d %H:%M:%S %z - ", localtime(time())), $error);
    return undef;
  }

  # Guard against a missing Content-Type header to avoid an undef warning.
  if (($res->header('Content-Type') // '') =~ /^text\/(ht|x)ml/) {
    print(strftime("%Y-%m-%d %H:%M:%S %z - ", localtime(time())), "URL successfully retrieved\n");
    exit(0);
  } else {
    print(strftime("%Y-%m-%d %H:%M:%S %z - ", localtime(time())), "That was not an HTML page\n");
  }
}

# Post the request; the timestamps show the delay until the response arrives.
sub request_galnet {
  my $req = HTTP::Request->new('GET', 'https://community.elitedangerous.com/galnet/uid/56a60d089657ba197a730a88'); #, [ "Connection" => "close" ]);
  print(strftime("%Y-%m-%d %H:%M:%S %z - ", localtime(time())), "Posting request...\n");
  $_[KERNEL]->post( 'bugtest', 'request', 'response', $req);
}

POE::Kernel->run();
---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<
Presumably some POE code needs adjusting such that when SSL is employed
select() will still return "data to read" on the affected file
descriptor(s).
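
One common shape for such a fix, sketched here under assumptions rather
than taken from POE's code: after select() fires, keep reading for as
long as the SSL layer reports already-decrypted bytes pending.
IO::Socket::SSL documents a pending() method for this purpose; whether
POE's SSL handling exposes an equivalent hook is an assumption on my
part. The MockTLS class and drain_ready() helper below are hypothetical
stand-ins so the sketch runs without a network connection, and error
and non-blocking handling are omitted.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an SSL socket: all plaintext has already been
# decrypted into a user-space buffer, so select() on the underlying
# descriptor would have nothing further to report.
{
  package MockTLS;
  sub new     { my ($class, $plain) = @_; return bless { plain => $plain, reads => 0 }, $class }
  sub pending { return length $_[0]{plain} }    # decrypted bytes not yet handed out
  sub reads   { return $_[0]{reads} }           # number of sysread() calls so far
  sub sysread {
    my ($self, undef, $len) = @_;
    $self->{reads}++;
    # Fill the caller's buffer (aliased via $_[1]) with up to $len bytes.
    $_[1] = substr($self->{plain}, 0, $len, '');
    return length $_[1];
  }
}

# Hypothetical helper: read once, then keep reading while the SSL layer
# says decrypted data is still buffered, instead of going back to select().
sub drain_ready {
  my ($sock, $chunk) = @_;
  my $data = '';
  while (1) {
    my $n = $sock->sysread(my $buf, $chunk);
    last unless $n;                 # undef (error) or 0 (EOF)
    $data .= $buf;
    last unless $sock->pending;     # nothing left buffered in user space
  }
  return $data;
}

my $sock = MockTLS->new('x' x 10_000);
my $data = drain_ready($sock, 4096);
printf("read %d bytes in %d sysread() calls\n", length($data), $sock->reads);
```

With a single 4096-byte sysread() per select() wakeup, the remainder
would sit in the buffer until the descriptor became readable again,
which in the case above only happens when the server closes the
connection.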
In detail, after much debugging...
1) POE/Driver/SysRW.pm get() calls sysread(). In testing the
last call to this before the pause was for 4096 bytes, and it
got that many. A strace, along with POE debug warn()s enabled
shows that by this point an actual read(2) has read all the data
(the subsequent read(2) when the server closes the connection
receives zero bytes for EOF).
2) That was set up to be called from POE/Wheel/ReadWrite.pm
_define_read_state().
3) Presumably this then relies on POE/Loop/Select.pm
loop_do_timeslice() detecting that the relevant file descriptor
is ready for sysread() to call the in-line subroutine containing
the ->get() invocation, and loop to the filter/handler.
All of that assumes that it's actually sysread() getting called, and
not something that the SSL-ification has intercepted.
--
- Athanasius = Athanasius(at)miggy.org / http://www.miggy.org/
Finger athan(at)fysh.org for PGP key
"And it's me who is my enemy. Me who beats me up.
Me who makes the monsters. Me who strips my confidence." Paula Cole - ME