
This queue is for tickets about the libwww-perl CPAN distribution.

Report information
The Basics
Id: 22839
Status: resolved
Priority: 0/
Queue: libwww-perl

People
Owner: Nobody in particular
Requestors: mark.zealey [...] pipex.net
Cc: SREZIC [...] cpan.org
AdminCc:

Bug Information
Severity: (no value)
Broken in:
  • 5.805
  • 5.835
Fixed in: (no value)



Subject: Bug with SSL timeouts in 5.805
Date: Mon, 6 Nov 2006 11:04:58 -0000
To: <bug-libwww-perl [...] rt.cpan.org>
From: "Mark Zealey" <mark.zealey [...] pipex.net>
Hi there,

I've discovered that if I try to connect using Net::HTTPS to a machine that is down, the timeout value is ignored. This is because during the IO::Socket::INET->configure() stage, it tries to connect() to the remote machine. IO::Socket->connect() is called for this, and if a timeout is specified, it tries to connect in a non-blocking manner in order to enforce the timeout. Net::HTTPS line 53 redefines blocking() as a no-op, explaining that the underlying SSL classes don't work if the socket is placed in non-blocking mode.

I'm not entirely sure of the fix for this, but perhaps at least a warning should be issued from the subroutine, and the behaviour documented somewhere? I suppose the correct way to handle this situation is to run some test first to see whether the remote machine is responding, but that is difficult if the firewall only permits https access and disallows pings, for example. I guess in that case the best way is to try a plain IO::Socket connect to the port first, and then reconnect using the full Net::HTTPS module. Do you have any better suggestions?

Thanks,

Mark Zealey
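[Editorial note: for context, the timeout mechanism described above — switch the socket to non-blocking, start connect(), then wait with select() until the deadline — is roughly what the IO::Socket layer does when a Timeout is given. A minimal Python sketch of that pattern; the function name and error handling are illustrative, not LWP code:]

```python
import select
import socket

def connect_with_timeout(host, port, timeout):
    """Non-blocking connect with a deadline: the strategy that a no-op
    blocking() defeats, because the socket then never leaves blocking mode."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setblocking(False)           # if this were a no-op, connect() below
                                   # would block for the full OS timeout
    try:
        s.connect((host, port))    # returns immediately, "in progress"
    except BlockingIOError:
        pass
    _, writable, _ = select.select([], [s], [], timeout)
    if not writable:               # deadline passed, still not connected
        s.close()
        raise TimeoutError("connect to %s:%d timed out after %ss"
                           % (host, port, timeout))
    err = s.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err:                        # connect finished, but with an error
        s.close()
        raise OSError(err, "connect failed")
    s.setblocking(True)            # hand back a blocking socket again
    return s
```

[With the setblocking() step turned into a no-op, connect() simply blocks until the operating system gives up, which is the long hang reported in this ticket.]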
On Mon Nov 06 06:24:28 2006, mark.zealey@pipex.net wrote:
> [...]
It seems that this problem still persists (just tried it on FreeBSD using LWP 5.823). I found a comprehensive bug report with a fix suggestion in the Red Hat bug tracker: https://bugzilla.redhat.com/show_bug.cgi?id=460716

Regards,
Slaven
On Sun Feb 01 16:47:52 2009, SREZIC wrote:
> [...]
It seems that the problem can be reproduced with this script (tested on Debian, Red Hat, and FreeBSD with different perl and LWP versions):

    #!perl
    use strict;
    use LWP::UserAgent;
    #use Net::HTTPS;use Sub::Delete;delete_sub 'Net::HTTPS::blocking';
    my $ua = LWP::UserAgent->new;
    $ua->timeout(1);
    my $resp = $ua->get("https://www.example.com");
    warn $resp->as_string;
    __END__

If you activate the monkeypatch (the commented-out line with delete_sub), the timeout works as expected. So the attached patch could solve the problem.

Regards,
Slaven
From 4199a89d8c5e8500aee31957f708a58c20594bcb Mon Sep 17 00:00:00 2001
From: Slaven Rezic <srezic@iconmobile.com>
Date: Wed, 4 Feb 2009 12:43:39 +0100
Subject: [PATCH] https timeout fix, applied (one) patch proposed in
 https://bugzilla.redhat.com/show_bug.cgi?id=460716

---
 lib/Net/HTTPS.pm |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/lib/Net/HTTPS.pm b/lib/Net/HTTPS.pm
index bfed714..6361024 100644
--- a/lib/Net/HTTPS.pm
+++ b/lib/Net/HTTPS.pm
@@ -54,6 +54,6 @@ sub http_default_port {
 # The underlying SSLeay classes fails to work if the socket is
 # placed in non-blocking mode. This override of the blocking
 # method makes sure it stays the way it was created.
-sub blocking { } # noop
+#sub blocking { } # noop
 
 1;
-- 
1.5.6.5
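[Editorial note: mechanically, `sub blocking { }` in Net::HTTPS shadows the real blocking() inherited from the IO::Socket layer, so the timeout code ends up calling a no-op; deleting the sub (what Sub::Delete's delete_sub does, and what this patch achieves by commenting the sub out) lets method dispatch fall through to the superclass again. A Python analog of that dispatch behavior — class names are illustrative:]

```python
class Socket:                       # stands in for IO::Socket
    def blocking(self, value):
        self._blocking = value      # the real implementation
        return value

class HTTPS(Socket):                # stands in for Net::HTTPS
    def blocking(self, value):      # no-op override, like `sub blocking { }`
        pass

s = HTTPS()
s.blocking(False)                   # the override silently swallows the call
print(hasattr(s, "_blocking"))      # → False

del HTTPS.blocking                  # analog of the delete_sub step
s.blocking(False)                   # now dispatches to Socket.blocking
print(s._blocking)                  # → False
```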
I'm still reluctant, as I don't understand under what conditions this breaks stuff. What combinations of LWP and SSL modules would not be able to process https:// URLs if I applied this patch?
On Wed Feb 04 07:55:37 2009, GAAS wrote:
> [...]
At least the timeout issue happens with both IO::Socket::SSL and Net::SSL. See the attached script with some results; I checked an older and the current version of IO::Socket::SSL, and the current version of Net::SSL.

Of course, this is only the timeout case. I don't know if something else may happen during "normal" https transmission. I can only volunteer to put the monkeypatch on our company's servers and see if everything works OK.

Regards,
Slaven
#!perl
use strict;
no strict 'refs';
use LWP::UserAgent;
use Net::HTTPS;
warn "Net::HTTPS::SSL_SOCKET_CLASS=$Net::HTTPS::SSL_SOCKET_CLASS Version " .
    ${$Net::HTTPS::SSL_SOCKET_CLASS . "::VERSION"} . "\n";
#use Sub::Delete;delete_sub 'Net::HTTPS::blocking'; # <-- toggle this line
my $ua = LWP::UserAgent->new;
$ua->timeout(1);
my $resp = $ua->get("https://www.example.com");
warn $resp->as_string;
__END__
=pod

Summary

| SSL Socket class     | with blocking() | blocking() deleted |
|----------------------+-----------------+--------------------|
| IO::Socket::SSL 1.02 | 189s            | ~1s                |
| IO::Socket::SSL 1.22 | 189s            | 1.148s             |
| Net::SSL 2.84        | 189s            | 1.142s             |

With Net::HTTPS::blocking deleted and no Crypt::SSLeay installed
(answer after about one second):

  Net::HTTPS::SSL_SOCKET_CLASS=IO::Socket::SSL Version 1.02
  500 Can't connect to www.example.com:443 (connect: timeout)
  Content-Type: text/plain
  Client-Date: Tue, 10 Feb 2009 10:03:08 GMT
  Client-Warning: Internal response

  500 Can't connect to www.example.com:443 (connect: timeout)

With Net::HTTPS::blocking not deleted:

  Net::HTTPS::SSL_SOCKET_CLASS=IO::Socket::SSL Version 1.02
  500 Can't connect to www.example.com:443 (connect: Connection timed out)
  Content-Type: text/plain
  Client-Date: Tue, 10 Feb 2009 10:07:06 GMT
  Client-Warning: Internal response

  500 Can't connect to www.example.com:443 (connect: Connection timed out)

  perl /tmp/lwp.pl  0,14s user 0,01s system 0% cpu 3:09,15 total

Upgrade to IO::Socket::SSL 1.22, ::blocking deleted:

  Net::HTTPS::SSL_SOCKET_CLASS=IO::Socket::SSL Version 1.22
  500 Can't connect to www.example.com:443 (connect: timeout)
  Content-Type: text/plain
  Client-Date: Tue, 10 Feb 2009 10:09:17 GMT
  Client-Warning: Internal response

  500 Can't connect to www.example.com:443 (connect: timeout)

  perl /tmp/lwp.pl  0,13s user 0,02s system 12% cpu 1,148 total

IO::Socket::SSL 1.22, ::blocking not deleted:

  Net::HTTPS::SSL_SOCKET_CLASS=IO::Socket::SSL Version 1.22
  500 Can't connect to www.example.com:443 (connect: Connection timed out)
  Content-Type: text/plain
  Client-Date: Tue, 10 Feb 2009 10:12:47 GMT
  Client-Warning: Internal response

  500 Can't connect to www.example.com:443 (connect: Connection timed out)

  perl /tmp/lwp.pl  0,13s user 0,02s system 0% cpu 3:09,16 total

Crypt-SSLeay-0.57 installed, ::blocking deleted:

  Net::HTTPS::SSL_SOCKET_CLASS=Net::SSL Version 2.84
  500 Connect failed: connect: timeout; Loš opisnik datoteke
  Content-Type: text/plain
  Client-Date: Tue, 10 Feb 2009 10:34:07 GMT
  Client-Warning: Internal response

  500 Connect failed: connect: timeout; Loš opisnik datoteke

  perl /tmp/lwp.pl  0,11s user 0,01s system 10% cpu 1,142 total

("Loš opisnik datoteke" is a localized "Bad file descriptor".)

Crypt-SSLeay-0.57 installed, ::blocking not deleted:

  Net::HTTPS::SSL_SOCKET_CLASS=Net::SSL Version 2.84
  500 Connect failed: connect: Connection timed out; Connection timed out
  Content-Type: text/plain
  Client-Date: Tue, 10 Feb 2009 10:37:32 GMT
  Client-Warning: Internal response

  500 Connect failed: connect: Connection timed out; Connection timed out

  perl /tmp/lwp.pl  0,10s user 0,02s system 0% cpu 3:09,13 total

=cut
On Tue Feb 10 06:19:02 2009, SREZIC wrote:
> [...]
Well, the experiment failed. I had at least one case where HTTPS connections were completely broken with the patch. All I got was a "500 read failed". So another solution is necessary.

Regards,
Slaven
I tried the Sub::Delete hack in a test with Test::WWW::Mechanize. Every reachable url failed immediately with 500/"Read failed: " (blank), until the one that was down, where the timeout worked correctly but the request failed with 500/"Connect failed: connect: timeout; Connection timed out". Then all subsequent attempts to get reachable https urls failed immediately with a mix of 500/"Connect failed: connect: Connection refused; Connection refused" and 500/"Read failed: " errors. Weird. It did not matter whether I re-used the Test::WWW::Mechanize (LWP::UserAgent) object or re-created it for each test; the errors were the same with the Sub::Delete hack.

If I do not use Sub::Delete on Net::HTTPS::blocking, it works fine, it just takes forever. But the curious thing is, it only takes the full timeout on the first attempt to get an https:// url that fails. Subsequent attempts to get other https:// urls where no server is listening fail correctly and immediately with 500/"Connect failed: connect: Connection refused; Connection refused", while valid ones still work fine.

My script is looping through server names with foreach. I tried using:

    eval {
        local $SIG{ALRM} = sub { die "blah\n" };
        alarm $timeout;
        $mech->get_ok("https://notworking.somevalidfqdn.com");
        alarm 0;
    };

before the loop. That seems to work, but subsequent attempts to fetch non-working https:// urls still engage the timeout (unless they are alarmed too). However, if I do it this way:

    TRY_HTTPS_HOST: for my $host (@hosts) {
        local $SIG{ALRM} = sub { next TRY_HTTPS_HOST };
        alarm $timeout;
        $mech->get_ok("https://$host");
        alarm 0;
    }

then on the first url it reaches which cannot connect, that request fails after the alarm timeout with 500/"Exiting subroutine via next"... but then, strangely, all subsequent attempts to get non-working urls fail immediately (no timeout wait) with 500/"Connect failed: connect: Connection refused; Connection refused", and all attempts to get working urls pass ok. Weird.

Something about using next from the alarm sub on the first bad url sets something right, so that everything works as you'd expect from then on.

Mark
BTW, this is perl 5.8.8 on CentOS with LWP::UserAgent 5.835.
I'm lame; the alarm is a good way out if you don't use eval. This seems to work, though I'm not sure why it doesn't completely die, but continues to work:

    for my $host (@hosts) {
        my $url = "https://$host";
        local $SIG{ALRM} = sub { die "get $url failed on alarm\n" };
        alarm 5;
        $mech->get_ok($url);
        warn "passed get_ok for $url\n";
        alarm 0;
        my $response = $mech->response;
        ok($response->code == 200, 'foo');
        # ...
    }

And it only alarms out on the first try. Subsequent iterations either work for urls that work, or fail with 500/"Connect failed: connect: Connection refused; Connection refused" if the url is not listening. "passed get_ok for $url" always fires, so I guess the alarm is going off and dying from a point inside get_ok() that is wrapped in an eval, and that's why the string from die() turns up in the mech response. Anyway, it seems to be a good workaround.

Mark
From: sdziegie [...] stronglg.demon.co.uk
Is there any consensus on the best way forward? I am seeing exactly the same problem, with https timeouts ignoring $ua->timeout(5) and taking the full OS timeout on the system connect() calls: 189 seconds on openSUSE 11.4 and 225 seconds on Solaris. http timeouts work as expected.

I agree that the 'sub blocking { } # noop' in Net/HTTPS.pm is the cause, and I have found that commenting it out on openSUSE has the desired effect of timing out the https call to connect() after 5 seconds. I will try this out on Solaris at work tomorrow and hope to get the same effect.

We use LWP::UserAgent to monitor a combination of http and https hosts, and as things stand we only find out that an https host is down after 189/225 seconds; this is precisely the information that we would like to discover quickly.

openSUSE 11.4, perl 5.12.3, LWP::UserAgent 5.835
Solaris 5.10, perl 5.6.1, LWP::UserAgent 2.001 (?)

I am happy to keep investigating but would appreciate a steer in the correct direction.

Thanks,
Stanley
RT-Send-CC: SREZIC [...] cpan.org
On Wed Mar 04 10:40:47 2009, SREZIC wrote:
> Well, the experiment failed. I had at least one case where HTTPS
> connections were completely broken with the patch. All I got was a
> "500 read failed".
It would help if you would attach a test case for this, so any alternate solutions can be tested against the case.
A proposed fix is in CPAN RT #72509.