
This queue is for tickets about the Net-Server CPAN distribution.

Report information
The Basics
Id: 85308
Status: open
Priority: 0/
Queue: Net-Server

People
Owner: Nobody in particular
Requestors: x.guimard [...] free.fr
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in:
  • 2.006
  • 2.007
Fixed in: (no value)



Subject: applications die() because of SIGCHLD vs. SIGCLD confusion
Hi,

a Debian user reports the following (http://bugs.debian.org/708180):

From: "Steinar H. Gunderson" <sgunderson@bigfoot.com>
To: Debian Bug Tracking System <submit@bugs.debian.org>
Subject: applications die() because of SIGCHLD vs. SIGCLD confusion
Date: Mon, 13 May 2013 21:28:14 +0200

Package: libnet-server-perl
Version: 2.006-1
Severity: grave

Hi,

I have a starman application that regularly dies under load with a backtrace like:

  Message: Can't use string ("") as a subroutine ref while "strict refs" in use at /usr/share/perl5/Net/Server/SIG.pm line 72, <$read> line 2002.
   at /usr/share/perl5/Net/Server/SIG.pm line 72
  Net::Server::SIG::check_sigs() called at /usr/share/perl5/Net/Server/PreForkSimple.pm line 337
  Net::Server::PreForkSimple::close_children('Starman::Server=HASH(0x84e7a8)') called at /usr/share/perl5/Net/Server.pm line 735
  Net::Server::server_close('Starman::Server=HASH(0x84e7a8)') called at /usr/share/perl5/Starman/Server.pm line 124
  (...)

Some debugging shows that the problem is that several Net::Server modules (e.g. Net::Server::PreForkSimple) try to trap signals with Net::Server::SIG using “CHLD” as the signal name. Net::Server::SIG in turn adds a signal handler like this:

  $SIG{$sig} = sub{ $Net::Server::SIG::_SIG{$_[0]} = 1; };

However, when that sub is called, it is called with “CLD” as signal, not “CHLD” (seemingly Perl has two names for this). This in turn confuses check_sigs, which wants to do this with $sig set to “CLD”:

  $_SIG_SUB{$sig}->($sig);

whereupon the crash happens.

I think the smallest fix around this is something like

  $sig = 'CLD' if ($sig eq 'CHLD');

in the top of register_sig(), but I might be mistaken.

In any case, this makes Net::Server::PreForkSimple, and probably several others, rather unusable, since they may crash at almost any time, and there is no simple workaround that I can see. (Thus the RC severity.)
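The failure described in the report can be modelled without delivering a real signal. The following is only a sketch: `register_sig` and `check_sigs` here are simplified, hypothetical stand-ins written for illustration (they are not the Net::Server::SIG source), and the handler is called by hand with "CLD" to imitate what the reporter observed.

```perl
use strict;
use warnings;

# Hypothetical, simplified stand-ins for the internals described in the
# report; this is NOT the actual Net::Server::SIG code.
my %pending;    # signal name => 1 once a signal has been seen
my %callback;   # signal name => code ref registered by the caller

sub register_sig {
    my ($sig, $code) = @_;
    $callback{$sig} = $code;
    # Records whatever name the handler receives in $_[0].
    return sub { $pending{ $_[0] } = 1 };
}

sub check_sigs {
    for my $sig (sort keys %pending) {
        next unless delete $pending{$sig};
        $callback{$sig}->($sig);    # dies if nothing is registered under $sig
    }
}

my $reaped  = 0;
my $handler = register_sig(CHLD => sub { $reaped++ });

# Imitate the handler being invoked with the alias "CLD" instead of "CHLD".
$handler->('CLD');

my $error = eval { check_sigs(); 1 } ? '' : $@;
print "check_sigs failed: $error" if $error;
```

In this model the callback stored under "CHLD" is never run, because the pending flag was recorded under "CLD", a name nothing was registered for.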
Subject: Re: [rt.cpan.org #85308] applications die() because of SIGCHLD vs. SIGCLD confusion
Date: Wed, 15 May 2013 07:07:19 -0600
To: bug-Net-Server [...] rt.cpan.org
From: Paul Seamons <paul [...] seamons.com>
Likely the better solution would be to change that code to this:

  $SIG{$sig} = sub{ $Net::Server::SIG::_SIG{$sig} = 1; };

That way, perl remains consistent and whatever awry value is being returned on your platform is ignored.

I am a bit interested to know what platform you are running on. We have thousands of people running Net::Server on many platforms and you are the first to report the error. I personally have had some servers in use for over a decade, several getting millions of hits a month with spikes to hundreds of connections per second, but have not seen this issue.

Either way - I think it is good to make the closure just depend on the value it already knows about at creation. I'm afraid changing it to explicitly look for CLD would break many platforms.

So expect the fix in 2.008 which we'll release soon.

Paul
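The effect of closing over `$sig` rather than reading `$_[0]` can be sketched in isolation. As before, `register_sig` and `check_sigs` are hypothetical simplified helpers for illustration, not the module's real code, and the handler is invoked by hand with "CLD" to simulate the alias being passed.

```perl
use strict;
use warnings;

# Hypothetical simplified model; NOT the actual Net::Server::SIG source.
my %pending;    # signal name => 1 once a signal has been seen
my %callback;   # signal name => code ref registered by the caller

sub register_sig {
    my ($sig, $code) = @_;
    $callback{$sig} = $code;
    # Close over $sig: the name used at registration is recorded,
    # whatever name arrives in $_[0].
    return sub { $pending{$sig} = 1 };
}

sub check_sigs {
    for my $sig (sort keys %pending) {
        next unless delete $pending{$sig};
        $callback{$sig}->($sig);
    }
}

my $reaped  = 0;
my $handler = register_sig(CHLD => sub { $reaped++ });

$handler->('CLD');    # simulated delivery under the alias name
check_sigs();
print "callbacks run: $reaped\n";    # prints "callbacks run: 1"
```

Because the pending flag and the callback are now keyed by the same registered name, the argument passed to the handler no longer matters in this model.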
Subject: Bug#708180: Info received (Fwd: Re: [rt.cpan.org #85308] applications die() because of SIGCHLD vs. SIGCLD confusion)
Date: Thu, 16 May 2013 04:03:09 +0000
To: 708180 [...] bugs.debian.org, bug-Net-Server [...] rt.cpan.org
From: owner [...] bugs.debian.org (Debian Bug Tracking System)
Thank you for the additional information you have supplied regarding this Bug report. This is an automatically generated reply to let you know your message has been received. Your message is being forwarded to the package maintainers and other interested parties for their attention; they will reply in due course. Your message has been sent to the package maintainer(s): Debian Perl Group <pkg-perl-maintainers@lists.alioth.debian.org> If you wish to submit further information on this problem, please send it to 708180@bugs.debian.org. Please do not send mail to owner@bugs.debian.org unless you wish to report a problem with the Bug-tracking system. -- 708180: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=708180 Debian Bug Tracking System Contact owner@bugs.debian.org with problems
Subject: Re: [rt.cpan.org #85308] applications die() because of SIGCHLD vs. SIGCLD confusion
Date: Thu, 16 May 2013 08:57:57 +0200
To: 708180 [...] bugs.debian.org, bug-Net-Server [...] rt.cpan.org
From: "Steinar H. Gunderson" <sgunderson [...] bigfoot.com>
On Thu, May 16, 2013 at 05:58:14AM +0200, Xavier wrote:
> Likely the better solution would be to change that code to this:
>
>   $SIG{$sig} = sub{ $Net::Server::SIG::_SIG{$sig} = 1; };
>
> That way, perl remains consistent and whatever awry value is being
> returned on your platform is ignored.

Maybe, but only if all callers are consistent in using either SIGCHLD or SIGCLD. If not, the signal handler will be overwritten, and then you are back to the same problem again. Unless that can never happen?

> I am a bit interested to know what platform you are running on. We have
> thousands of people running Net::Server on many platforms and you are
> the first to report the error. I personally have had some servers in
> use for over a decade, several getting millions of hits a month with
> spikes to hundreds of connections per second, but have not seen this issue.

Debian wheezy, amd64 (that's Perl v5.14.2). My guess is that possibly, Perl changed the name of the signal delivered, which is why it hasn't happened before, but that's just a guess.

/* Steinar */
--
Homepage: http://www.sesse.net/
Subject: Re: [rt.cpan.org #85308] applications die() because of SIGCHLD vs. SIGCLD confusion
Date: Thu, 16 May 2013 07:58:03 -0600
To: bug-Net-Server [...] rt.cpan.org
From: Paul Seamons <paul [...] seamons.com>
I'm not sure why the name differs. But, either way, if there is now going to be a new name, then handling it with:

  $SIG{$sig} = sub{ $Net::Server::SIG::_SIG{$sig} = 1; };

is a start, but if CLD is also necessary, then we'll need to just register both CHLD and CLD to handle children. Not much overhead - but it would handle all cases.

But before I go fully down that road, can you try the above patched line on your system and see if you get proper behavior?

Thank you.

Paul
Subject: Re: [rt.cpan.org #85308] applications die() because of SIGCHLD vs. SIGCLD confusion
Date: Sat, 18 May 2013 12:17:25 +0200
To: bug-SOAP-Lite [...] rt.cpan.org
From: Xavier <x.guimard [...] free.fr>
-------- Debian user response: --------
Subject: Re: [rt.cpan.org #85308] applications die() because of SIGCHLD vs. SIGCLD confusion
Date: Sat, 18 May 2013 11:33:33 +0200
From: Steinar H. Gunderson <sgunderson@bigfoot.com>
To: Xavier <x.guimard@free.fr>
Copy to: 708180@bugs.debian.org

On Sat, May 18, 2013 at 11:12:48AM +0200, Xavier wrote:
> is a start, but if CLD is also necessary, then we'll need to just
> register both CHLD and CLD to handle children. Not much overhead - but
> it would handle all cases.

As long as you have control over all Net::Server::SIG users, then sure, this is a possible solution (although IMHO inelegant).

> But before I go fully down that road, can you try the above patched line
> on your system and see if you get proper behavior?

I tried it already before reporting the bug, and yes, it fixes it.

/* Steinar */
--
Homepage: http://www.sesse.net/
RT-Send-CC: sgunderson [...] bigfoot.com, paul [...] seamons.com
On Thu May 16 08:58:24 2013, sgunderson@bigfoot.com wrote:
> My guess is that possibly,
> Perl changed the name of the signal delivered, which is why it hasn't
> happened before, but that's just a guess.

For the record, SIGCLD is mentioned in the signal(7) man page, i.e. it's not only a Perl problem.

All the best
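One way to check whether a given perl build knows more than one name for the same signal number is to inspect the tables in the core `Config` module. This is only a sketch; `signal_aliases` is a hypothetical helper name introduced here, and it relies on the `sig_name` and `sig_num` entries of `%Config`.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: return every signal name that shares a number
# with $name, according to the tables this perl was built with.
sub signal_aliases {
    my ($name) = @_;
    my @names = split ' ', ($Config{sig_name} // '');
    my @nums  = split ' ', ($Config{sig_num}  // '');

    my %num_for;
    @num_for{@names} = @nums;    # pair each name with its number

    return () unless defined $num_for{$name};
    return grep {
        defined $num_for{$_} && $num_for{$_} == $num_for{$name}
    } @names;
}

print join(' ', signal_aliases('CHLD')), "\n";
```

If the output lists both CHLD and CLD, the two names refer to the same signal number on that build, which matches the aliasing mentioned in signal(7).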
RT-Send-CC: sgunderson [...] bigfoot.com, 708180 [...] bugs.debian.org, paul [...] seamons.com
I spent the last few hours finding and analyzing this bug. I am also using starman on debian linux wheezy.

To me this seems to be a bug in perl itself. Try this code:

  perl -E 'my $x=sub {print "got @_\n"}; $SIG{CLD}=$x; $SIG{CHLD}=$x; exit 0 unless fork; select undef, undef, undef, 1;'

I see "CLD" when I run this code on the system perl of wheezy as well as on a self-built perl 5.18.0.

Now exchange the 2 %SIG assignments:

  perl -E 'my $x=sub {print "got @_\n"}; $SIG{CHLD}=$x; $SIG{CLD}=$x; exit 0 unless fork; select undef, undef, undef, 1;'

and I get "CHLD".

So, it seems the name that is used to set the first child handler in the lifetime of the program determines what is passed later to every signal handler.

Further, if you print the signal names without first setting any signal handler, like:

  print "@{[keys %SIG]}\n"

you'll see the signal names in varying order on perl 5.18. On debian's perl 5.14 I always see the same order, where CLD comes before CHLD.

Now, in my case I use CGI::Compile together with Starman. This module tries to preserve signal handlers. It does that by something like

  local @SIG{keys %SIG}=values %SIG;

Since the first assignment to the child sig-handler determines the signal's name, and since CLD comes before CHLD, this name is always CLD on debian wheezy. On my 5.18 it is sometimes CLD and sometimes CHLD.

Torsten