Skip Menu |

This queue is for tickets about the POE CPAN distribution.

Report information
The Basics
Id: 39872
Status: resolved
Priority: 0/
Queue: POE

People
Owner: Nobody in particular
Requestors: perl [...] pied.nu
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 1.003
Fixed in: (no value)



Subject: USE_SIGCHLD and fork
There is an unfortunate interaction between USE_SIGCHLD and fork(). If USE_SIGCHLD is set true, and a process spawns child processes that inherit the same kernel (for example: a forking server), the sub-process will inherit the parent's $kr_child_procs (see POE::Resource::Signals). If no furthur sub-processes are created in the child, $kr_child_procs will never be reset. The child process's kernel will become idle. A quick work around is to call $kernel->_data_sig_initialize in the child process, after the fork. A better idea would be to make a new $kernel->fork(), which would : a- resets the kernel ID; b- call _data_sig_initialize; c- post a _forked signal (say); d- ????; e- PROFIT!
I'm worried that either suggestion requires applications to call some POE::Kernel method. Perhaps it might be more robust for POE::Kernel to passively notice that $$ has changed and do the right thing. Problem: Overhead constantly checking $$. Maybe the checks can go somewhere seldom and strategic?
Furthur to my ranting last night on IRC, I've got a patch that fixes the new problem : In POE 1.003, even with Loop::Event, USE_SIGCHLD will use $SIG{CHLD} to catch CHLD. At shutdown, loop_ignore_signal (called from loop_ignore_all_signals) will set $SIG{CHLD} = 'DEFAULT'. If kr_child_procs == 1, this will prevent the kernel from ever exiting! The included patch modifies loop_ignore_signal to prevent it from resetting CHLD if kr_child_procs == 1. This might be a low impact fix to the entire problem. One thing that is bothersome is that loop_ignore_signal( 'CHLD' ) is also called at start-up (I think), but at that point, kr_child_procs should never be 1.
--- lib/POE/Loop/PerlSignals.pm 2008-05-25 19:01:59.000000000 -0400 +++ /usr/lib/perl5/site_perl/5.8.8/POE/Loop/PerlSignals.pm 2009-02-25 04:16:08.000000000 -0500 @@ -99,10 +99,22 @@ sub loop_ignore_signal { my ($self, $signal) = @_; - delete $signal_watched{$signal}; - - unless ( USE_SIGCHLD ) { if ($signal eq 'CHLD' or $signal eq 'CLD') { + if ( USE_SIGCHLD ) { + if( $self->_data_sig_child_procs) { + # We need SIGCHLD to stay around after shutdown, so that + # child processes may be reaped and kr_child_procs=0 + if (TRACE_SIGNALS) { + POE::Kernel::_warn "<sg> Keeping SIG$signal anyway!"; + } + return; + } + } + else { + delete $signal_watched{$signal}; + if (TRACE_SIGNALS) { + POE::Kernel::_warn "<sg> Cease polling SIG$signal"; + } $self->_data_sig_cease_polling(); # We should never twiddle $SIG{CH?LD} under poe, unless we want to # override system() and friends. --hachi @@ -110,13 +122,18 @@ return; } } + delete $signal_watched{$signal}; + + my $state = 'DEFAULT'; if ($signal eq 'PIPE') { - $SIG{$signal} = "IGNORE"; - return; + $state = 'IGNORE'; } - $SIG{$signal} = "DEFAULT"; + if (TRACE_SIGNALS) { + POE::Kernel::_warn "<sg> $state SIG$signal"; + } + $SIG{$signal} = $state; } sub loop_ignore_all_signals {
Your patch passes "make test", but I don't have a test case that proves it fixes the problem you've seen. Can you provide a simple Test::More test I can add to POE::Test::Loops? Thanks!
On Thu Feb 26 00:16:51 2009, RCAPUTO wrote: Show quoted text
> Can you provide a simple Test::More test I can add to POE::Test::Loops?
Test case would be fork 2 children, one exits (SIGCHLD, kr_child_procs is now 1), shutdown .... HANG. So failure of the test is it doesn't exit.
Second patch, against svn. It can be confirmed to work by commenting out the return on line 112 of POE/Loop/PerlSignal.pm.
Index: t/90_regression/rt39872-sigchld.t =================================================================== --- t/90_regression/rt39872-sigchld.t (revision 0) +++ t/90_regression/rt39872-sigchld.t (revision 0) @@ -0,0 +1,167 @@ +#!/usr/bin/perl + +use strict; +use warnings; + +sub DEBUG () { 0 } +sub POE::Kernel::USE_SIGCHLD () { 1 } +sub POE::Kernel::TRACE_SIGNALS () { 0 } + + +use POE; +use Test::More; +use POE::Wheel::Run; +use POSIX qw( SIGINT ); + +if ($^O eq "MSWin32") { + plan skip_all => "Test not working on $^O"; + exit 0; +} + +plan tests => 5; + + +######################################################################### +POE::Session->create( + inline_states => { + _start => \&_start, + _stop => \&_stop, + stdout => \&stdout, + stdout2 => \&stdout2, + stderr => \&stderr, + sig_CHLD => \&sig_CHLD, + error => \&error, + done => \&done + } + ); + +$poe_kernel->run; +pass( "Sane exit" ); + +######################################################################### +sub _start +{ + my( $kernel, $heap ) = @_[KERNEL, HEAP]; + + my $prog = <<'PERL'; +$|++; +my $N = shift; +print "I am $N\n"; +while(<STDIN>) { + chomp; + exit 0 if /^bye/; + print "Unknown command '$_'\n"; +} +PERL + + DEBUG and warn "_start"; + $kernel->alias_set( 'worker' ); + $kernel->sig( CHLD => 'sig_CHLD' ); + + $heap->{W1} = POE::Wheel::Run->new( + Program => [ $^X, '-e', $prog, "W1" ], + StdoutEvent => 'stdout', + StderrEvent => 'stderr', + ErrorEvent => 'error' + ); + $heap->{id2W}{ $heap->{W1}->ID } = 'W1'; + $heap->{pid2W}{ $heap->{W1}->PID } = 'W1'; + + $heap->{W2} = POE::Wheel::Run->new( + Program => [ $^X, '-e', $prog, "W2" ], + StdoutEvent => 'stdout', + StderrEvent => 'stderr', + ErrorEvent => 'error' + ); + $heap->{id2W}{ $heap->{W2}->ID } = 'W2'; + $heap->{pid2W}{ $heap->{W2}->PID } = 'W2'; + +} + +######################################## +sub _stop +{ + my( $kernel, $heap ) = @_[KERNEL, HEAP]; + DEBUG and warn "_stop"; +} + +######################################## +sub done +{ + my( $kernel, $heap ) = @_[KERNEL, HEAP]; + DEBUG and warn "done"; + + $kernel->alias_remove( 'worker' ); + $kernel->sig( 'CHLD' ); + + delete $heap->{W1}; + delete $heap->{W2}; + + my @list = keys %{ $heap->{pid2W} }; + is( 0+@list, 1, "One wheel left" ); + kill SIGINT, @list; + + diag( "Unfortunately, this test will hang as a failure case" ); +# diag( "But during success, you will see !!! Child process PID:$list[0] reaped:"); +} + + + +######################################## +sub stdout +{ + my( $kernel, $heap, $input, $id ) = @_[KERNEL, HEAP, ARG0, ARG1]; + my $N = $heap->{id2W}{$id}; + DEBUG and warn "Input $N ($id): '$input'"; + my $wheel = $heap->{ $N }; + ok( ($input =~ /I am $N/), "Intro output" ); + if( $N eq 'W1' ) { + $heap->{closing}{ $N } = 1; + $wheel->put( 'bye' ); + } +} + +######################################## +sub stderr +{ + my( $kernel, $heap, $input, $id ) = @_[KERNEL, HEAP, ARG0, ARG1]; + my $N = $heap->{id2W}{$id}; + DEBUG and warn "Error $N ($id): '$input'"; +} + +######################################## +sub error +{ + my( $kernel, $heap, $op, $errnum, $errstr, $id, $fh ) = + @_[ KERNEL, HEAP, ARG0..$#_ ]; + + my $N = $heap->{id2W}{$id}; + DEBUG and warn "Error $N ($id): $op $errnum ($errstr)"; + my $wheel = $heap->{ $N }; + + if( $op eq 'read' and $errnum==0 ) { + # normal exit + } + else { + die "Error $N ($id): $op $errnum ($errstr)"; + } +} + +######################################## +sub sig_CHLD +{ + my( $kernel, $heap, $signal, $pid, $status ) = + @_[ KERNEL, HEAP, ARG0..$#_ ]; + + my $N = $heap->{pid2W}{$pid}; + DEBUG and warn "CHLD $N ($pid)"; + my $wheel = $heap->{ $N }; + + is( $heap->{closing}{$N}, 1, "$N closing" ); + + delete $heap->{closing}{$N}; + delete $heap->{pid2W}{$pid}; + delete $heap->{$N}; + delete $heap->{id2W}{ $wheel->ID }; + $kernel->yield( 'done' ); +} Index: lib/POE/Loop/PerlSignals.pm =================================================================== --- lib/POE/Loop/PerlSignals.pm (revision 2469) +++ lib/POE/Loop/PerlSignals.pm (working copy) @@ -101,8 +101,17 @@ delete $signal_watched{$signal}; - unless ( USE_SIGCHLD ) { if ($signal eq 'CHLD' or $signal eq 'CLD') { + if ( USE_SIGCHLD ) { + if( $self->_data_sig_child_procs) { + # We need SIGCHLD to stay around after shutdown, so that + # child processes may be reaped and kr_child_procs=0 + if (TRACE_SIGNALS) { + POE::Kernel::_warn "<sg> Keeping SIG$signal anyway!"; + } + return; + } + } else { $self->_data_sig_cease_polling(); # We should never twiddle $SIG{CH?LD} under poe, unless we want to # override system() and friends. --hachi @@ -111,12 +120,17 @@ } } + delete $signal_watched{$signal}; + + my $state = 'DEFAULT'; if ($signal eq 'PIPE') { - $SIG{$signal} = "IGNORE"; - return; + $state = "IGNORE"; } - $SIG{$signal} = "DEFAULT"; + if (TRACE_SIGNALS) { + POE::Kernel::_warn "<sg> $state SIG$signal"; + } + $SIG{$signal} = $state; } sub loop_ignore_all_signals {
Thank you for the test case and patch. The most recent ones were applied as revision 2470. They will be in POE 1.003_03.
RT-Send-CC: RCAPUTO [...] cpan.org
FWIW, I wrote another test case that tests the problem as originaly reported, that is $kr_child_procs being inherited by the kernel when forking. And it shows me that my fix fixed both problems. So maybe this was a wasted effort. Or maybe more test cases makes everyone happy.
Index: t/90_regression/rt39872-A.t =================================================================== --- t/90_regression/rt39872-A.t (revision 0) +++ t/90_regression/rt39872-A.t (revision 0) @@ -0,0 +1,164 @@ +#!/usr/bin/perl + +use strict; +use warnings; + +sub DEBUG () { 0 } +my $REFCNT; +sub POE::Kernel::USE_SIGCHLD () { 1 } +sub POE::Kernel::ASSERT_DEFAULT () { 0 } +sub POE::Kernel::TRACE_SIGNALS () { 0 } +sub POE::Kernel::TRACE_REFCNT () { DEBUG and $REFCNT } + +use POE; +use Test::More; +use POE::Wheel::Run; +use POSIX qw( SIGINT SIGUSR1 ); + +if ($^O eq "MSWin32") { + plan skip_all => "Test not working on $^O"; + exit 0; +} + +$SIG{__WARN__} = sub { + print STDERR "$$: $_[0]"; + }; + + +plan tests => 3; + +our $PARENT = 1; + +POE::Session->create( + inline_states => { + _start => \&_start, + _stop => \&_stop, + + work => \&work, + child => \&child, + parent => \&parent, + T1 => \&T1, + T2 => \&T2, + + sig_CHLD => \&sig_CHLD, + sig_USR1 => \&sig_USR1, + done => \&done + } +); + +DEBUG and warn "Parent"; +$poe_kernel->run; +pass( "Sane exit" ) if $PARENT; +DEBUG and warn "Exit"; +exit; + + +sub _start { + my( $kernel, $heap ) = @_[KERNEL, HEAP]; + + DEBUG and warn "_start"; + $kernel->alias_set( 'worker' ); + $kernel->sig( CHLD => 'sig_CHLD' ); + + $kernel->yield( 'work' ); +} + +sub work { + my( $kernel, $heap, $session ) = @_[KERNEL, HEAP, SESSION]; + + foreach my $name ( qw( T1 T2 ) ) { + my $pid = fork; + die "Unable to fork: $!" unless defined $pid; + + if( $pid ) { # parent + $heap->{$name}{PID} = $pid; + $heap->{pid2N}{ $pid } = $name; + } + else { + $kernel->yield( 'child' ); + return; + } + } + foreach my $name ( qw( T1 T2 ) ) { + $kernel->refcount_increment( $session->ID, $name ); + } + + $kernel->delay_add( 'parent', 3 ); + diag( "Wait 3" ); + + return; +} + +sub parent +{ + my( $kernel, $heap, $session ) = @_[KERNEL, HEAP, SESSION]; + DEBUG and warn "parent"; + kill SIGUSR1, $heap->{T1}{PID}; + $heap->{T1}{closing} = 1; +} + + +sub child +{ + my( $kernel, $heap, $session ) = @_[KERNEL, HEAP, SESSION]; + $PARENT = 0; + DEBUG and warn "child"; + $kernel->sig( 'CHLD' ); + $kernel->sig( USR1 => 'sig_USR1' ); + $kernel->refcount_increment( $session->ID, 'USR1' ); +} + +sub sig_USR1 +{ + my( $kernel, $heap, $session ) = @_[KERNEL, HEAP, SESSION]; + DEBUG and warn "USR1"; + $REFCNT = 1; + $kernel->sig( 'USR1' ); + $kernel->refcount_decrement( $session->ID, 'USR1' ); + $kernel->alias_remove( 'worker' ); +} + + + + + +sub _stop { + my( $kernel, $heap ) = @_[KERNEL, HEAP]; + DEBUG and warn "_stop"; +} + +sub done { + my( $kernel, $heap, $session ) = @_[KERNEL, HEAP, SESSION]; + DEBUG and warn "done"; + + $kernel->alias_remove( 'worker' ); + $kernel->sig( 'CHLD' ); + + my @list = keys %{ $heap->{pid2N} }; + is( 0+@list, 1, "One child left ($list[0])" ); + kill SIGUSR1, @list; + + $REFCNT = 1; + $kernel->refcount_decrement( $session->ID, 'T2' ); + alarm(30); $SIG{ALRM} = sub { die "test case didn't end sanely" }; +} + +sub sig_CHLD { + my( $kernel, $heap, $signal, $pid, $status ) = @_[ + KERNEL, HEAP, ARG0..$#_ + ]; + + unless( $heap->{pid2N}{$pid} ) { + return; + } + + my $name = $heap->{pid2N}{$pid}; + my $D = $heap->{$name}; + + is( $D->{closing}, 1, "Expected child exited" ); + $kernel->refcount_decrement( $_[SESSION]->ID, $name ); + delete $heap->{$name}; + delete $heap->{pid2N}{$pid}; + + $kernel->yield( 'done' ); +}
I didn't want to reopen this ticket.
I've opened bug #45109 which is a case where the test for this bug is failing.