Bug #50093 for psh: CORE::waitpid in Psh/OS/Unix.pm problem killing jobcontrol (w work-around)

Subject:

CORE::waitpid in Psh/OS/Unix.pm problem killing jobcontrol (w work-around)

This is a duplicate of sf.net 2869783, as I don't know which tracker is still active.

Details:

platform: Linux, Ubuntu jaunty, amd64

issue: job control is beyond broken. I see spurious returns of waitpid with $?==0 instead of the proper non-zero $? for a still existing child. At the user level, this translates to still-running bg processes escaping from job control, making job control unusable.

Patch vs. Ubuntu jaunty's version 1.8-9 (which is 1.8.0 or 1.8.1):

--- Unix.pm.org 2009-09-02 16:58:44.963742654 +0200
+++ Unix.pm 2009-09-02 19:51:34.848018995 +0200
@@ -216,13 +216,30 @@
     my $status=1;
     my $returnpid;
     while (1) {
+#warn "$$: pid: $pid jobldr:". ($job->{pgrp_leader}) ."running:".($job->{running})." psh_pgrp:$psh_pgrp\n";
         if (!$job->{running}) { $job->continue; }
         {
             local $Psh::currently_active = $pid;
             $returnpid = CORE::waitpid($pid,POSIX::WUNTRACED());
             $pid_status = $?;
+#warn "rc=$? - $pid - $returnpid\n";
         }
         last if $returnpid<1;
+
+# PJ I see spurious returns of waitpid with 0==$? instead of the
+# real STOP so we do an additional waitpid if indeed it did exit
+if ($returnpid==$pid and $pid_status==0) {
+    my $rpid = CORE::waitpid($pid,POSIX::WNOHANG());
+    if ($rpid==$pid) {      # we have a change, so
+        $pid_status=$?;     # retain the new status
+    }
+    if ($rpid!=-1) {        # provide the lost status for STOP: 0x137f
+        not $pid_status and $pid_status= ( POSIX::SIGSTOP()<<8 ) + ( -1 & 0x7f );
+    }
+    # CORE::warn("rpid:$rpid pid:$pid ?:$? pid_status:$pid_status\n");
+}
+
+
         # Very ugly work around for the problem that
         # processes occasionally get SIGTTOUed without reason
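
To check the behaviour outside psh, here is a minimal sketch that forks a child, lets it stop itself with SIGSTOP, and prints what waitpid with WUNTRACED reports. The helper name wait_for_stopped_child is hypothetical and not part of psh. The sketch prints both $? and ${^CHILD_ERROR_NATIVE}, because the POSIX module documents WIFSTOPPED and WSTOPSIG as operating on the latter. Whether a difference between the two explains the spurious 0 described above is an assumption on my part, not something this report confirms.

```perl
use strict;
use warnings;
use POSIX qw(:sys_wait_h :signal_h);

# Hypothetical helper (not part of psh): fork a child that stops itself,
# wait for the stop with WUNTRACED, and return what waitpid reported.
sub wait_for_stopped_child {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        kill 'STOP', $$;    # child: suspend ourselves
        POSIX::_exit(0);    # reached only if someone sends SIGCONT
    }

    my $rpid   = waitpid($pid, WUNTRACED);   # returns once the child has stopped
    my $plain  = $?;                         # status as Perl exposes it in $?
    my $native = ${^CHILD_ERROR_NATIVE};     # raw status word from the OS

    kill 'KILL', $pid;                       # clean up: SIGKILL also ends a stopped process
    waitpid($pid, 0);                        # reap the child

    return ($pid, $rpid, $plain, $native);
}

my ($pid, $rpid, $plain, $native) = wait_for_stopped_child();

printf "pid: %d, waitpid returned: %d\n", $pid, $rpid;
printf "\$?: 0x%04x, native status: 0x%04x\n", $plain, $native;
printf "stopped: %s, stop signal: %d (SIGSTOP is %d)\n",
    (WIFSTOPPED($native) ? 'yes' : 'no'), WSTOPSIG($native), SIGSTOP;
```

If the native status decodes as stopped while $? reads 0, testing WIFSTOPPED(${^CHILD_ERROR_NATIVE}) after the first waitpid might replace the second waitpid and the hand-built 0x137f value in the patch. I have not tried that against psh.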