Subject: CORE::waitpid problem in Psh/OS/Unix.pm kills job control (with work-around)
This is a duplicate of sf.net 2869783, as I don't know which tracker is
still active.
Details:
platform: Linux, Ubuntu jaunty, amd64
issue: job control is broken. I see spurious returns of waitpid
with $?==0 instead of the proper non-zero $? for a child that
still exists. At the user level, this means that background
processes which are still running escape from job control,
making job control unusable.
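
For reference, here is a minimal stand-alone sketch (not part of psh) of
what a correct waitpid return looks like for a stopped child. The sub name
observe_stop is made up for this illustration, and the script assumes a
POSIX system and Perl 5.10 or later, where the W* macros are applied to
${^CHILD_ERROR_NATIVE}:

```perl
use strict;
use warnings;
use POSIX qw(:sys_wait_h SIGSTOP);

# Hypothetical helper: fork a child that stops itself, wait for the stop
# with WUNTRACED, and return (child pid, waitpid result, native status).
sub observe_stop {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        kill 'STOP', $$;      # child stops itself
        POSIX::_exit(0);      # only reached if the child is continued
    }
    # WUNTRACED makes waitpid return for stopped children as well
    my $rpid   = waitpid($pid, WUNTRACED);
    my $native = ${^CHILD_ERROR_NATIVE};
    kill 'KILL', $pid;        # clean up the stopped child
    waitpid($pid, 0);         # and reap it
    return ($pid, $rpid, $native);
}

my ($pid, $rpid, $native) = observe_stop();
printf "pid:%d rpid:%d native status:0x%04x\n", $pid, $rpid, $native;
if (WIFSTOPPED($native)) {
    printf "child stopped by signal %d\n", WSTOPSIG($native);
} elsif (WIFEXITED($native)) {
    printf "child exited with code %d\n", WEXITSTATUS($native);
}
```

With working job control the WIFSTOPPED branch is the one taken; the
spurious returns described above look like a clean exit (status 0) even
though the child still exists.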
Patch vs. Ubuntu jaunty's version 1.8-9 (which is 1.8.0 or 1.8.1):
--- Unix.pm.org 2009-09-02 16:58:44.963742654 +0200
+++ Unix.pm 2009-09-02 19:51:34.848018995 +0200
@@ -216,13 +216,30 @@
my $status=1;
my $returnpid;
while (1) {
+#warn "$$: pid: $pid jobldr:". ($job->{pgrp_leader}) ."running:".($job->{running})." psh_pgrp:$psh_pgrp\n";
if (!$job->{running}) { $job->continue; }
{
local $Psh::currently_active = $pid;
$returnpid = CORE::waitpid($pid,POSIX::WUNTRACED());
$pid_status = $?;
+#warn "rc=$? - $pid - $returnpid\n";
}
last if $returnpid<1;
+
+# PJ: I see spurious returns of waitpid with 0==$? instead of the real
+# STOP status, so we do an additional waitpid to check whether it did exit
+if ($returnpid==$pid and $pid_status==0) {
+ my $rpid = CORE::waitpid($pid,POSIX::WNOHANG());
+ if ($rpid==$pid) { # we have a change, so
+ $pid_status=$?; # retain the new status
+ }
+ if ($rpid!=-1) { # provide the lost status for STOP: 0x137f
+ not $pid_status and $pid_status= ( POSIX::SIGSTOP()<<8 ) + ( -1 & 0x7f );
+ }
+ # CORE::warn("rpid:$rpid pid:$pid ?:$? pid_status:$pid_status\n");
+}
+
+
# Very ugly work around for the problem that
# processes occasionally get SIGTTOUed without reason