Skip Menu |

This queue is for tickets about the IO-Async CPAN distribution.

Report information
The Basics
Id: 127928
Status: open
Priority: 0/
Queue: IO-Async

People
Owner: Nobody in particular
Requestors: DBOOK [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: open3 in IO::Async::Process results in immediately dead child process
Reproducing script attached. I'm really not sure what's going on here but it seems after using IPC::Open3 to start a second child process inside an IO::Async::Process child, the second child is immediately dead and thus trying to read from its output handles just hangs. IPC::Open3 is supposed to throw an exception if it fails to start so I'm not sure how this is possible or how to debug.
Subject: vgg1wx.txt
#!/usr/bin/env perl use strict; use warnings; use IO::Async::Loop; use Future; use IPC::Open3; use Symbol 'gensym'; my $cmd = 'echo "stdout"'; #my $cmd = 'echo "stdout" && >&2 echo "stderr"'; my $out = ''; my $code = sub { # works with regular pipe open #my $pid = open my $stdout, '-|', $cmd or die "open failed: $!"; # hangs with open3 my $pid = open3(my $stdin, my $stdout, my $stderr = gensym, $cmd); print $_ while <$stdout>; waitpid $pid, 0; return $?; }; open my $fake_out, '>', \$out or die "open failed: $!"; select $fake_out; my $exit = $code->(); select STDOUT; print "STDOUT:\n$out\nExit: $exit\n"; $out = ''; my $loop = IO::Async::Loop->new; my $f = $loop->new_future; my $p = $loop->open_process( code => $code, stdout => { into => \$out }, stderr => { into => \my $err }, on_finish => sub { $f->done($_[1]) }, ); print "pid: " . $p->pid . "\n"; $exit = Future->wait_any($f, $loop->timeout_future(after => 10))->get; print "STDOUT:\n$out\nExit: $exit\n";
On Wed Dec 05 13:55:06 2018, DBOOK wrote: Show quoted text
> Reproducing script attached. I'm really not sure what's going on here > but it seems after using IPC::Open3 to start a second child process > inside an IO::Async::Process child, the second child is immediately > dead and thus trying to read from its output handles just hangs. > IPC::Open3 is supposed to throw an exception if it fails to start so > I'm not sure how this is possible or how to debug.
Warning that running this script results in an IO::Async::Process waiting on a dead process that you have to manually clean up.
On Wed Dec 05 13:55:06 2018, DBOOK wrote: Show quoted text
> Reproducing script attached. I'm really not sure what's going on here > but it seems after using IPC::Open3 to start a second child process > inside an IO::Async::Process child, the second child is immediately > dead and thus trying to read from its output handles just hangs. > IPC::Open3 is supposed to throw an exception if it fails to start so > I'm not sure how this is possible or how to debug.
Some curious behaviour here. I tried strace $ IO_ASYNC_DEBUG=2 strace -f -o perl.strace perl rt127928.pl on a slightly modified version of the script (basically just added more `print STDERR ...` debugging in places), and got the following highlights: Main process(5763) all proceeds as expected, until the fork call to set up the IaProcess: 5763 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7faaec33d450) = 5764 Perl-script child process(5764) then runs as far as the open3() call: 5764 write(2, "D1\n", 3) = 3 5764 pipe2([3, 4], O_CLOEXEC) = 0 ... 5764 pipe2([5, 6], O_CLOEXEC) = 0 ... 5764 pipe2( <unfinished ...> 5764 <... pipe2 resumed> [7, 8], O_CLOEXEC) = 0 ... 5764 pipe2( <unfinished ...> 5764 <... pipe2 resumed> [10, 11], O_CLOEXEC) = 0 So now it has some file descriptors. Quite why it does four pairs I don't yet know... 5764 clone( <unfinished ...> 5764 <... clone resumed> child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7faaec33d450) = 5765 open3() has now cloned its own child(5765) to run the binary in. It dup2()s the file descriptors to the right places 5765 dup2(3, 0) = 0 ... 5765 dup2(6, 1) = 1 ... 5765 dup2(8, 2) = 2 The originals were set to O_CLOEXEC. It then execve()s the echo binary: 5765 execve("/bin/sh", ["sh", "-c", "echo \"stdout\""], 0x5566ed215990 /* 48 vars */ <unfinished ...> that binary runs and eventually prints its output and exits: 5765 write(1, "stdout\n", 7 <unfinished ...> 5765 <... write resumed> ) = 7 5765 exit_group(0 <unfinished ...> 5765 <... exit_group resumed>) = ? Meantime, the process running the `code` inside the IaProcess has been trying to read: 5764 read(5, <unfinished ...> this eventually wakes up so it reports that back to its parent (the "print $_" part): 5764 <... read resumed> "stdout\n", 8192) = 7 5764 write(1, "stdout\n", 7 <unfinished ...> 5764 <... write resumed> ) = 7 By now, 5765 has exited, so 5764 receives its SIGCHLD, and ignores it 5764 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=5765, si_uid=1000, si_status=0, si_utime=0, si_stime=0} --- But here's now where I get confused. The "... while <$stdout>" part is a loop, so naturally it's going to try reading again: 5764 read(5, <unfinished ...> ... and hereon blocks forever. This is the last system call that 5764 makes before the entire process group receives SIGINT because I hit Ctrl-C to stop the strace. What I think *ought* to have happened is that read() immediately returns zero, because it is trying to read fd5, which was a pipe whose other end was only open in the now-exited child. Surely? -- Paul Evans
Subject: perl.strace
Download perl.strace
application/octet-stream 209.5k

Message body not shown because it is not plain text.

Subject: rt127928.pl
#!/usr/bin/env perl use strict; use warnings; use IO::Async::Loop; use Future; use IPC::Open3; use Symbol 'gensym'; my $cmd = 'echo "stdout"'; #my $cmd = 'echo "stdout" && >&2 echo "stderr"'; my $code = sub { # works with regular pipe open #my $pid = open my $stdout, '-|', $cmd or die "open failed: $!"; # hangs with open3 print STDERR "D1\n"; my $pid = open3(my $stdin, my $stdout, my $stderr = gensym, $cmd); print STDERR "D2\n"; print $_ while <$stdout>; print STDERR "D3\n"; waitpid $pid, 0; print STDERR "D4\n"; return $?; }; my $out = ''; my $exit; if(0) { open my $fake_out, '>', \$out or die "open failed: $!"; select $fake_out; $exit = $code->(); select STDOUT; print "STDOUT:\n$out\nExit: $exit\n"; } $out = ''; my $loop = IO::Async::Loop->new; my $f = $loop->new_future; my $p = $loop->open_process( code => $code, stdout => { into => \$out }, stderr => { into => \my $err }, on_finish => sub { $f->done($_[1]) }, ); print "pid: " . $p->pid . "\n"; $exit = Future->wait_any($f, $loop->timeout_future(after => 10))->get; print "STDOUT:\n$out\nExit: $exit\n";
On Thu Dec 06 12:31:23 2018, PEVANS wrote: Show quoted text
> What I think *ought* to have happened is that read() immediately > returns zero, because it is trying to read fd5, which was a pipe whose > other end was only open in the now-exited child. Surely?
Oh. Turns out, no. Here's a complete log of what 5764 did from the fork() until the read(): 5764 <... clone resumed> child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7faaec33d450) = 5765 5764 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 5764 close(11) = 0 5764 read(10, "", 8192) = 0 5764 close(10) = 0 5764 close(8) = 0 5764 write(2, "D2\n", 3) = 3 5764 read(5, "stdout\n", 8192) = 7 5764 write(1, "stdout\n", 7) = 7 5764 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=5765, si_uid=1000, si_status=0, si_utime=0, si_stime=0} --- 5764 read(5, <unfinished ...> While it close()ed its internal comms pipe and one of the three it set up for us, it didn't close the other two. Indeed, this can be seen in its source - here some lines from IPC/Open3.pm (VERSION = '1.20' on my system): 243 # If the write handle is a dup give it away entirely, close my copy 244 # of it. 245 xclose $handles[0]{parent}, $handles[0]{mode} if $handles[0]{dup}; It is specifically only closing the first handle. it doesn't bother with the other two, leaving them open. I elect this is an IPC::Open3 bug. -- Paul Evans
On Thu Dec 06 12:45:18 2018, PEVANS wrote: Show quoted text
> I elect this is an IPC::Open3 bug.
Aaaactually, this looks more complicated than that. I added some printing of opened filehandles according to the kernel, either side of the open3() call: print STDERR `lsof -p $$ +E`; my $pid = IPC::Open3::open3(my $stdin, my $stdout, my $stderr = gensym, $cmd); print STDERR `lsof -p $$ +E`; The output of which is: Before: perl 2954 leo 0u CHR 136,3 0t0 6 /dev/pts/3 perl 2954 leo 1w FIFO 0,12 0t0 43396 pipe 2953,perl,4r perl 2954 leo 2w FIFO 0,12 0t0 43397 pipe 2953,perl,6r perl 2954 leo 3r FIFO 0,12 0t0 40623 pipe 2955,lsof,1w perl 2954 leo 9w FIFO 0,12 0t0 43398 pipe 2953,perl,8r As was expected; STDIN remained from the terminal, STDOUT and STDERR are pipes to the parent process, fd 3 is the qx() pipe from lsof, and fd 9 is the status pipe for IO::Async::Process. After: perl 2954 leo 0u CHR 136,3 0t0 6 /dev/pts/3 perl 2954 leo 1w FIFO 0,12 0t0 43396 pipe 2953,perl,4r perl 2954 leo 2w FIFO 0,12 0t0 43397 pipe 2953,perl,6r perl 2954 leo 3r FIFO 0,12 0t0 40623 pipe perl 2954 leo 4r FIFO 0,12 0t0 43401 pipe 2954,perl,5w perl 2954 leo 5w FIFO 0,12 0t0 43401 pipe 2954,perl,4r perl 2954 leo 6r FIFO 0,12 0t0 43402 pipe 2954,perl,7w perl 2954 leo 7w FIFO 0,12 0t0 43402 pipe 2954,perl,6r perl 2954 leo 8r FIFO 0,12 0t0 43403 pipe perl 2954 leo 9w FIFO 0,12 0t0 43398 pipe 2953,perl,8r perl 2954 leo 10r FIFO 0,12 0t0 43405 pipe 2960,lsof,1w We have gained fds 4+5 and 6+7 for which we're holding *both* sides. Hrm... I further added logging inside IPC/Open3.pm to the xclose() function, having it print the values of filehandles it is attempting to close. It really tried: sub xclose { print STDERR "ENTER xclose($_[0]/IO=${\fileno $_[0]})\n"; ... Output: ENTER xclose(GLOB(0x557681ad8840)/IO=4) ENTER xclose(GLOB(0x557681b156d0)/IO=7) ENTER xclose(GLOB(0x557681b07df0)/IO=10) It tried to close 4, 7 and 10, but yet those remain open. I now compare to a working case where the child doesn't deadlock, by commenting out the part where the ->open_process call replaces the child's stderr. Now I get: Before: perl 4534 leo 0u CHR 136,3 0t0 6 /dev/pts/3 perl 4534 leo 1w FIFO 0,12 0t0 62923 pipe 4533,perl,4r perl 4534 leo 2u CHR 136,3 0t0 6 /dev/pts/3 perl 4534 leo 3r FIFO 0,12 0t0 64257 pipe 4535,lsof,1w perl 4534 leo 7w FIFO 0,12 0t0 62924 pipe 4533,perl,6r Output: ENTER xclose(GLOB(0x55ae34a8f2a8)/IO=4) ENTER xclose(GLOB(0x55ae34a8ef48)/IO=8) ENTER xclose(GLOB(0x55ae34a8ed98)/IO=10) After: perl 4534 leo 0u CHR 136,3 0t0 6 /dev/pts/3 perl 4534 leo 1w FIFO 0,12 0t0 62923 pipe 4533,perl,4r perl 4534 leo 2u CHR 136,3 0t0 6 /dev/pts/3 perl 4534 leo 3r FIFO 0,12 0t0 64257 pipe perl 4534 leo 4r FIFO 0,12 0t0 62925 pipe 4534,perl,5w perl 4534 leo 5w FIFO 0,12 0t0 62925 pipe 4534,perl,4r perl 4534 leo 6r FIFO 0,12 0t0 62926 pipe perl 4534 leo 7w FIFO 0,12 0t0 62924 pipe 4533,perl,6r perl 4534 leo 8r FIFO 0,12 0t0 62929 pipe 4538,lsof,1w perl 4534 leo 9r FIFO 0,12 0t0 62927 pipe It still manages to be holding both sides of the fd 4+5 pair, but its attempt to close 8 and 10 were successful (8 has been reused for lsof, 10 is now closed) Something oddly strange is occurring, whereby perl code executes the close() operator, but that doesn't always result in a close(2) system call. Whether or not it does, seems to depend on what happens to STDERR. -- Paul Evans
Attached a new test script using IO::Async::Process (with its own loop) for the secondary subprocess, can't seem to get this to work either, but don't know if it's a related problem.
Subject: ma8sj9.txt
#!/usr/bin/env perl use strict; use warnings; BEGIN { $ENV{IO_ASYNC_LOOP} = 'Poll' } use IO::Async::Loop; use Future; use IPC::Open3; use Symbol 'gensym'; my $cmd = 'echo ' . ('stdout' x 100); #my $cmd = 'echo "stdout" && >&2 echo "stderr"'; my $out = ''; my $code = sub { # works with regular pipe open #my $pid = open my $stdout, '-|', $cmd or die "open failed: $!"; # hangs with open3 #my $pid = open3(my $stdin, my $stdout, my $stderr = gensym, $cmd); my $loop = IO::Async::Loop->really_new; my $f = $loop->new_future; my $process = $loop->open_process( command => $cmd, stdout => { on_read => sub { my ($stream, $buffref) = @_; print $1 if $$buffref =~ s/(.*)//s; return 0; }, }, #stderr => { # on_read => sub {0}, #}, on_finish => sub { $f->done($_[1]) }, ); return $f->get; }; open my $fake_out, '>', \$out or die "open failed: $!"; select $fake_out; my $exit = $code->(); select STDOUT; print "STDOUT:\n$out\nExit: $exit\n"; $out = ''; my $loop = IO::Async::Loop->new; my $f = $loop->new_future; my $p = $loop->open_process( code => $code, stdout => { into => \$out }, #stderr => { into => \my $err }, on_finish => sub { $f->done($_[1]) }, ); print "pid: " . $p->pid . "\n"; $exit = Future->wait_any($f, $loop->timeout_future(after => 10))->get; print "STDOUT:\n$out\nExit: $exit\n";
For posterity, I am unable to reproduce this problem using IPC::Run3 in the child.