Skip Menu |

This queue is for tickets about the System-Command CPAN distribution.

Report information
The Basics
Id: 93056
Status: open
Priority: 0/
Queue: System-Command

People
Owner: Nobody in particular
Requestors: djerius [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: 1.107
Fixed in: (no value)



Subject: corrupted binary pipes on Win32
Executive Summary: Simply setting binmode on the child's filehandles will not properly enable binmode in Win32, due to how IPC::Run implements things. I don't think that a version of IPC::Run which would transparently support setting binmode on filehandles is likely, so properly supporting binary data streams in System::Command will require an explicit binmode parameter. The hideous details: Because of the way that pipes are handled on Win32 via IPC::Run, setting binmode on the child's filehandles does not actually result in binary mode. For example, the following code ------------------------------------------------------------------------------ use System::Command; my $buf = pack('N',13); binmode( \*STDOUT ); $cmd = System::Command->new( 'od', '-c' ); binmode $cmd->stdin; binmode $cmd->stdout; $cmd->stdin->print( $buf ); $cmd->stdin->close; print $cmd->stdout->getline; ------------------------------------------------------------------------------ when run with binmode properly on results in ----------------------- 0000000 \0 \0 \0 \r ----------------------- Note the "\r". When this is run on Win32, it results in ------------------- 0000000 \0 \0 \0 ------------------- Note the missing "\r". I know almost nothing about Win32, but I think the following analysis is correct. System::Command uses IPC::Run on Win32. IPC::Run uses a helper script (IPC::Run::Win32Pump) to shuttle data between processes (apparently using temporary files and copying data from one file to another). IPC::Run::Win32Pump has the following code [1] 110 $buf =~ s/\r//g unless $binmode; This strips out all carriage returns regardless if they are actually followed by a new line character. In the case of my binary data, that removes the \r, corrupting the stream. I've confirmed that this line is the culprit by commenting out that line and verifying that the test program above works as expected.[1] The "$binmode" parameter on that line is essentially set by specifying the IPC::Run binary filter. For example, in System::Command, this 61 # start the command 62 if (MSWin32) { 63 $pid = IPC::Run::start( 64 [@cmd], 65 '<pipe' => $in, 66 '>pipe' => $out, 67 '2>pipe' => $err, 68 ); 69 } becomes this: 61 # start the command 62 if (MSWin32) { 63 $pid = IPC::Run::start( 64 [@cmd], 65 '<pipe' => IPC::Run::binary(), $in, 66 '>pipe' => IPC::Run::binary(), $out, 67 '2>pipe' => IPC::Run::binary(), $err, 68 ); 69 } and the test script works as expected. Comments to the effect of Perl's binmode function not seeming to work on Win32 appear several times in IPC::Run, but at least under 5.18.2, they do seem to work: ------------------------------------------------------------------------------ perl -E 'say pack("N", 13)' | od -c 0000000 \0 \0 \0 \0 \r \r \n perl -E 'binmode \*STDOUT; say pack("N", 13)' | od -c 0000000 \0 \0 \0 \0 \r \n ------------------------------------------------------------------------------ I really don't know if it's possible that IPC::Run could be fixed so that binmode() on the child's filehandles will work, but there's not much activity going on in that codebase, so I think that a solution to get System::Command working on Win32 needs to be done in System::Command. Unfortunately, I think an explicit binmode option may need to be provided by System::Command so that it can be appropriately applied for Windows and other systems. Thanks, Diab [1] Somewhat more sophisticated code is found elsewhere in IPC::Run::Win32IO 206 if ( $self->binmode ) { 207 $data_ref = $self->{SOURCE}; 208 } 209 else { 210 my $data = ${$self->{SOURCE}}; # Ugh, a copy. 211 $data =~ s/(?<!\r)\n/\r\n/g; 212 $data_ref = \$data; 213 } [...] 272 $s =~ s/\r\n/\n/g unless $self->binmode; The code base is complex, so I'm not sure if code to manually fix the crlf is actually needed.
Sorry for the long delay. I've been performing what feels like a random walk through the IPC::Run code. I believe that there are two problems: * It doesn't respond to the binmode() operation * It doesn't properly support the default (text) mode CRLF encoding/decoding. Part of the complexity in the code is that it creates a separate "pumper" process on each filehandle, as according to the author, Windows only allows selects on a socket, so there's some complex gymnastics involving sockets and pipes and child processes that I haven't quite figured out. Anyway, here's what I've got so far: From the Perl docs for binmode, binmode indicates that data are treated as bytes and that no alterations to the bytes are done. In particular on Win32, that means that the newline character "\n" (which is used by Perl to indicate end-of-line) is not translated into "\r\n", which is the proper sequence on Win32. This is the same as the ':raw' PerlIO layer. On Windows, the default mode PerlIO mode should be :crlf mode, which is documented (in PerlIO) as :crlf A layer that implements DOS/Windows like CRLF line endings. On read converts pairs of CR,LF to a single "\n" newline character. On write converts each "\n" to a CR,LF pair. Note that this layer will silently refuse to be pushed on top of itself. I've attached a test program which illustrates what I think is going on. All tests use the following string, which has an extra unpaired \r. foo\n\r\n or in hex 66 6f 6f 0a 0d 0a All expected results are determined by directly writing the test string to files and dumping them or by directly reading files containing the raw (internal Perl, binary) or properly crlf encoded test string Here's the annotated output on Windows XP using Strawberry Perl. default write to pipe: expected: 66 6f 6f 0d 0a 0d 0d 0a got : 66 6f 6f 0a 0a * the 0a characters should have been translated to 0d 0a, but weren't * the extra 0d character is missing. binary write to pipe: expected: 66 6f 6f 0a 0d 0a got : 66 6f 6f 0a 0a * the extra 0d character is missing default read of non-crlf encoded data: expected: 66 6f 6f 0a 0a got : 66 6f 6f 0a 0a * The fiducial result confuses me, as my reading of :crlf mode is that *pairs* of \r\n characters are converted to \n. The extra '\r' has (unexpectedly) disappeared from the fiducial output. binary read of non-crlf encoded data: expected: 66 6f 6f 0a 0d 0a got : 66 6f 6f 0a 0a * the extra 0d character is missing default read of crlf encoded data expected: 66 6f 6f 0a 0d 0a got : 66 6f 6f 0a 0a * the extra 0d character is missing binary read of crlf encoded data expected: 66 6f 6f 0d 0a 0d 0d 0a got : 66 6f 6f 0a 0a * all of the 0d characters are missing The important things to notice are: * as mentioned in the original bug report, binary mode is broken; \r is stripped. * default (text) mode is also broken. * on output, crlf encoding is not performed (and the extra \r is stripped) * on input, crlf decoding strips out the extra \r. Here's what I think is going on. IPC::Run transfers data to and from the child process on Win32 using one of two methods: 1. If the data are read from scalars or named files, temporary files are used. 2. otherwise, a "pumper" process is used. The latter is what is used when IPC::Run is called by System::Command. The bugs are a combination of two things: 1. The main program communicates with the pumper process via sockets, which default to binmode on Win32. If I alter the IPC::Run code to reset the sockets to text mode, text mode works. 2. IPC::Run::Win32IO implements non-binmode internally, but as far as I can tell, doesn't distinguish between translating to/from "\n" to "\r\n"; it always simply removes "\r", regardless of whether it is paired with a "\n". I _think_ that a couple of simple changes to IPC::Run might fix things (removing the internal binmode code, reseting the sockets to text mode), but as I mentioned before, there's not much movement in that code base. I recommend finding another option for Windows. If one exists. I've tried IPC::Exe, but that also has problems (https://rt.cpan.org/Ticket/Display.html?id=93112). I'll continue working on this, at the least to post a bug report for IPC::Run. Thanks for your patience. Diab
Subject: sys_cmd.pl
use System::Command; use IO::File; use strict; use warnings; my $tstr = "foo\n\r\n"; our @hd_f = ( $^X, '-e', '$/=undef; open STDIN, $ARGV[0]; binmode STDIN; print join( q/ /, map { unpack( q/H*/, $_) } split //, <STDIN> ), qq/\n/' ); our @hd_s = ( $^X, '-e', '$/=undef; binmode STDIN; print join( q/ /, map { unpack( q/H*/, $_) } split //, <STDIN> ), qq/\n/' ); our @cat = ( $^X, '-e', '$/=undef; open STDIN, $ARGV[0]; binmode STDIN; binmode STDOUT; print STDOUT <STDIN>;' ); sub hd { join( q/ /, map { unpack( q/H*/, $_) } split //, shift ), qq/\n/; } # these must work, or Perl is borked sub fiducial_write { my ( $write_mode ) = @_; my $mode = $write_mode eq 'default' ? '>' : '>:raw' ; IO::File->new( $write_mode, $mode )->print( $tstr ); print "expected: "; system( @hd_f, $write_mode ); } sub fiducial_read { my ( $write_mode, $read_mode ) = @_; my $mode = $read_mode eq 'default' ? '<' : '<:raw' ; IO::File->new( $write_mode, $mode )->read( my $buf, 1000 ); print "expected: ", hd( $buf ); } { my $cmd = System::Command->new( @hd_s ); $cmd->stdin->print( $tstr ); $cmd->stdin->close; print "default write to pipe:\n"; fiducial_write( 'default' ); print "got : ", $cmd->stdout->getline; print "\n"; } { my $cmd = System::Command->new( @hd_s ); binmode $cmd->stdin; $cmd->stdin->print( $tstr ); $cmd->stdin->close; print "binary write to pipe:\n"; fiducial_write( 'binary' ); print "got : ", $cmd->stdout->getline; print "\n"; } { my $cmd = System::Command->new( @cat, 'binary' ); $cmd->stdin->close; $cmd->stdout->read( my $buf, 100 ); print "default read of non-crlf encoded data: \n"; fiducial_read( 'binary', 'default' ); print "got : ", hd( $buf ); print "\n"; } { my $cmd = System::Command->new( @cat, 'binary' ); binmode $cmd->stdout; $cmd->stdin->close; $cmd->stdout->read( my $buf, 100 ); print "binary read of non-crlf encoded data: \n"; fiducial_read( 'binary', 'binary' ); print "got : ", hd( $buf ); print "\n"; } { my $cmd = System::Command->new( @cat, 'default' ); $cmd->stdin->close; $cmd->stdout->read( my $buf, 100 ); print "default read of crlf encoded data\n"; fiducial_read( 'default', 'default' ); print "got : ", hd( $buf ); print "\n"; } { my $cmd = System::Command->new( @cat, 'default' ); binmode $cmd->stdout; $cmd->stdin->close; $cmd->stdout->read( my $buf, 100 ); print "binary read of crlf encoded data \n"; fiducial_read( 'default', 'binary' ); print "got : ", hd( $buf ); print "\n"; }
Subject: Re: [rt.cpan.org #93056] corrupted binary pipes on Win32
Date: Thu, 28 Aug 2014 00:07:17 +0200
To: Diab Jerius via RT <bug-System-Command [...] rt.cpan.org>
From: "Philippe Bruhat (BooK)" <book [...] cpan.org>
On Tue, Jul 22, 2014 at 12:53:53AM -0400, Diab Jerius via RT wrote: Show quoted text
> > Sorry for the long delay. I've been performing what feels like a > random walk through the IPC::Run code. I believe that there are > two problems: > > * It doesn't respond to the binmode() operation > * It doesn't properly support the default (text) mode CRLF > encoding/decoding.
Many thanks for the detailed reports. If I understood both your emails correctly, you identified the problem in detail, but have no solution (yet). Show quoted text
> I recommend finding another option for Windows. If one exists. I've > tried IPC::Exe, but that also has problems > (https://rt.cpan.org/Ticket/Display.html?id=93112). > > I'll continue working on this, at the least to post a bug report for IPC::Run.
I'm interested in finding/using another option for Windows. If one exists. I think Capture::Tiny probably has the most comprehensive list of options for an IPC::Run replacement: https://metacpan.org/pod/Capture::Tiny#SEE-ALSO If I remember correctly, another issue with IPC::Run is that starting up the pumper program (or spawning the actual process?) takes some time, and when running a lot of commands through System::Command, the equivalent programs running under Linux and Windows can be significantly slower. I noticed that looking at the test suite for Git::Repository (which is the main user of System::Command) running. There are several IPC modules that support Win32, most of them for capturing program output. The specific thing I'm looking for with System::Command is the ability to obtain filehandles connected to the subprocess STDIN, STDOUT and STDERR, so that it's possible to deal with potentially infinite streams line by line. -- Philippe Bruhat (BooK) Treat those you outrank well... you never know when they will outrank you. (Moral from Groo #7 (Image))