Subject: | Pumping UTF-8 data mangles it |
Attached is a self-contained test script showing that pump fails to send
UTF-8 data to the child intact, even though the environment variables
PERL_UNICODE=SDA LC_ALL=en_US.UTF-8 LANG=en_US.UTF-8 are set.
I looked through IPC/Run.pm, and adding "use bytes;" to _write() at
least makes that part send the data correctly.
However, I couldn't get it to read back the output from the child as
UTF-8; it just reads the raw bytes interpreted as ISO-8859-1, and
forcing all _read() strings to UTF-8 seemed like a really bad idea...
Somewhere there must be a place where it does not respect the global
binmode, since I expected PERL_UNICODE=SDA to give all I/O UTF-8 semantics.
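A possible workaround is to convert explicitly at the pipe boundary instead of relying on PERL_UNICODE. The following is only a minimal sketch using the core Encode module. The helpers to_child and from_child are hypothetical names, not part of IPC::Run, and the sketch assumes the child expects and emits UTF-8.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                          # string literals in this file are UTF-8
use Encode qw(encode decode);

# Hypothetical helpers (not part of IPC::Run).
# Characters -> UTF-8 bytes, for data about to be written to the child.
sub to_child {
    my ($chars) = @_;
    return encode('UTF-8', $chars);
}

# UTF-8 bytes -> characters, for data read back from the child.
sub from_child {
    my ($bytes) = @_;
    return decode('UTF-8', $bytes);
}

my $line  = "æøå";
my $bytes = to_child($line);       # what would be placed in $in
my $chars = from_child($bytes);    # what would be done with $out

# Three characters become six bytes on the wire.
printf "characters: %d, bytes: %d\n", length($line), length($bytes);
print "round trip ok\n" if $chars eq $line;
```

In the attached script this would amount to assigning to_child("$line\n") to $in and passing $out through from_child before printing it. Whether that covers every code path in IPC::Run is untested here.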
Subject: | ipc-run-fail.pl |
#!/usr/bin/perl
# Runtime environment: export LC_ALL=en_US.UTF-8 LANG=en_US.UTF-8 PERL_UNICODE=SDA
BEGIN { $| = 1; }
use utf8;
use strict;
use warnings;
if (! defined $ENV{IPC_SUB_PROCESS}) {
    $ENV{IPC_SUB_PROCESS} = 1;
    use IPC::Run qw( start pump finish timeout binary );
    my $in;
    my $out;
    my @cmd = (__FILE__);
    my $h = start \@cmd, \$in, \$out, timeout(2);
    my @lines = ("abc", "æçæ室", "æøå");
    foreach my $line (@lines) {
        print "Handler input: $line\n";
        $in = $line;
        $in .= "\n";
        pump $h until $out =~ /.+/s;
        print "Handler output: $out\n";
        $out = '';
    }
}
else {
    while (my $line = <STDIN>) {
        $line =~ s/^\s+//g;
        $line =~ s/\s+$//g;
        print STDERR "Daemon input/output: ", $line, "\n";
        print $line, "\n";
    }
}