Subject: | iso-2022-jp sometimes doesn't consume input |
See also https://rt.perl.org/Ticket/Display.html?id=126719 where this was reported as a perl bug.
The attached code doesn't produce any decoded data.
I'm not familiar with the encoding involved, but I would have expected one of two outcomes: if those initial bytes are invalid, each invalid byte is replaced with \xHH (per PERLQQ); if they are valid, the corresponding characters are returned.
(The bytes appear to be the UTF-8 encoding of \x{2013}, but that's irrelevant to the misbehaviour.)
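For comparison, here is a minimal sketch of the PERLQQ behaviour I expected. It decodes the same leading bytes with the 'ascii' encoding, which is my choice for illustration and is not part of the original report. The variable names are hypothetical. As I understand the Encode documentation, each byte that cannot be decoded should come back as \xHH, and the rest of the input should pass through unchanged.

```perl
#!perl
use strict;
use warnings;
use Encode;

# Same leading bytes as in the report, but decoded with 'ascii'
# (an assumption made for comparison; the report uses iso-2022-jp).
my $bytes = "\xE2\x80\x93 Europa";
my $ascii = Encode::find_encoding('ascii') or die;

# decode() with a check value may modify its input, so work on a copy.
my $copy  = $bytes;
my $chars = $ascii->decode($copy, Encode::PERLQQ());

print "chars: $chars\n";
print "left over: ", length($copy), " byte(s)\n";
```

If iso-2022-jp treated those bytes as invalid, I would expect the same kind of result from it.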
Tony
Subject: | 126713d.pl |
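#!perl
use strict;
use Encode;

# Input starts with three non-ASCII bytes (UTF-8 for U+2013).
my $str = "\xE2\x80\x93 Europa ...\n";

# PERLQQ: replace each invalid byte with \xHH; WARN_ON_ERR: warn on errors;
# STOP_AT_PARTIAL: stop at a partial character instead of treating it as an error.
my $check = Encode::PERLQQ()|Encode::WARN_ON_ERR()|Encode::STOP_AT_PARTIAL();
my $enc = Encode::find_encoding('iso-2022-jp') or die;

# With a check value and no LEAVE_SRC, decode() consumes $str in place,
# so whatever remains in $str afterwards was not consumed.
my $chars = $enc->decode($str, $check);
print "str ", display($str), "\n";
print "chars ", display($chars), "\n";

# Return a copy of the argument with everything outside \x21-\x7e escaped as \x{...}.
sub display {
    $_[0] =~ s/([^\x21-\x7e])/sprintf("\\x{%x}", ord($1))/ger;
}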