Subject: Text::Unidecode for any charset
It would be nice if Text::Unidecode could decode into any arbitrary
charset, not only ASCII. A sample, probably inefficient implementation
could look like this:
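use strict;
use warnings;
use Text::Unidecode;
use Encode qw(encode);
use charnames qw(:full);

my $tocharset = "iso-8859-1";
my $x = "\xfc\x{20ac}\N{HORIZONTAL ELLIPSIS}\N{LEFT DOUBLE QUOTATION MARK}";
my $res = "";
for my $char (split //, $x) {
    # Encode substitutes "?" for characters missing from the target charset
    my $conv = encode($tocharset, $char);
    if ($char ne "?" && $conv eq "?") {
        # No mapping in the target charset: fall back to the ASCII transliteration
        $res .= unidecode($char);
    } else {
        # The character exists in the target charset: keep the encoded form
        $res .= $conv;
    }
}
print $res, "\n";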
__END__
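
A possibly cheaper variant of the loop above, as a rough sketch only:
reasonably recent versions of Encode accept a code reference as the
CHECK argument to encode(). The callback receives the ordinal of each
unmappable character and returns the replacement octets, so the string
is encoded in one call instead of once per character. The function name
unidecode_to is made up for illustration, and the sketch assumes the
target charset is ASCII-compatible (the same assumption the loop makes).

```perl
use strict;
use warnings;
use Text::Unidecode;
use Encode qw(encode);

# Hypothetical helper, not part of Text::Unidecode: encode $text into
# $charset, transliterating only the characters the charset cannot hold.
sub unidecode_to {
    my ($charset, $text) = @_;
    # The coderef is called with the ordinal of each unmappable character;
    # its return value is inserted into the output as raw octets.
    return encode($charset, $text, sub { unidecode(chr $_[0]) });
}

my $x = "\xfc\x{20ac}\x{2026}\x{201c}";
print unidecode_to("iso-8859-1", $x), "\n";
```

Unlike the loop, this does not need the special case for a literal "?"
in the input, because the fallback is only invoked for characters that
really have no mapping.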
Regards,
Slaven