This queue is for tickets about the Unicode-MapUTF8 CPAN distribution.

Report information
The Basics
Id: 22129
Status: new
Priority: 0/
Queue: Unicode-MapUTF8

People
Owner: Nobody in particular
Requestors: MSCHILLI [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 1.11
Fixed in: (no value)



Subject: Ascii chars dropped when converting to BIG5
Thanks for creating this module, it's very useful!

It seems that Unicode-MapUTF8's from_utf8() function gets confused when the string starts with ASCII characters: they are dropped from the result.

Here's an example that uses Encode's from_to() function to show that Encode handles the conversion fine, while Unicode-MapUTF8 drops the leading ASCII characters:

    ./maputf8.pl
    orig_str = [Foobar Hongkong Limited ª©Åv©Ò¦³ ¤£±oÂà¸ü ]
    from big5 to utf8 = [Foobar Hongkong Limited çæ¬ææ ä¸å¾è½è¼ ]
    from utf8 to big5 = [Foobar Hongkong Limited ª©Åv©Ò¦³ ¤£±oÂà¸ü ]
    big5 str converted by MapUTF8 = [ª©Åv©Ò¦³¤£±oÂà¸ü ]

It would be great if you could look into it, thanks!

Mike Schilli
m@perlmeister.com
Subject: maputf8.pl
#!/usr/local/bin/perl -w
use 5.008;
use strict;

use Unicode::MapUTF8 qw(to_utf8 from_utf8 utf8_supported_charset);
use Encode qw(from_to);

# Big5 test string: ASCII prefix followed by double-byte characters
my $big5_str = "Foobar Hongkong Limited \xaa\xa9\xc5v\xa9\xd2\xa6\xb3 \xa4\xa3\xb1o\xc2\xe0\xb8\xfc";
my $encoding = "big5";
print "orig_str = [$big5_str ]\n";

# Convert Big5 to UTF-8 with Encode (from_to converts in place)
my $big5_str_bk = $big5_str;
my $b2u = from_to($big5_str_bk, "big5", "utf8");
print "from big5 to utf8 = [$big5_str_bk ]\n";

# Keep the UTF-8 copy, then convert back to Big5 with Encode
my $utf8_str = $big5_str_bk;
my $u2b = from_to($big5_str_bk, "utf8", "big5");
print "from utf8 to big5 = [$big5_str_bk ]\n";

# Same UTF-8 to Big5 conversion with Unicode::MapUTF8
my $conv_str = from_utf8({ -string => $utf8_str, -charset => $encoding });
print "big5 str converted by MapUTF8 = [$conv_str ]\n";
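
The comparison in the script above could also be written as a self-checking round trip, which may be handy as a regression check. This is only a sketch under assumptions: roundtrip_big5() is a hypothetical helper that is not part of either module, the Encode round trip is taken to be byte-exact because the output in this report shows it to be, and the Unicode::MapUTF8 comparison is skipped if that module cannot be loaded.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# Big5 bytes from the report: ASCII prefix followed by double-byte characters.
my $big5 = "Foobar Hongkong Limited \xaa\xa9\xc5v\xa9\xd2\xa6\xb3 \xa4\xa3\xb1o\xc2\xe0\xb8\xfc";

# Hypothetical helper (not from either module): Big5 -> UTF-8 -> Big5 via Encode only.
sub roundtrip_big5 {
    my ($bytes) = @_;
    my $utf8 = encode("UTF-8", decode("big5", $bytes));
    return encode("big5", decode("UTF-8", $utf8));
}

# Reference result: Encode is expected to return the original bytes.
my $back = roundtrip_big5($big5);
print $back eq $big5 ? "Encode round trip: same bytes\n"
                     : "Encode round trip: bytes differ\n";

# Optional comparison against Unicode::MapUTF8; undef if the module
# is not installed or the call fails.
my $utf8 = encode("UTF-8", decode("big5", $big5));
my $got  = eval {
    require Unicode::MapUTF8;
    Unicode::MapUTF8::from_utf8({ -string => $utf8, -charset => "big5" });
};

if (defined $got) {
    print $got eq $big5 ? "MapUTF8: same bytes\n" : "MapUTF8: bytes differ\n";
}
else {
    print "MapUTF8: not available, comparison skipped\n";
}
```

Comparing raw bytes rather than printed output avoids depending on how the terminal renders Big5 or UTF-8.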