Subject: utf8 conversion does not work
Date: Tue, 10 Jun 2008 16:18:46 +0200
To: <bug-JSON [...] rt.cpan.org>
From: Morten Bjørnsvik <morten.bjornsvik [...] experian-da.no>
Hi
I'm unable to get a reliable round trip of Norwegian characters from Perl into JSON and then back into Perl.
I used a Unicode (UTF-8) font for testing.
#!/opt/perl/bin/perl -w
use Data::Dumper;
use JSON;

# just a perl test structure
my $orig = {
    desc => 'norwegian characters:',
    c1   => ['Æ','æ','Ø','ø','Å','å'],
    c2   => ["ÆæØøÅå"],
};
print "Original perl hashref: ", Dumper($orig);

my $json1 = to_json($orig, {ascii=>1});
print "json text:", $json1, "\n";

my $perl1 = from_json($json1, {ascii=>1});
print "back to perl hashref:", Dumper($perl1);
With ascii=>1 the conversion back to Perl is correct, but the JSON text is broken: each character comes out as two \u escapes instead of one.
Original perl hashref: $VAR1 = {
'desc' => 'norwegian characters:',
'c2' => [
'ÆæØøÅå'
],
'c1' => [
'Æ',
'æ',
'Ø',
'ø',
'Å',
'å'
]
};
json text:{"desc":"norwegian characters:","c2":["\u00c3\u0086\u00c3\u00a6\u00c3\u0098\u00c3\u00b8\u00c3\u0085\u00c3\u00a5"],"c1":["\u00c3\u0086","\u00c3\u00a6","\u00c3\u0098","\u00c3\u00b8","\u00c3\u0085","\u00c3\u00a5"]}
back to perl hashref:$VAR1 = {
'desc' => 'norwegian characters:',
'c2' => [
'ÆæØøÅå'
],
'c1' => [
'Æ',
'æ',
'Ø',
'ø',
'Å',
'å'
]
};
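
One possible reading of the JSON text above: the script has no "use utf8", so each literal would reach the encoder as its UTF-8 bytes rather than as one character, and \u00c3\u0086 is exactly the byte pair C3 86 that encodes Æ. The sketch below illustrates that difference only. The helper escape_ascii is hypothetical and is not part of the JSON module; it just mimics what an ASCII-escaping encoder does with whatever string elements it is handed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of JSON): escape every element of a
# string as \uXXXX, the way an ASCII-only JSON encoder would.
sub escape_ascii {
    my ($string) = @_;
    return join '', map { sprintf('\\u%04x', ord($_)) } split //, $string;
}

# Written with escapes so the result does not depend on how this file is saved.
my $as_bytes = "\xC3\x86";   # the two UTF-8 bytes of U+00C6 (Æ)
my $as_char  = "\x{C6}";     # the single character U+00C6

printf "bytes: length %d, escaped %s\n", length($as_bytes), escape_ascii($as_bytes);
printf "char:  length %d, escaped %s\n", length($as_char),  escape_ascii($as_char);
```

The first line shows the two-escape form seen in the report, the second the single escape a JSON consumer would expect for Æ.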
With utf8=>1 in place of ascii=>1, everything is broken, both the JSON text and the conversion back to Perl:
Original perl hashref: $VAR1 = {
'desc' => 'norwegian characters:',
'c2' => [
'ÆæØøÅå'
],
'c1' => [
'Æ',
'æ',
'Ø',
'ø',
'Å',
'å'
]
};
json text:{"desc":"norwegian characters:","c2":["ÃæÃøÃÃ¥"],"c1":["Ã","æ","Ã","ø","Ã","Ã¥"]}
back to perl hashref:$VAR1 = {
'desc' => 'norwegian characters:',
'c2' => [
"\x{c3}\x{86}\x{c3}\x{a6}\x{c3}\x{98}\x{c3}\x{b8}\x{c3}\x{85}\x{c3}\x{a5}"
],
'c1' => [
"\x{c3}\x{86}",
"\x{c3}\x{a6}",
"\x{c3}\x{98}",
"\x{c3}\x{b8}",
"\x{c3}\x{85}",
"\x{c3}\x{a5}"
]
};
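
For comparison, a minimal sketch of the same round trip when the literals are character strings. It assumes the source file is saved as UTF-8 and declares that with "use utf8", and it uses the core module JSON::PP as a stand-in for JSON, so the object calls below are JSON::PP's, not the to_json/from_json calls from the script above. It is an illustration under those assumptions, not a statement about what the JSON module should do with the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;        # assumption: this file is saved as UTF-8
use JSON::PP;    # core module, used here instead of JSON

my $orig = {
    desc => 'norwegian characters:',
    c1   => ['Æ','æ','Ø','ø','Å','å'],
    c2   => ['ÆæØøÅå'],
};

# Separate encoder objects, because the option setters modify the object.
my $ascii = JSON::PP->new->canonical->ascii;   # pure ASCII text with \u escapes
my $utf8  = JSON::PP->new->canonical->utf8;    # UTF-8 encoded bytes in and out

my $ascii_text = $ascii->encode($orig);
my $utf8_bytes = $utf8->encode($orig);
my $back       = $utf8->decode($utf8_bytes);

print "ascii json: $ascii_text\n";             # one \u escape per character
printf "c2 has %d characters after the round trip\n", length $back->{c2}[0];
```

With character strings as input, each letter is a single element of the string, so the ascii output carries one escape per letter and the decoded c2 string has six characters.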
--
Morten Bjørnsvik
Experian Decision Analytics AS
PB 121, 0102 Oslo, Norway
Morten.bjornsvik@experian-da.no