
This queue is for tickets about the JSON CPAN distribution.

Report information
The Basics
Id: 43427
Status: resolved
Priority: 0/
Queue: JSON

People
Owner: Nobody in particular
Requestors: mikie [...] google.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Re: Question on JSON::PP handling of \u00xx
Date: Thu, 19 Feb 2009 11:51:58 +0000
To: bug-JSON [...] rt.cpan.org, makamaka [...] cpan.org
From: Mika Raento <mikie [...] google.com>
Ah, that doesn't solve it completely. Setting $is_utf8 will make the
decode call utf8::decode() on the result which turns the nice
character string into bytes. This seems more complicated than I
thought :-(

Mika

On Thu, Feb 19, 2009 at 11:23 AM, Mika Raento <mikie@google.com> wrote:
> Hiya Makamaka
>
> I'm trying to use JSON::PP with non-ascii characters and I find the
> behaviour a bit odd. Characters in the range 128-255 are _not_ made
> into utf-8 / perl characters on decode, instead they come out as bytes
> with those values. This makes it difficult to handle strings with
> those characters as I'd need to go through the results and
> utf8::upgrade() everything.
>
> It's simple to fix - just replace
>     if ((my $hex = hex( $u )) > 255) {
>         $is_utf8 = 1;
>         $s .= JSON_PP_decode_unicode($u) || next;
>     }
> with
>     if ((my $hex = hex( $u )) > 127) {
>         $is_utf8 = 1;
>         $s .= JSON_PP_decode_unicode($u) || next;
>     }
> in JSON/PP.pm around line 804.
>
> However, there are a number of tests that check that we can get bytes
> 128-255 out as-is so it looks like this behaviour was intended, at
> least on some level.
>
> Thoughts?
>
> Thanks,
> Mika Raento
>
> --
> Google UK Limited
> Registered Office: Belgrave House, 76 Buckingham Palace Road, London SW1 9TQ
> Registered in England Number: 3977902
>
--
Google UK Limited
Registered Office: Belgrave House, 76 Buckingham Palace Road, London SW1 9TQ
Registered in England Number: 3977902
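
For readers following the thread, the difference between utf8::upgrade() and utf8::decode() mentioned above can be sketched with core Perl alone. This is only an illustration under stated assumptions, not the code in JSON/PP.pm: it assumes a Perl with JSON::PP installed (it ships with core since 5.14), the variable names are made up for the example, and the internal UTF-8 flag of the decoded string may differ between JSON::PP versions, so the sketch does not rely on it.

```perl
use strict;
use warnings;
use JSON::PP ();

# Decode a JSON text containing the escape \u00e9. Single quotes keep the
# backslash, so JSON::PP itself sees the six characters \u00e9.
my $str = JSON::PP->new->decode('["\u00e9"]')->[0];

# Whatever the internal storage is, the value is one character, U+00E9.
my $len  = length $str;
my $code = ord $str;

# utf8::upgrade() changes only the internal representation, in place.
# The string still compares equal to the original.
my $up = $str;
utf8::upgrade($up);
my $flag_after = utf8::is_utf8($up) ? 1 : 0;
my $same       = $up eq $str ? 1 : 0;

# utf8::decode() is a different operation: it treats the string as
# UTF-8 encoded octets and converts them to characters.
my $octets = "\xc3\xa9";                      # UTF-8 encoding of U+00E9
my $ok     = utf8::decode($octets) ? 1 : 0;
my $match  = $octets eq $str ? 1 : 0;

printf "length=%d code=U+%04X upgraded_flag=%d same=%d decode_ok=%d match=%d\n",
    $len, $code, $flag_after, $same, $ok, $match;
```

The point of the sketch is that upgrading never changes which characters a string holds, while decoding does, so the two calls are not interchangeable inside the decoder.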
> Ah, that doesn't solve it completely. Setting $is_utf8 will make the
> decode call utf8::decode() on the result which turns the nice
> character string into bytes. This seems more complicated than I
> thought :-(
>
> Mika
I replied in rt#43424, so I am closing this ticket.

Regards,