
This queue is for tickets about the Data-Serializer CPAN distribution.

Report information
The Basics
Id: 68125
Status: resolved
Priority: 0/
Queue: Data-Serializer

People
Owner: neil [...] neely.cx
Requestors: colink [...] perlDreamer.com
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 0.57
Fixed in: (no value)



Subject: Data::Serializer::JSON doesn't handle non-ASCII text well
If you use Data::Serializer::JSON to serialize and then deserialize references containing high-byte characters or UTF-8, what is returned to you isn't encoded properly. I wrote a small test that shows this, and the attached patch fixes the problem. I apologize for the test, since it isn't written using ExtUtils::TBone, but it should still show the essence of the problem.
Subject: encoding.tar
Download encoding.tar
application/x-tar 10k


What version of JSON are you using? The code your patch removes was provided by Makamaka, the author of JSON.pm (see Changes). Before applying the patch I'd like a better understanding of when this is a problem and for which versions, in case the original code is right for some versions of JSON but not others. Since I don't use utf8 in any day-to-day capacity, I'm relying on external feedback, and I'd like to be cautious before replacing code that the original module author supplied. I just want to be sure we're not fixing one set of utf8 cases by breaking another. Thanks, Neil
From: colink [...] perlDreamer.com
I'm using JSON 2.51. The tests I provided along with the patch cover both high-bit characters (anything above ASCII code 127) and UTF-8. The problem with Data::Serializer::JSON right now is that it only handles the encoding in one direction: serialize applies UTF-8 encoding, but deserialize does not reverse it. We found this when we were using CHI and replaced the default Storable serializer with JSON; all the pages with UTF-8 and high-bit characters (like Microsoft smart quotes) started displaying strangely. I understand being cautious about the patch. If the author of JSON.pm provided the original, he should certainly know what he's doing! Maybe he'd like to review the patch and the tests, or maybe someone else who's familiar with encodings?
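For anyone trying to reproduce the symptom outside CHI, here is a minimal sketch of the one-way encoding described above. It is not the code from Data::Serializer::JSON or from encoding.tar; it assumes the core JSON::PP module behaves like JSON.pm for the `utf8` option, and the sample string is made up for illustration.

```perl
use strict;
use warnings;
use JSON::PP;   # assumption: core JSON::PP stands in for JSON.pm 2.51

# Smart quotes (U+201C, U+201D) around a word with a high-bit character (U+00E9).
my $data = { text => "\x{201C}caf\x{E9}\x{201D}" };

# Encoding with ->utf8 produces a byte string of UTF-8 octets.
my $bytes = JSON::PP->new->utf8->encode($data);

# Decoding without ->utf8 treats each octet as its own character,
# so the multi-byte sequences are never reassembled.
my $back = JSON::PP->new->decode($bytes);

printf "characters before: %d\n", length $data->{text};
printf "characters after:  %d\n", length $back->{text};
print  "round trip ", ($back->{text} eq $data->{text} ? "ok" : "MISMATCH"), "\n";
```

If the assumption holds, the decoded string comes back longer than the original, which is the kind of garbling that shows up as strange characters on a page.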
From: lm [...] sunnyspot.org
I've experienced the same issue. My solution simply involves adding "->utf8" before the "->decode" call on line #14, as is already done for the encode call in the serialize method.
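A rough sketch of what the symmetric version looks like, for reference. The package name `My::JSONSerializer` and its two methods are hypothetical and only mirror the shape of the change described above; this is not the actual Data::Serializer::JSON source, and it uses core JSON::PP on the assumption that its `utf8` option matches JSON.pm.

```perl
package My::JSONSerializer;   # hypothetical name, for illustration only
use strict;
use warnings;
use JSON::PP;                 # assumption: stands in for JSON.pm

# Character data in, UTF-8 encoded JSON bytes out.
sub serialize {
    my ($class, $ref) = @_;
    return JSON::PP->new->utf8->encode($ref);
}

# UTF-8 encoded JSON bytes in, character data out.
# The ->utf8 here is the piece the suggested fix adds.
sub deserialize {
    my ($class, $json) = @_;
    return JSON::PP->new->utf8->decode($json);
}

package main;

# One character below 256 (U+00EF) and two above it (U+2019).
my $in     = { text => "na\x{EF}ve \x{2019}quoted\x{2019}" };
my $frozen = My::JSONSerializer->serialize($in);
my $out    = My::JSONSerializer->deserialize($frozen);

print $out->{text} eq $in->{text} ? "round trip ok\n" : "round trip FAILED\n";
```

With `utf8` enabled on both sides, the serialized form is always bytes and the deserialized form is always characters, so callers such as a cache layer can store the output without caring about the encoding.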
RT-Send-CC: lm [...] sunnyspot.org
I've uploaded version 0.58 with the fix both of you supplied for this (the two patches were functionally equivalent). I added the tests from the originally supplied patch, so the utf8 tests now run as part of the test suite for all serializers. XML::Dumper choked badly on utf8 and had no obvious workaround, so UTF-8 testing is turned off for it for now. Thank you for your patience in getting this bug resolved.