
This queue is for tickets about the XML-Dumper CPAN distribution.

Report information
The Basics
Id: 52071
Status: open
Priority: 0/
Queue: XML-Dumper

People
Owner: Nobody in particular
Requestors: dmuey [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Critical
Broken in: 0.81
Fixed in: (no value)



Subject: de-serialization is not always "round-trip" safe
Strings, array values, and hash keys and values that are UTF-8 are returned as hex strings (e.g. '\x{3c0}' instead of 'π'). This patch makes xml2pl() return exactly the data that was sent to pl2xml() (e.g. Test::More::is_deeply() won't fail and you won't get "Wide character" warnings). Patch and test script will be attached next.
There may be a better solution, but this one simply works. The oddity to keep in mind is that utf8::is_utf8() apparently returns true for these values, so they never get utf8::encode()d when it is used as a condition. You do not need 'use utf8' or 'no utf8' to make these functions available. I tried defining the 'Char' handler for XML::Parser, but init() (called without args in xml2pl()) blows away any handler you define. If you hack the setting in, it applies only to values, not to keys ...
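To illustrate the is_utf8() oddity, here is a minimal sketch using only the core utf8:: functions. It assumes the parsed value arrives as a decoded character string (which is what the '\x{3c0}' in the test output suggests) and that the original script held the raw UTF-8 octets; the variable names are made up for the example and nothing here is taken from XML::Dumper itself.

```perl
use strict;
use warnings;

# Assumption: what comes back from the parser, one decoded character (U+03C0).
my $decoded = "\x{3c0}";

# Assumption: what a script without 'use utf8' holds for a literal pi,
# i.e. the two UTF-8 octets.
my $octets = "\xcf\x80";

# The internal flag is already on, so a guard such as
# "utf8::encode($x) if !utf8::is_utf8($x)" would skip the encode.
print utf8::is_utf8($decoded) ? "flag on\n" : "flag off\n";

# The two strings do not compare equal, so the round trip is not safe.
print $decoded eq $octets ? "same\n" : "different\n";

# utf8::encode() works in place: the character becomes UTF-8 octets.
my $copy = $decoded;
utf8::encode($copy);

print length($copy), "\n";
print $copy eq $octets ? "same\n" : "different\n";
```

That would explain why the patch calls utf8::encode() unconditionally and leaves the is_utf8() condition commented out.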
--- /usr/local/lib/perl5/site_perl/5.8.9/XML/Dumper.pm.orig 2009-11-25 13:58:55.000000000 -0600
+++ /usr/local/lib/perl5/site_perl/5.8.9/XML/Dumper.pm 2009-11-25 13:58:33.000000000 -0600
@@ -570,6 +570,7 @@
 my $item_tree = $tree->[$i+1][$j+1];
 if( exists $item_tree->[0]{ key } ) {
 my $key = $item_tree->[ 0 ]{ key };
+utf8::encode($key); # if !utf8::is_utf8($item); # rt52071
 if( exists $item_tree->[ 0 ]{ 'defined' } ) {
 if( $item_tree->[ 0 ]{ 'defined' } =~ /false/ ) {
 $ref->{ $key } = undef;
@@ -619,6 +620,7 @@
 my $item_tree = $tree->[$i+1][$j+1];
 if( exists $item_tree->[0]{ key } ) {
 my $key = $item_tree->[0]{ key };
+utf8::encode($key); # if !utf8::is_utf8($item); # rt52071
 if( exists $item_tree->[ 0 ]{ 'defined' } ) {
 if( $item_tree->[ 0 ]{ 'defined' } =~ /false/ ) {
 $ref->[ $key ] = undef;
@@ -658,6 +660,7 @@
 if( /^0$/ ) { # SIMPLE SCALAR
 # ----------------------------------------
 $item = $tree->[$i + 1];
+utf8::encode($item); # if !utf8::is_utf8($item); # rt52071
 }
 }
 }
[ -- as is -- ]
# perl xmltest.perl
1..1
Use of uninitialized value in concatenation (.) or string at xmltest.perl line 27.
# Orig: π XML:
not ok 1 - same struct
# Failed test 'same struct'
# at xmltest.perl line 28.
Wide character in print at /usr/local/lib/perl5/5.8.9/Test/Builder.pm line 1698.
# Structures begin differing at:
# $got->{π} = Does not exist
# $expected->{π} = 'π'
# Looks like you failed 1 test of 1.
#

[ -- w/ patch in previous post -- ]
# perl xmltest.perl
1..1
# Orig: π XML: π
ok 1 - same struct
#

[ -- summary -- ]
$got->{π} Does not exist because it is '\x{3c0}'
Wide character in print also because it is misencoded.
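The "Does not exist" line can be reproduced without XML::Dumper at all. The sketch below is not the attached xmltest.perl; the hash contents and variable names are assumptions chosen to mirror the summary above, with the returned key and value as decoded characters and the expected key as UTF-8 octets.

```perl
use strict;
use warnings;

# Assumption: the key the calling script expects, as raw UTF-8 octets.
my $expected_key = "\xcf\x80";

# Assumption: the structure as it comes back unpatched, with the key
# and the value both decoded to the single character U+03C0.
my %got = ( "\x{3c0}" => "\x{3c0}" );

# The lookup misses because the two keys are different strings.
print exists $got{$expected_key} ? "exists\n" : "does not exist\n";

# Encode keys and values back to octets, as the patch does while parsing.
my %fixed;
for my $k ( keys %got ) {
    my ( $key, $value ) = ( $k, $got{$k} );
    utf8::encode($key);
    utf8::encode($value);
    $fixed{$key} = $value;
}

print exists $fixed{$expected_key} ? "exists\n" : "does not exist\n";
```

Once both sides are octets the key matches again, and the value no longer contains a character above 0xFF, so printing it would not trigger "Wide character".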