Subject: | Unicode data not correctly encoded |
The attached file is based on the example in:
http://stackoverflow.com/questions/9365402/how-to-convince-soaplite-to-return-utf-8-data-in-responses-as-utf-8
I had the same problem as the poster. I added an en dash to the example
data, as that was the character that caused me grief. Basically, strings
with the is_utf8 flag set are being picked up by the base64 match (they
contain a character with a value > 0x7F), whereas if they are Unicode
strings (perversely, is_utf8($val) == 1) they should be treated as
strings. The problem is that the deserializer doesn't know whether the
string should be Unicode or not, and leaves it alone.
Uncommenting the $ser->typelookup... line effectively fixes the problem,
though it may have unforeseen consequences.
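To illustrate why the extra typelookup entry helps, here is a minimal sketch in plain Perl. It is a toy model, not SOAP::Lite's actual code: the %lookup table, the pick_handler helper, the handler names and the simplified "any character above 0x7F" regex are all assumptions made for illustration. It only shows how an entry with a lower priority number can claim a flagged string before a base64-style check sees it.

```perl
use strict;
use warnings;
use utf8;

# Toy model of a priority-ordered type lookup table (hypothetical, NOT the
# real SOAP::Lite table). Each entry is [priority, test, handler name];
# the entry with the lowest priority number is tried first.
my %lookup = (
    base64 => [ 10,  sub { $_[0] =~ /[^\x00-\x7F]/ }, 'as_base64' ],
    string => [ 100, sub { 1 },                       'as_string' ],
);

# Return the handler name of the first entry whose test accepts the value.
sub pick_handler {
    my ( $table, $value ) = @_;
    for my $name ( sort { $table->{$a}[0] <=> $table->{$b}[0] } keys %$table ) {
        return $table->{$name}[2] if $table->{$name}[1]->($value);
    }
    return;
}

# Character string containing "ü" and an en dash (U+2013).
my $data = "mü\x{2013}";

# Without the extra entry, the base64-style test matches first.
my $before = pick_handler( \%lookup, $data );

# Add an entry that runs before the base64 check and claims flagged strings.
$lookup{flagged} = [ 9, sub { utf8::is_utf8( $_[0] ) }, 'as_string' ];
my $after = pick_handler( \%lookup, $data );

print "before: $before\n";
print "after:  $after\n";
```

Under these assumptions the first lookup selects as_base64 and the second selects as_string. Note that the same entry would also claim any flagged string that genuinely was meant to be sent as base64, which is one of the possible unforeseen consequences.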
Subject: | SoapLitEncode.pm |
use strictures;
use Test::More;
use SOAP::Lite;
use utf8;
use Data::Dumper;
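# Test data: "ü" plus an en dash (U+2013), so the string has is_utf8 set.
my $data = "mü\x{2013}";
my $ser = SOAP::Serializer->new;
# Low priority number (9) so this entry is tried before the base64 match;
# flagged strings are then serialized as plain strings.
$ser->typelookup->{trick_into_ignoring} = [9, \&utf8::is_utf8, 'as_utf8_string'];
my $xml = $ser->envelope( freeform => $data );
# Round-trip the value through the deserializer and compare lengths.
my ( $cycled ) = values %{ SOAP::Deserializer->deserialize( $xml )->body };
is( length( $cycled ), length( $data ), "UTF-8 string has the same length after a round trip" );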
done_testing;
# Unused helper: a named wrapper around utf8::is_utf8.
sub check_utf8 {
    my ($val) = @_;
    return utf8::is_utf8($val);
}
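package SOAP::Serializer;
# Handler named in the typelookup entry above; it simply delegates to
# the existing as_string method.
sub as_utf8_string {
    my $self = shift;
    my ($value, $name, $type, $attr) = @_;
    return $self->as_string($value, $name, $type, $attr);
}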
1;