Subject: | UTF-8 not handled properly |
When sending an HTTP message that was constructed with UTF-8 characters, the server sends truncated messages.
I found this while testing a server with soapUI. The message payload is a complex XML structure containing UTF-8 characters, stored as a string (with the utf8 flag set) and passed to SOAP::Lite for transmission. The resulting message is truncated by a few bytes.
I traced through the code and found that the Content-Length header is calculated properly, but the message content is re-encoded as Latin-1 on output, which makes the UTF-8 characters grow in size. The body on the wire is then longer than Content-Length announces, so the client stops reading prematurely and the message is truncated.
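To make the size mismatch concrete, here is a minimal sketch using only the core Encode module. It is not SOAP::Lite's actual code path; the variable names are mine, and it only shows how encoding already-encoded bytes a second time makes the body longer than the length computed from the correct bytes.

```perl
use strict;
use warnings;
use Encode qw(encode);

# A character string with one non-ASCII character (U+00DC).
my $text = "\x{dc}berall";

# Correct wire form: every character encoded once as UTF-8.
my $utf8_bytes = encode('UTF-8', $text);

# Faulty wire form: the UTF-8 bytes are mistaken for Latin-1
# characters and encoded to UTF-8 a second time.
my $double_encoded = encode('UTF-8', $utf8_bytes);

printf "characters:     %d\n", length $text;            # 7
printf "UTF-8 bytes:    %d\n", length $utf8_bytes;      # 8
printf "double encoded: %d\n", length $double_encoded;  # 10
```

A Content-Length of 8 followed by a 10-byte body would leave the last two bytes unread, which matches the "truncated by a few bytes" symptom.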
The issue is that SOAP::Lite uses the wrong method to retrieve the data from HTTP::Message: it should call decoded_content() instead of plain content(). I created a patch to fix this.
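For reference, a small sketch of how the two accessors differ. It assumes HTTP::Response (part of the HTTP::Message distribution that LWP pulls in) is installed; the response and its payload are made up for illustration and do not come from SOAP::Lite.

```perl
use strict;
use warnings;
use Encode qw(encode);
use HTTP::Response;

# Hypothetical response whose body is UTF-8 encoded XML.
my $body_bytes = encode('UTF-8', "<r>\x{dc}berall</r>");
my $response   = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/xml; charset=utf-8' ],
    $body_bytes,
);

# content() returns the raw bytes exactly as stored.
print "content:         ", length($response->content), " bytes\n";

# decoded_content() decodes those bytes according to the declared
# charset and returns a Perl character string.
print "decoded_content: ", length($response->decoded_content), " characters\n";
```

Which of the two is safe to print depends on the layers set on the output handle: a character string needs an encoding layer, while raw bytes must not pass through one.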
This patch also impacts a test case: SOAP/Transport/HTTP/CGI.t
The test payload in this test case uses a UTF-8 string copied and pasted from somewhere, but the script neither uses the 'utf8' pragma to declare that its source contains a UTF-8 character nor calls Encode::_utf8_on to tell Perl that the string is UTF-8. I updated the test to replace the UTF-8 character with the escape '\x{DC}', which causes the string to be built properly. The test server also needed to flag its STDOUT and STDIN as UTF-8.
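The difference between the pasted literal and the escape can be sketched as follows. This is an illustration with core modules only; the byte values stand in for what Perl sees when a UTF-8 encoded source file lacks 'use utf8', and an in-memory handle stands in for STDOUT.

```perl
use strict;
use warnings;
use Encode qw(decode);

# A pasted UTF-8 literal without 'use utf8' is seen as raw bytes:
# 0xC3 0x9C instead of the single character U+00DC.
my $raw_literal = "\xC3\x9Cberall";

# The escape names the code point directly, so the encoding of the
# source file no longer matters.
my $escaped = "\x{dc}berall";

# Decoding the raw bytes gives the same character string.
my $decoded = decode('UTF-8', $raw_literal);

print length($raw_literal), "\n";                          # 8
print length($escaped), "\n";                              # 7
print $decoded eq $escaped ? "equal\n" : "different\n";    # equal

# A handle with the :utf8 layer encodes characters on output.
my $buffer = '';
open my $out, '>', \$buffer or die "cannot open in-memory handle: $!";
binmode $out, ':utf8';
print {$out} $escaped;
close $out;
print length($buffer), "\n";                               # 8
```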
The attached patch is against 0.715 but has been tested with 0.716 as well.
Subject: | SOAP-Lite-0.715-utf8_correction.patch |
diff -uNr SOAP-Lite-0.715.orig/lib/SOAP/Transport/HTTP.pm SOAP-Lite-0.715/lib/SOAP/Transport/HTTP.pm
--- SOAP-Lite-0.715.orig/lib/SOAP/Transport/HTTP.pm 2012-07-15 05:18:44.000000000 -0400
+++ SOAP-Lite-0.715/lib/SOAP/Transport/HTTP.pm 2013-07-02 13:07:38.930105900 -0400
@@ -615,7 +615,7 @@
print STDOUT "$status $code ", HTTP::Status::status_message($code),
"\015\012", $self->response->headers_as_string("\015\012"), "\015\012",
- $self->response->content;
+ $self->response->decoded_content;
}
# ======================================================================
diff -uNr SOAP-Lite-0.715.orig/t/SOAP/Transport/HTTP/CGI/test_server.pl SOAP-Lite-0.715/t/SOAP/Transport/HTTP/CGI/test_server.pl
--- SOAP-Lite-0.715.orig/t/SOAP/Transport/HTTP/CGI/test_server.pl 2010-06-03 11:33:24.000000000 -0400
+++ SOAP-Lite-0.715/t/SOAP/Transport/HTTP/CGI/test_server.pl 2013-07-02 13:15:54.915699500 -0400
@@ -7,11 +7,14 @@
dispatch_to => 'main'
);
+binmode STDIN, ":utf8";
+binmode STDOUT, ":utf8";
+
$soap->handle();
sub test {
my ($self, $envelope) = @_;
- return SOAP::Data->name('testResult')->value('Überall')->type('string');
+ return SOAP::Data->name('testResult')->value("\x{dc}berall")->type('string');
}
diff -uNr SOAP-Lite-0.715.orig/t/SOAP/Transport/HTTP/CGI.t SOAP-Lite-0.715/t/SOAP/Transport/HTTP/CGI.t
--- SOAP-Lite-0.715.orig/t/SOAP/Transport/HTTP/CGI.t 2010-06-03 11:33:24.000000000 -0400
+++ SOAP-Lite-0.715/t/SOAP/Transport/HTTP/CGI.t 2013-07-02 13:15:45.540762100 -0400
@@ -56,7 +56,7 @@
if ($] >= 5.008) {
ok utf8::is_utf8($result), 'return utf8 string';
{
- is $result, 'Überall', 'utf8 content: ' . $result;
+ is $result, "\x{dc}berall", 'utf8 content: ' . $result;
}
}
else {