
This queue is for tickets about the WebService-Solr CPAN distribution.

Report information
The Basics
Id: 47012
Status: stalled
Priority: 0/
Queue: WebService-Solr

People
Owner: Nobody in particular
Requestors: knutolav [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 0.06
Fixed in: (no value)



Subject: UTF-8 double encoding with already encoded text
If some fields already contain multi-byte characters, those characters are double UTF-8 encoded in Solr.pm. My suggestion is to encode the value when creating the Field objects, but only if the value is not already encoded. See the attached patch. All tests pass on my computer (Ubuntu Jaunty).
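For illustration, here is a minimal sketch of the double encoding described above. It is not taken from the distribution; the variable names are made up, and it only assumes the core Encode module.

```perl
use strict;
use warnings;
use Encode qw(encode);

# One character, U+00E5 (a with ring above), as a Perl character string.
my $chars = "\x{e5}";

# First encode: the character becomes the two UTF-8 bytes C3 A5.
my $bytes = encode( 'UTF-8', $chars );

# Second encode: each of those bytes is treated as a Latin-1 character
# and encoded again, giving four bytes (C3 83 C2 A5).
my $double = encode( 'UTF-8', $bytes );

# Helper for printing a byte string as hex (hypothetical, for this demo only).
sub hex_dump { join ' ', map { sprintf '%02X', ord } split //, $_[0] }

print hex_dump($bytes),  "\n";    # C3 A5
print hex_dump($double), "\n";    # C3 83 C2 A5
```

The second line of output is what ends up in the request body when a value that is already UTF-8 bytes is passed through encode() once more.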
Subject: webservice-solr-utf8.patch
diff --git a/lib/WebService/Solr.pm b/lib/WebService/Solr.pm
index 3fac899..14f1af1 100644
--- a/lib/WebService/Solr.pm
+++ b/lib/WebService/Solr.pm
@@ -2,7 +2,6 @@ package WebService::Solr;
 
 use Moose;
 
-use Encode qw(encode);
 use URI;
 use LWP::UserAgent;
 use WebService::Solr::Response;
@@ -150,7 +149,7 @@ sub _send_update {
     my $req = HTTP::Request->new(
         POST => $url,
         HTTP::Headers->new( Content_Type => 'text/xml; charset=utf-8' ),
-        '<?xml version="1.0" encoding="UTF-8"?>' . encode('utf8', $xml)
+        '<?xml version="1.0" encoding="UTF-8"?>' . $xml
     );
 
     my $http_response = $self->agent->request($req);
diff --git a/lib/WebService/Solr/Field.pm b/lib/WebService/Solr/Field.pm
index a851e33..5fa6409 100644
--- a/lib/WebService/Solr/Field.pm
+++ b/lib/WebService/Solr/Field.pm
@@ -14,6 +14,9 @@ sub BUILDARGS {
     my ( $self, $name, $value, $opts ) = @_;
     $opts ||= {};
 
+    utf8::encode($value)
+        unless utf8::is_utf8($value) || !defined $value;
+
     return { name => $name, value => $value, %$opts };
 }
 
I see now that the previous patch I uploaded was not correct and failed in some cases. This new patch passes the current tests and encodes only the field values that are represented as UTF-8 in Perl (that is, those for which utf8::is_utf8 returns true).
diff -u -r WebService-Solr-0.06/lib/WebService/Solr/Field.pm WebService-Solr-0.06.new/lib/WebService/Solr/Field.pm
--- WebService-Solr-0.06/lib/WebService/Solr/Field.pm	2009-03-03 02:52:15.000000000 +0100
+++ WebService-Solr-0.06.new/lib/WebService/Solr/Field.pm	2009-06-30 09:03:44.000000000 +0200
@@ -22,7 +22,10 @@
     my $gen = XML::Generator->new( ':std', escape => 'always,even-entities' );
     my %attr = ( $self->boost ? ( boost => $self->boost ) : () );
 
-    return $gen->field( { name => $self->name, %attr }, $self->value );
+    my $value = $self->value;
+    utf8::encode($value) if utf8::is_utf8($value);
+
+    return $gen->field( { name => $self->name, %attr }, $value );
 }
 
 no Moose;
diff -u -r WebService-Solr-0.06/lib/WebService/Solr.pm WebService-Solr-0.06.new/lib/WebService/Solr.pm
--- WebService-Solr-0.06/lib/WebService/Solr.pm	2009-05-07 20:14:51.000000000 +0200
+++ WebService-Solr-0.06.new/lib/WebService/Solr.pm	2009-06-30 08:50:59.000000000 +0200
@@ -150,7 +150,7 @@
     my $req = HTTP::Request->new(
         POST => $url,
         HTTP::Headers->new( Content_Type => 'text/xml; charset=utf-8' ),
-        '<?xml version="1.0" encoding="UTF-8"?>' . encode('utf8', $xml)
+        '<?xml version="1.0" encoding="UTF-8"?>' . $xml
     );
 
     my $http_response = $self->agent->request($req);
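
As a rough standalone sketch of the guard used in the second patch, the snippet below encodes a value only when Perl has it flagged as a character string. The helper name encode_if_characters is hypothetical and does not exist in WebService::Solr; only the utf8::is_utf8 and utf8::encode calls come from the patch.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the check in the patch: encode in place
# only when the UTF-8 flag is set, otherwise return the value untouched.
sub encode_if_characters {
    my ($value) = @_;
    utf8::encode($value) if utf8::is_utf8($value);
    return $value;
}

# A character string (U+263A needs the UTF-8 flag) is encoded to 3 bytes.
my $from_chars = encode_if_characters("\x{263A}");

# A byte string that already holds UTF-8 (C3 A5) is passed through as-is.
my $from_bytes = encode_if_characters("\xC3\xA5");

print length($from_chars), "\n";    # 3
print length($from_bytes), "\n";    # 2
```

One caveat with this kind of check: a character string made up only of Latin-1 characters, such as "\x{e5}", may not carry the UTF-8 flag, in which case it would be passed through unencoded. Whether that matters depends on how callers build their field values.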