This queue is for tickets about the Geo-Coder-Bing CPAN distribution.

Report information
The Basics
Id: 57149
Status: resolved
Priority: 0/
Queue: Geo-Coder-Bing

People
Owner: Nobody in particular
Requestors: SREZIC [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: 0.06
Fixed in: (no value)



Subject: Encoding problems
The attached script shows the following output on my system (FreeBSD 8.0 with either perl 5.8.8 or perl 5.12.0):

    $ perl5.12.0 /tmp/bing.t
    not ok 1
    #   Failed test at /tmp/bing.t line 9.
    #          got: 'Schmöckwitz'
    #     expected: 'Schmöckwitz'
    SV = PV(0x287c8588) at 0x2889e680
      REFCNT = 1
      FLAGS = (POK,pPOK,UTF8)
      PV = 0x2897f800 "Schm\303\203\302\266ckwitz"\0 [UTF8 "Schm\x{c3}\x{b6}ckwitz"]
      CUR = 14
      LEN = 16
    1..1
    # Looks like you failed 1 test of 1.

So it looks like the result is somehow double-encoded utf8.

Regards,
    Slaven
Subject: bing.t
#!/usr/bin/perl -w

use Test::More 'no_plan';
use Geo::Coder::Bing;
use Devel::Peek;

{
    my $location = Geo::Coder::Bing->new->geocode('Anglerweg, Schmöckwitz');
    is($location->{Address}->{Locality}, 'Schmöckwitz')
        or Dump $location->{Address}->{Locality};
}

__END__
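
The byte sequence in the Devel::Peek dump above can be reproduced offline. The following is a minimal sketch, not part of the original report, that uses only the core Encode module. The variable names are illustrative. It assumes the failure mode is "UTF-8 bytes misread as Latin-1 and then stored as UTF-8", which is what the dump suggests.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Raw UTF-8 bytes for "Schmöckwitz", as they would arrive over the wire.
my $bytes = "Schm\xc3\xb6ckwitz";

# Correct handling: decode the bytes exactly once into a character string.
my $chars = decode('UTF-8', $bytes);

# Faulty handling (assumed): every byte is taken to be a Latin-1 character.
# Serialising that string as UTF-8 doubles the bytes of the umlaut.
my $misread = decode('ISO-8859-1', $bytes);
my $double  = encode('UTF-8', $misread);

printf "decoded once:   %d characters\n", length $chars;
printf "double-encoded: %d bytes\n",      length $double;
```

The doubled string is 14 bytes long, which matches the CUR = 14 value in the dump, and its bytes are the same "\303\203\302\266" sequence shown in the PV field.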
On Sun May 02 16:20:08 2010, SREZIC wrote:
> The attached script shows the following output on my system (FreeBSD 8.0
> with either perl 5.8.8 or perl 5.12.0):
>
> $ perl5.12.0 /tmp/bing.t
> not ok 1
> # Failed test at /tmp/bing.t line 9.
> # got: 'Schmöckwitz'
> # expected: 'Schmöckwitz'
> SV = PV(0x287c8588) at 0x2889e680
> REFCNT = 1
> FLAGS = (POK,pPOK,UTF8)
> PV = 0x2897f800 "Schm\303\203\302\266ckwitz"\0 [UTF8
> "Schm\x{c3}\x{b6}ckwitz"]
> CUR = 14
> LEN = 16
> 1..1
> # Looks like you failed 1 test of 1.
>
> So it looks like the result is somehow double-encoded utf8.
>
> Regards,
> Slaven
Thanks for reporting this. I won't have a chance to look at this in the near future, but I'll gladly accept a patch.
CC: SREZIC [...] cpan.org
Subject: Re: [rt.cpan.org #57149] Encoding problems
Date: Wed, 05 May 2010 08:11:53 +0200
To: bug-Geo-Coder-Bing [...] rt.cpan.org
From: Slaven Rezic <slaven [...] rezic.de>
"gray via RT" <bug-Geo-Coder-Bing@rt.cpan.org> writes:
> <URL: https://rt.cpan.org/Ticket/Display.html?id=57149 >
>
> Thanks for reporting this. I won't have a chance to look at this in the
> near future, but I'll gladly accept a patch.
Attached is a diff against the current git version. The problem is that you are using "decoded_content" on the response, but decoded_content only performs charset decoding for text content, not for application content, and the content type here is application/json.

The patch also includes an author test. I did not make it an "official" test, because it makes a network connection and one never knows whether users always have a working connection.

Regards,
    Slaven
diff --git a/lib/Geo/Coder/Bing.pm b/lib/Geo/Coder/Bing.pm
index 1f82b11..f102000 100644
--- a/lib/Geo/Coder/Bing.pm
+++ b/lib/Geo/Coder/Bing.pm
@@ -84,6 +84,9 @@ sub geocode {
     my $content = $res->decoded_content;
     return unless $content;
 
+    $content = Encode::decode('utf-8', $content)
+        if $res->content_type =~ m{application/json;\scharset=utf-8}i;
+
     # Workaround invalid data.
     $content =~ s[ \}\.d $ ][}]x;
 
diff --git a/xt/author/utf8.t b/xt/author/utf8.t
new file mode 100755
index 0000000..3911927
--- /dev/null
+++ b/xt/author/utf8.t
@@ -0,0 +1,29 @@
+#!/usr/bin/perl -w
+# -*- perl -*-
+
+use strict;
+use warnings;
+use Data::Dumper;
+use Devel::Peek;
+use Encode;
+use Geo::Coder::Bing;
+use Test::More tests => 3;
+
+my $VERBOSE = 0;
+
+my $geocoder = Geo::Coder::Bing->new;
+
+my $address = "Rübländerstraße, Berlin, Germany";
+Dump $address if $VERBOSE;
+my $location = $geocoder->geocode($address);
+ok $location, "Supplied string without utf8 flag";
+warn Dumper $location if $VERBOSE;
+
+my $utf8_address = decode("iso-8859-1", $address); # force utf8 flag
+is $utf8_address, $address;
+Dump $utf8_address if $VERBOSE;
+$location = $geocoder->geocode($utf8_address);
+ok $location, "Supplied string with utf8 flag";
+warn Dumper $location if $VERBOSE;
+
+__END__
-- 
Slaven Rezic - slaven <at> rezic <dot> de

    tkrevdiff - graphical display of diffs between revisions (RCS, CVS or SVN)
    http://ptktools.sourceforge.net/#tkrevdiff
On Wed May 05 02:13:24 2010, slaven@rezic.de wrote:
> Attached is a diff against the current git version. The problem is that you
> are using "decoded_content" on the response, but decoded_content only
> performs charset decoding for text content, not for application content, and
> the content type here is application/json. The patch also includes an
> author test. I did not make it an "official" test, because it makes a
> network connection and one never knows whether users always have a
> working connection.
Thanks for tracking down the cause. I didn't use your patch, though: I think it was easier to change the content type than to decode the content explicitly, and the test wasn't relevant to this issue (it appeared to be similar to what I already have in xt/live.t). Thanks again.
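
For reference, the explicit-decoding approach discussed in this ticket can be tried without a network connection. The sketch below is illustrative only: it is neither the submitted patch nor the fix that was applied. It assumes HTTP::Response (from the HTTP-Message distribution on CPAN) is installed, builds a response by hand with made-up JSON content, and decodes the raw body only when the Content-Type header declares UTF-8.

```perl
use strict;
use warnings;
use Encode ();
use HTTP::Response;

# Hand-built response standing in for the geocoder's reply (hypothetical data).
my $res = HTTP::Response->new(200, 'OK');
$res->header('Content-Type' => 'application/json; charset=utf-8');
$res->content(qq({"Locality":"Schm\xc3\xb6ckwitz"}));

# content() returns the undecoded body. content_type_charset() returns the
# upper-cased charset parameter of the Content-Type header, if there is one.
my $charset = $res->content_type_charset || '';
my $body    = $res->content;
$body = Encode::decode('UTF-8', $body) if $charset eq 'UTF-8';

printf "body is %d characters long\n", length $body;
```

Reading the charset from the header and decoding exactly once avoids the double decoding that would occur if an already decoded string were decoded a second time.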