
This queue is for tickets about the Geo-Coder-Google CPAN distribution.

Report information
The Basics
Id: 122822
Status: resolved
Priority: 0/
Queue: Geo-Coder-Google

People
Owner: Nobody in particular
Requestors: beasley [...] web.de
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: 0.18
Fixed in: 0.18_01



Subject: 400 Bad Request when using utf8 in location.
I am using Geo::Coder::Google 0.18, which is the latest release as of now, on Perl 5.20.

The following program results in `Google Maps API returned error: 400 Bad Request at geo_bug.pl`:

    use v5.20;
    use strict;
    use utf8;
    use Geo::Coder::Google;

    my $geocoder = Geo::Coder::Google->new( apiver => 3, key => 'AIzaSyDhg_MRCJvwFBYP56k65uf_HVC2iFjjWmU' );
    $geocoder->geocode( location => 'Kielstraße 23, 70123, Germany' );

When the `key` parameter is a string with the utf8 marker set, the library creates requests containing invalid characters as soon as the location to be geocoded contains Unicode characters.

Removing lines 77-79 in Google/V3.pm fixes the issue.

    77 if (Encode::is_utf8($location)) {
    78     $location = Encode::encode_utf8($location);
    79 }

I am not sure, though, which other use cases these lines were written for.

What follows is an explanation of what I think goes wrong when the above lines are *not* removed.

Geo/Coder/Google/V3.pm lines 77-79 UTF-8 encode the location parameter if it has the utf8 marker. As a result, the query parameters passed to the URI object later on can be in mixed encoding states (some already encoded, some not yet). That in turn makes URI double-encode the already encoded parameters.

If I interpret https://metacpan.org/pod/URI#BUGS correctly, URI depends on the utf8 marker on strings to determine whether to encode the string as UTF-8 prior to percent-encoding or not. The behavior I observe is that when calling $uri->query_form(%hash) and *any* of the values in the hash have the utf8 marker set, then *all* hash values are treated as character strings and UTF-8 encoded prior to percent-encoding, including values that are already UTF-8 bytes. If *none* of them have the utf8 marker set, then *all* hash values are percent-encoded as they are.

The following program demonstrates that behavior. The erroneous param should be encoded as "x%C3%9Fz", but is encoded as "x%C3%83%C2%9Fz".
    use v5.20;
    use strict;
    use Encode;
    use URI;
    use Devel::Peek;

    my $uri = URI->new("https://maps.googleapis.com/maps/api/geocode/json");

    # URI gives the correct result when using the top pair or the bottom pair.
    # It breaks when mixing decoded with non decoded params as below.
    my %params = (
        #erroneous => decode('utf8', 'xßz'),
        dummy => decode('utf8', 'abc'), # sets the utf8 marker

        erroneous => 'xßz',
        #dummy => 'abc',
    );
    $uri->query_form(%params);

    Dump( %params );
    say 'url: ' . $uri->as_string;
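
The byte-level arithmetic behind "x%C3%83%C2%9Fz" can be reproduced without the URI module. The sketch below is only an illustration of the mechanism described in the report, not URI's actual code: `percent_encode_bytes` is a hypothetical helper, and the assumption is that the query parts get joined into one string before being UTF-8 encoded and percent-encoded. Joining a string that carries the utf8 marker with one that does not upgrades the unmarked bytes, so C3 and 9F are treated as the characters U+00C3 and U+009F and encoded a second time.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# Hypothetical helper for illustration (not part of URI):
# percent-encode every byte outside a small safe set.
sub percent_encode_bytes {
    my ($bytes) = @_;
    $bytes =~ s/([^A-Za-z0-9\-._~=&])/sprintf '%%%02X', ord $1/ge;
    return $bytes;
}

# "\xC3\x9F" is the UTF-8 byte sequence for U+00DF (ß); no utf8 marker is set.
my $already_encoded = "x\xC3\x9Fz";

# An ASCII value that carries the utf8 marker.
my $flagged = 'abc';
utf8::upgrade($flagged);

# Joining a marked string with an unmarked one upgrades the whole result:
# the bytes C3 and 9F now count as the characters U+00C3 and U+009F.
my $joined = join '&', "dummy=$flagged", "erroneous=$already_encoded";

# Encoding the joined string as UTF-8 encodes those two bytes a second time.
my $mixed = percent_encode_bytes(encode_utf8($joined));

# With no marked string in the mix, the bytes are percent-encoded once.
my $unmixed = percent_encode_bytes(
    join '&', 'dummy=abc', "erroneous=$already_encoded"
);

print "mixed:   $mixed\n";    # dummy=abc&erroneous=x%C3%83%C2%9Fz
print "unmixed: $unmixed\n";  # dummy=abc&erroneous=x%C3%9Fz
```

The only difference between the two runs is the utf8 marker on the ASCII value `abc`, which mirrors the role of the marked `key` string in the original failing program.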
I have shipped 0.18_01 to CPAN. Can you test this?

Related commits on GitHub: https://github.com/arcanez/geo-coder-google/commit/69093d129f0d1c0217f41157fd6cee2d55ccaed7

    POSTing upload for Geo-Coder-Google-0.18_01.tar.gz to https://pause.perl.org/pause/authenquery
    PAUSE add message sent ok [200]

(In reply to the report from pzim@posteo.de, Wed Aug 16 13:17:04 2017.)
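
The change made in 0.18_01 is not reproduced in this ticket. Independent of it, the report suggests that mixing marked and unmarked values is what triggers the double encoding, so one general way to avoid it is to bring every query value into the same state before the query string is built. The following is a minimal sketch under that assumption, not the library's fix: `to_utf8_bytes` and `percent_encode_bytes` are hypothetical helpers, and values without the utf8 marker are assumed to hold UTF-8 bytes already.

```perl
use strict;
use warnings;
use Encode qw(decode encode_utf8);

# Hypothetical helper: return a copy of the parameters in which every value
# is a UTF-8 byte string without the utf8 marker.
sub to_utf8_bytes {
    my (%params) = @_;
    for my $value (values %params) {
        # Assumption: marked values hold characters, unmarked values
        # already hold UTF-8 bytes.
        $value = encode_utf8($value) if utf8::is_utf8($value);
    }
    return %params;
}

# Hypothetical helper: percent-encode every byte outside a small safe set.
sub percent_encode_bytes {
    my ($bytes) = @_;
    $bytes =~ s/([^A-Za-z0-9\-._~=&])/sprintf '%%%02X', ord $1/ge;
    return $bytes;
}

# An ASCII value carrying the utf8 marker, like the key in the report.
my $key = 'abc';
utf8::upgrade($key);

my %params = to_utf8_bytes(
    key      => $key,
    location => decode('UTF-8', "Kielstra\xC3\x9Fe 23"),  # character string
);

# All values are plain bytes now, so joining them upgrades nothing.
my $query = percent_encode_bytes(
    join '&', map { "$_=$params{$_}" } sort keys %params
);

print "$query\n";  # key=abc&location=Kielstra%C3%9Fe%2023
```

Relying on the utf8 marker to guess whether a value is decoded is fragile in general; the sketch uses it only because the code quoted from lines 77-79 follows the same convention.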
From: beasley [...] web.de
Tested. Broken in 0.18, works in 0.18_01. Thanks!

(In reply to arcanez, Thu 17 Aug 2017, 13:37:51.)