This queue is for tickets about the WWW-Facebook-API CPAN distribution.

Report information
The Basics
Id: 32500
Status: resolved
Priority: 0/
Queue: WWW-Facebook-API

People
Owner: unobe [...] cpan.org
Requestors: ryan [...] innerfence.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: UTF-8 input
Date: Tue, 22 Jan 2008 00:16:23 -0800
To: bug-www-facebook-api [...] rt.cpan.org
From: Ryan D Johnson <ryan [...] innerfence.com>
Hi, David.

Looks like things need to get run through Encode::encode_utf8 before processing (md5_hex fails hard, then Facebook blows up). Simply running the input values through encode_utf8 makes everything work swimmingly.

The specific API I'm using is photos->upload, and the caption field is where I'm hitting the issue.

But it seems to me that it wouldn't harm the params to run them all through this, right? Perl strings are utf-8 internally, and all strings have a utf-8 representation, so this just converts them to octets that the lower-level processing is all happy with.

Let me know if you need any more info. I was able to work around this by manually encoding the specific field I was having trouble with. Seems like it would be simple to add this to maybe WWW::Facebook::API::_format_and_check_params.

Thanks for writing this really useful module!

/rdj
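The failure described in this report can be reproduced outside the module. The following is a minimal sketch, not code from WWW::Facebook::API; the caption value is made up for illustration. It assumes only the documented behaviour of Digest::MD5, whose functions operate on bytes and croak when handed a string containing a character above code point 255.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use Encode qw(encode_utf8);

# Hypothetical caption containing a character above U+00FF
# (HIRAGANA LETTER KA), standing in for the photos->upload caption.
my $caption = "photo \x{304b}";

# md5_hex works on bytes, so a wide character makes it die.
my $digest = eval { md5_hex($caption) };
my $error  = $@;
print "raw string failed: $error" if !defined $digest;

# Encoding to UTF-8 octets first gives md5_hex plain bytes to hash.
my $octets = encode_utf8($caption);
print 'digest of encoded caption: ', md5_hex($octets), "\n";
```

Encoding at the point where the signature is computed, rather than in each caller, is what the report suggests with _format_and_check_params.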
Subject: Re: [rt.cpan.org #32500] UTF-8 input
Date: Tue, 29 Jan 2008 23:17:57 -0800
To: Ryan D Johnson via RT <bug-WWW-Facebook-API [...] rt.cpan.org>
From: David Romano <unobe [...] cpan.org>
Hi Ryan,

Ryan D Johnson via RT wrote on Tue, Jan 22, 2008 at 12:16:26AM PST:
> Looks like things need to get run through Encode::encode_utf8 before
> processing (md5_hex fails hard, then facebook blows up). Simply running
> the input values through encode_utf8 makes everything work swimmingly.
>
> The specific API I'm using is photos->upload, and the caption field is
> where I'm hitting the issue.
>
> But it seems to me that it wouldn't harm the params to run them all
> through this, right? Perl strings are utf-8 internally, and all strings
> have a utf-8 representation, so this just converts them to octets that
> the lower-level processing is all happy with.
>
> Let me know if you need any more info. I was able to work around this by
> manually encoding the specific field I was having trouble with. Seems
> like it would be simple to add this to maybe
> WWW::Facebook::API::_format_and_check_params.

Thank you for reporting this, and sorry it's taken a bit to get back to you! The latest version in the googlecode repo now has this addition, and I'll make a release within a week if I don't hear any more feedback about the changes. Also, do you want your e-mail included in the credits for the module?

- David

--
'One of the difficulties in Christian work is this question--"What do you expect to do?" You do not know what you are going to do; the only thing you know is that God knows what He is doing.'
-- Oswald Chambers, My Utmost for His Highest, January 2
Subject: Re: [rt.cpan.org #32500] UTF-8 input
Date: Wed, 30 Jan 2008 01:37:44 -0800
To: bug-WWW-Facebook-API [...] rt.cpan.org
From: Ryan D Johnson <ryan [...] innerfence.com>
unobe@cpan.org via RT wrote:
> <URL: http://rt.cpan.org/Ticket/Display.html?id=32500 >
>
> Hi Ryan,
> Ryan D Johnson via RT wrote on Tue, Jan 22, 2008 at 12:16:26AM PST:
>
>> Looks like things need to get run through Encode::encode_utf8 before
>> processing (md5_hex fails hard, then facebook blows up). Simply running
>> the input values through encode_utf8 makes everything work swimmingly.
>>
> Thank you for reporting this, and sorry it's taken a bit to get back to
> you! The latest version in the googlecode repo now has this addition, and
> I'll make a release within a week if I don't hear any more feedback
> about the changes. Also, do you want your e-mail included in the
> credits for the module?
>
> - David
>

Hi, David.

I'd be happy to have my email in the credits for the module.

I see two issues with your fix:

(1) The encoding is only applied if the array flattening comes into play.

(2) If the input is already correctly encoded, encoding is applied anyway, effectively "double encoding" it, resulting in garbage.

Thanks for pointing me at the svn repository. I've attached a diff, including adding two new tests ensuring proper encoding of unicode data and leaving alone of "already encoded" utf-8 bytes.

Thanks for getting back to me.

/rdj

Index: t/api.t
===================================================================
--- t/api.t (revision 209)
+++ t/api.t (working copy)
@@ -4,8 +4,9 @@
 # $Author$
 # ex: set ts=8 sw=4 et
 #########################################################################
-use Test::More tests => 34;
+use Test::More tests => 36;
 use WWW::Facebook::API;
+use Encode qw( encode_utf8 );
 use strict;
 use warnings;

@@ -109,6 +110,16 @@
     $api->call('method', %$args );
     is $ids, '3,4,5,6', 'Array refs flattened';

+    $args = { unichar => "\x{304b}" };
+    my $unichar = q{};
+    $WWW::Facebook::API::{_post_request} = sub { $unichar = $_[1]->{'unichar'}; q{} };
+    $api->call('method', %$args );
+    is $unichar, encode_utf8( "\x{304b}" ), 'Unicode param encoded for transmission';
+
+    $args = { unichar => encode_utf8( "\x{304b}" ) };
+    $unichar = q{};
+    $api->call('method', %$args );
+    is $unichar, encode_utf8( "\x{304b}" ), 'Raw UTF-8 param left alone for transmission';
 }

 sub redirect_fh {
Index: lib/WWW/Facebook/API.pm
===================================================================
--- lib/WWW/Facebook/API.pm (revision 209)
+++ lib/WWW/Facebook/API.pm (working copy)
@@ -15,7 +15,7 @@
 use LWP::UserAgent;
 use Time::HiRes qw(time);
 use Digest::MD5 qw(md5_hex);
-use Encode qw(encode_utf8);
+use Encode qw(encode_utf8 is_utf8);
 use CGI;
 use CGI::Util qw(escape);

@@ -415,8 +415,15 @@

     # reformat arrays and add each param to digest
     for ( keys %{$params} ) {
-        next unless ref $params->{$_} eq 'ARRAY';
-        $params->{$_} = encode_utf8(join q{,}, @{ $params->{$_} });
+        if ( ref $params->{$_} eq 'ARRAY' )
+        {
+            $params->{$_} = join q{,}, @{ $params->{$_} };
+        }
+
+        if ( is_utf8( $params->{$_} ) )
+        {
+            $params->{$_} = encode_utf8( $params->{$_} );
+        }
     }

     croak '_format_and_check_params must be called in list context!'
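The "double encoding" concern and the is_utf8 guard can be illustrated in isolation. This is a sketch under stated assumptions rather than the module's code: encode_if_flagged is a hypothetical helper name that mirrors the conditional in the patch.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8 is_utf8);

# Hypothetical helper mirroring the guard in the patch: encode only
# strings that Perl has flagged as character strings.
sub encode_if_flagged {
    my ($value) = @_;
    return is_utf8($value) ? encode_utf8($value) : $value;
}

my $chars  = "\x{304b}";            # one character, UTF8 flag on
my $octets = encode_utf8($chars);   # three bytes, UTF8 flag off

# Encoding the octets a second time treats each byte as a Latin-1
# character, producing a longer, garbled byte string.
my $double = encode_utf8($octets);
printf "octets: %d bytes, double-encoded: %d bytes\n",
    length($octets), length($double);

# With the guard, both inputs end up as the same three octets.
print "guard yields identical octets for both inputs\n"
    if encode_if_flagged($chars) eq encode_if_flagged($octets);
```

Note that is_utf8 reports Perl's internal flag and does not inspect the content, so the guard is a heuristic: it distinguishes flagged character strings from byte strings, not valid UTF-8 from other byte encodings.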
Subject: Re: [rt.cpan.org #32500] UTF-8 input
Date: Wed, 30 Jan 2008 10:00:44 -0800
To: Ryan D Johnson via RT <bug-WWW-Facebook-API [...] rt.cpan.org>
From: David Romano <unobe [...] cpan.org>
Hi Ryan,

Ryan D Johnson via RT wrote on Wed, Jan 30, 2008 at 01:37:39AM PST:
> I see two issues with your fix:
>
> (1) The encoding is only applied if the array flattening comes into play.

Wow, that's a big oversight. I originally had it when md5_hex was called, but then (2) would still apply.

> (2) If the input is already correctly encoded, encoding is applied
> anyway, effectively "double encoding" it, resulting in garbage.

Totally forgot about that: thanks for catching it.

> Thanks for pointing me at the svn repository. I've attached a diff,
> including adding two new tests ensuring proper encoding of unicode data
> and leaving alone of "already encoded" utf-8 bytes.

Nice patch! I've applied it, and I'll be making a release next week.

Thanks again,
- David

--
"Your brother's sorrows and perplexities are an absolute confusion to you. We imagine we understand where the other person is, until God gives us a dose of the plague of our own hearts."
-- Oswald Chambers, My Utmost for His Highest, January 13