Skip Menu |

This queue is for tickets about the Catalyst-Runtime CPAN distribution.

Report information
The Basics
Id: 101556
Status: resolved
Priority: 0/
Queue: Catalyst-Runtime

People
Owner: Nobody in particular
Requestors: DAKKAR [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 5.90082
Fixed in: 5.90084



Subject: correctly declared non-utf8 parameters in request break Catalyst
Attached test case. HTTP allows sending mulitpart/form-data requests in which each parameter declares its own encoding via a local content-type header. Catalyst ignores that header, and always tries to decode using the application's encoding. From IRC: <ilmari> dakkar: yes, it should respect the content-type (and charset parameter) of each part, and only decode text ones <ilmari> Catalyst::Request needs an analogue to Catalyst::Response's ->encodable_response <ilmari> actually, that needs to happen at the part level for multipart/form-data <ilmari> but HTTP::Body::MultiPart discards the headers :(
Subject: en.pl
#!/usr/bin/env perl use strict; use warnings; package MyApp::Controller::Root { use Moose; BEGIN { extends 'Catalyst::Controller' }; sub index :Path(/) :Args(0) { } }; package MyApp { use Moose; extends 'Catalyst'; __PACKAGE__->setup('-Debug'); }; use Plack::Test; use HTTP::Request::Common; use Data::Dumper; my $test = Plack::Test->create(MyApp->psgi_app); my $req = POST '/', Content_Type => 'form-data', Content => [ foo => [ undef,'', Content_Type=>'application/octet-stream', Content => "\x89foobar", ], ]; print Dumper $req->as_string; my $res = $test->request($req); print Dumper $res;
On Thu Jan 15 07:00:41 2015, DAKKAR wrote: Show quoted text
> Attached test case. > HTTP allows sending mulitpart/form-data requests in which each > parameter declares its own encoding via a local content-type header. > Catalyst ignores that header, and always tries to decode using the > application's encoding. > > From IRC: > > <ilmari> dakkar: yes, it should respect the content-type (and charset > parameter) of each part, and only decode text ones > <ilmari> Catalyst::Request needs an analogue to Catalyst::Response's > ->encodable_response > <ilmari> actually, that needs to happen at the part level for > multipart/form-data > <ilmari> but HTTP::Body::MultiPart discards the headers :(
There's a test case that I thought covered this. https://metacpan.org/source/JJNAPIORK/Catalyst-Runtime-5.90082/t/utf_incoming.t#L360 could you please make the example code more explicit? I am not clear what you are expecting (add some Test::More type asserts or example of expected output.
Show quoted text
> There's a test case that I thought covered this. > > https://metacpan.org/source/JJNAPIORK/Catalyst-Runtime-5.90082/t/utf_incoming.t#L360
It does, for uploads (which are recognised as having a filename in (I believe) content-disposition. If there's no file name, the content is expected to be in the application's encoding, even if it is declared differently. Attached: a patch to utf_incoming.t that adds the failing test
Subject: utf_incoming.patch
diff --git i/t/utf_incoming.t w/t/utf_incoming.t index 76eaa87..05a9bc3 100644 --- i/t/utf_incoming.t +++ w/t/utf_incoming.t @@ -4,7 +4,7 @@ use strict; use Test::More; use HTTP::Request::Common; use HTTP::Message::PSGI (); -use Encode 2.21 'decode_utf8', 'encode_utf8'; +use Encode 2.21 'decode_utf8', 'encode_utf8', 'encode'; use File::Spec; use JSON::MaybeXS; @@ -187,6 +187,12 @@ use JSON::MaybeXS; $c->res->from_psgi_response( ref($c)->to_app->($env)); } + sub echo_arg :Local { + my ($self, $c) = @_; + $c->response->content_type('text/plain'); + $c->response->body($c->req->body_parameters->{arg}); + } + package MyApp; use Catalyst; @@ -427,6 +433,22 @@ SKIP: { is $res->content_charset, 'UTF-8'; } +{ + my $string = 'áæé'; + ok my $req = POST '/root/echo_arg', + Content_Type => 'form-data', + Content => [ + arg => [ + undef, '', + 'Content-Type' =>'text/plain; charset=iso-8859-1', + Content => encode('iso-8859-1',$string), + ], + ]; + + ok my $res = request $req; + is decode_utf8($res->content), $string; +} + ## should we use binmode on filehandles to force the encoding...? ## Not sure what else to do with multipart here, if docs are enough...
There's a patch for this, to update HTTP-Body http://git.shadowcat.co.uk/gitweb/gitweb.cgi?p=catagits/HTTP-Body.git;a=tree;h=refs/heads/jnap/bodyparts;hb=jnap/bodyparts if that is good I will patch catalyst to use this new info so that we can properly decode incoming On Thu Jan 15 10:15:48 2015, DAKKAR wrote: Show quoted text
> > There's a test case that I thought covered this. > > > > https://metacpan.org/source/JJNAPIORK/Catalyst-Runtime- > > 5.90082/t/utf_incoming.t#L360
> > It does, for uploads (which are recognised as having a filename in (I > believe) content-disposition. > > If there's no file name, the content is expected to be in the > application's encoding, even if it is declared differently. > > Attached: a patch to utf_incoming.t that adds the failing test
The patch looks fine, but: - the documentation for part_data says «raw byte size of the content (before encoding into a characterset)» which is backwards. You *decode* bytes using an encoding (An encoding in a pair of functions between bytes and charaters; a charset is a set of characters. Historically some charset came with their own corresponding encoding, but that has not been generally true or useful for more than a decade (what MIME calls "charset" has always been an encoding)) - the tests don't seem to check that headers are actually preserved
Howdy, New release of Catalyst => http://metacpan.org/release/JJNAPIORK/Catalyst-Runtime-5.90084 aimed at this problem. Could all concerned parties please review and if the problem can be considered solve, go ahead and close this ticket. thanks! On Thu Jan 15 07:00:41 2015, DAKKAR wrote: Show quoted text
> Attached test case. > HTTP allows sending mulitpart/form-data requests in which each > parameter declares its own encoding via a local content-type header. > Catalyst ignores that header, and always tries to decode using the > application's encoding. > > From IRC: > > <ilmari> dakkar: yes, it should respect the content-type (and charset > parameter) of each part, and only decode text ones > <ilmari> Catalyst::Request needs an analogue to Catalyst::Response's > ->encodable_response > <ilmari> actually, that needs to happen at the part level for > multipart/form-data > <ilmari> but HTTP::Body::MultiPart discards the headers :(