Here is the code (a diff against an empty Catalyst app):
===
$ git diff
diff --git a/MyApp/lib/MyApp.pm b/MyApp/lib/MyApp.pm
index 2d9c4cb..0cbb730 100644
--- a/MyApp/lib/MyApp.pm
+++ b/MyApp/lib/MyApp.pm
@@ -37,6 +37,7 @@ our $VERSION = '0.01';
__PACKAGE__->config(
name => 'MyApp',
+ encoding => undef,
# Disable deprecated behavior needed by old applications
disable_component_resolution_regex_fallback => 1,
enable_catalyst_header => 1, # Send X-Catalyst header
diff --git a/MyApp/lib/MyApp/Controller/Root.pm b/MyApp/lib/MyApp/Controller/Root.pm
index dd7e033..1607e05 100644
--- a/MyApp/lib/MyApp/Controller/Root.pm
+++ b/MyApp/lib/MyApp/Controller/Root.pm
@@ -32,7 +32,7 @@ sub index :Path :Args(0) {
my ( $self, $c ) = @_;
# Hello World
- $c->response->body( $c->welcome_message );
+ $c->response->body( $Catalyst::VERSION . " " . $c->uri_for("index", { test => "\xEF\xF0\xE8\xE2\xE5\xF2" } ) );
}
=head2 default
===
The bytes "\xEF\xF0\xE8\xE2\xE5\xF2" are the word "hello" in Russian ("привет") in Windows-1251 encoding.
What the index action prints for different Catalyst versions:
5.90102
http://localhost:3000/index?test=%C3%AF%C3%B0%C3%A8%C3%A2%C3%A5%C3%B2
5.90085
http://localhost:3000/index?test=%C3%AF%C3%B0%C3%A8%C3%A2%C3%A5%C3%B2
5.90079
http://localhost:3000/index?test=%EF%F0%E8%E2%E5%F2
For 5.90079 the bytes look exactly like the original. For 5.90085 (and 5.90102) the original bytes are treated as text in Latin-1 encoding and then encoded as UTF-8.
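The difference between the two outputs can be reproduced outside Catalyst. This is only a sketch using the core Encode module; percent_encode_bytes is a hypothetical helper written for this illustration and is not Catalyst's code:

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# Hypothetical helper (not part of Catalyst): percent-encode every byte.
# No "safe" characters are handled, because all six test bytes are >= 0x80.
sub percent_encode_bytes {
    my ($bytes) = @_;
    return join '', map { sprintf('%%%02X', ord($_)) } split //, $bytes;
}

my $cp1251 = "\xEF\xF0\xE8\xE2\xE5\xF2";    # "привет" in Windows-1251

# Like the 5.90079 output: the bytes are escaped as they are.
my $as_is = percent_encode_bytes($cp1251);

# Like the 5.90085 output: each byte is taken as a Latin-1 character
# and UTF-8 encoded before escaping.
my $reencoded = percent_encode_bytes( encode_utf8($cp1251) );

print "$as_is\n";        # %EF%F0%E8%E2%E5%F2
print "$reencoded\n";    # %C3%AF%C3%B0%C3%A8%C3%A2%C3%A5%C3%B2
```

The second line matches what 5.90085 prints, which is consistent with an unconditional encode_utf8 call on the parameter.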
Note 1: test=%EF%F0%E8%E2%E5%F2 seems to be disallowed by the modern RFC (but allowed by older ones), i.e. the modern RFC allows only UTF-8 encoded text.
Note 2: the 5.90079 implementation of uri_for indeed *was* broken (due to its use of is_utf8).
However, I still think that Catalyst should produce %EF%F0%E8%E2%E5%F2 for apps with encoding => undef, since that is how text is treated in such an application: it is used "as is", not as a Perl text string. 5.90085 breaks compatibility and consistency (with encoding => undef, text is not re-encoded from Latin-1 to UTF-8 anywhere else).
5.90079 used is_utf8 and was broken; however, the right way is to rely only on the encoding => undef setting and ignore the is_utf8 flag.
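A minimal sketch of what I mean, assuming the decision depends only on the configured encoding. param_to_bytes and escape_bytes are hypothetical names for this illustration, not Catalyst internals:

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: turn a query value into bytes by looking only at
# the configured encoding. utf8::is_utf8 is never consulted.
sub param_to_bytes {
    my ($value, $encoding) = @_;

    # encoding => undef: the app works with raw bytes, pass them through.
    return $value unless defined $encoding;

    # Otherwise the value is Perl text; encode it with the app's encoding.
    return Encode::encode($encoding, $value);
}

# Hypothetical helper: percent-encode everything except unreserved chars.
sub escape_bytes {
    my ($bytes) = @_;
    $bytes =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $bytes;
}

# App with encoding => undef: Windows-1251 bytes stay untouched.
my $raw_result = escape_bytes( param_to_bytes("\xEF\xF0\xE8\xE2\xE5\xF2", undef) );

# App with encoding => 'UTF-8': the same word given as Perl text.
my $text        = "\x{43F}\x{440}\x{438}\x{432}\x{435}\x{442}";    # "привет"
my $utf8_result = escape_bytes( param_to_bytes($text, 'UTF-8') );

print "$raw_result\n";
print "$utf8_result\n";
```

With this approach an application that sets encoding => undef gets its bytes back unchanged, and an application with an encoding set gets its text encoded exactly once.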
On Thu Oct 29 23:47:00 2015, JJNAPIORK wrote:
> Hi,
>
> I'd like to help, can you verify this is still a problem (there's been
> a ton of little tweaks) and if so can you give me more information
> such as an exact something that used to work and now doesn't.
> Thanks!
>
> On Thu May 14 04:03:32 2015, vsespb wrote:
> > Similar to this https://rt.cpan.org/Public/Bug/Display.html?id=103063
> >
> > but now $c->uri_for assumes that we work in UTF-8 world and encodes
> > all data in UTF-8:
> >
> > $param = encode_utf8($param);
> >
> > even with optional encoding=>undef and do_not_decode_query=>1