Subject: uri_for(_action) doesn't percent-encode arguments correctly
Date: Mon, 21 Dec 2015 15:32:26 +0100
To: bug-Catalyst-Runtime [...] rt.cpan.org
From: Ulrich Klauer <ulrich [...] chirlu.de>
Good day,
Catalyst 5.90103 here, on Perl 5.14.2.
I'm using a Catalyst action with these attributes: PathPart('tag')
Args(1). The single argument is a tag name chosen by some user,
essentially free-form text. When I need the URI for such a tag page, I
call uri_for_action like this:
uri_for_action('/tag', [ $some_tag_name ]);
Most of the time, this works fine; slashes, non-ASCII characters etc.
are percent-encoded. However, if $some_tag_name contains a percent
sign (e.g. "100% ready"), the percent sign itself is not encoded, and
the generated URI looks like this:
http://www.example.org/tag/100%%20ready
Many browsers accept this and will try to follow the link, but nginx
rejects the request (400 Bad Request).
It gets worse if the percent sign happens to be followed by two
characters that can be interpreted as hexadecimal digits; e.g., in
some languages you would write "%50" instead of "50%". uri_for_action
will generate the URI .../tag/%50, and Catalyst will then interpret a
request for this URI as one for .../tag/P (which may or may not exist,
but in any case wasn't the intended target).
Of course, the correct URIs would be .../tag/100%25%20ready and
.../tag/%2550, respectively.
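
To illustrate the decoding side, here is a minimal sketch. It uses
uri_unescape from URI::Escape as a stand-in for the percent-decoding a
server applies to a path segment; that this matches Catalyst's internal
decoding is an assumption, and the variable names are made up for the
example.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_unescape);

# A literal "%" followed by two hex digits is read as an escape sequence,
# so the unencoded tag "%50" comes back as a different string.
my $wrong = uri_unescape('%50');             # decodes to "P"

# With the percent sign itself encoded as %25, the tag round-trips.
my $right = uri_unescape('%2550');           # decodes to "%50"
my $ready = uri_unescape('100%25%20ready');  # decodes to "100% ready"

print "$wrong\n$right\n$ready\n";
```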
There are more characters that are reserved in URIs, but not encoded
by uri_for, such as the colon (:), the ampersand (&) and the single
quote ('). I'm not sure whether they can cause problems as well.
Couldn't Catalyst simply run the whole argument through
URI::Escape::uri_escape_utf8? That should correctly encode all
reserved characters, as well as any non-ASCII characters.
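
As a rough sketch of what I mean (not a patch against Catalyst), the
following runs a few tag names through uri_escape_utf8 with its default
settings. The helper name escape_tag is hypothetical and exists only for
this example. How the result would be wired into uri_for is not shown.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape_utf8);

# Hypothetical helper: encode one free-form tag name as a path segment.
sub escape_tag {
    my ($tag) = @_;
    return uri_escape_utf8($tag);
}

# "caf\x{e9}" is "cafe" with an e-acute, written as an escape to keep
# this source file pure ASCII.
for my $tag ('100% ready', '%50', 'a/b', 'x:y&z', "caf\x{e9}") {
    print escape_tag($tag), "\n";
}
```

With the two problem cases from above, this yields 100%25%20ready and
%2550, which are the URIs I would expect.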
Ulrich