Subject: url() returns incorrect results if PATH_INFO contains URL-encoded characters
This code in CGI.pm's url() routine is incorrect:
    my $uri = $rewrite && $request_uri ? $request_uri : $script_name;
    $uri =~ s/\?.*$//;                      # remove query string
    $uri =~ s/$path$// if defined $path;    # remove path
For the example URL <http://site/script.cgi/x%2By>, PATH_INFO will
contain "/x+y", not "/x%2By". PATH_INFO is documented as a decoded
version of the string appearing in the path info section of the URL
(e.g. at <http://hoohoo.ncsa.uiuc.edu/cgi/env.html> and
<http://cgi-spec.golux.com/draft-coar-cgi-v11-03-clean.html#6.1.6>).
However, Apache's REQUEST_URI environment variable contains the URL
without any special decoding done (documentation such as
<http://httpd.apache.org/docs/2.0/mod/mod_setenvif.html> says nothing
about any decoding, and this can be experimentally verified for Apache
1.x at places like
<http://www.imasy.or.jp/~i16/cgi-ml/printenv.cgi/x%2By> and for Apache
2.x at <http://www.uoregon.edu/~jblick/printenv.cgi/x%2By>).
Therefore, if the PATH_INFO part of the URL contains any URL-encoded
characters, the pattern in $uri =~ s/$path$// does not match the
still-encoded REQUEST_URI, so the path info is not removed and url()
returns a value that incorrectly still includes it.
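
The mismatch can be reproduced outside a web server with a minimal sketch like the one below. It hard-codes the two values that the report says the server would supply for the example URL. The helper strip_path_info() is hypothetical (it is not part of CGI.pm) and shows only one possible direction for a fix: percent-decode the URI before comparing, and quote the path with \Q...\E so that characters such as "+" are matched literally rather than as regex quantifiers.

```perl
use strict;
use warnings;

# Values assumed for http://site/script.cgi/x%2By, as described in the report.
my $request_uri = '/script.cgi/x%2By';   # REQUEST_URI: raw, still URL-encoded
my $path        = '/x+y';                # PATH_INFO: already decoded

# The substitutions as quoted from url(): the decoded path is used as a
# pattern against the encoded URI.
(my $uri = $request_uri) =~ s/\?.*$//;   # remove query string
$uri =~ s/$path$// if defined $path;     # tries to remove path, but no match
# $uri is still '/script.cgi/x%2By' at this point.

# Hypothetical helper, not CGI.pm code: decode first, then strip the path
# info as a literal string.
sub strip_path_info {
    my ($raw_uri, $path_info) = @_;
    (my $decoded = $raw_uri) =~ s/\?.*$//;               # remove query string
    $decoded =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;     # percent-decode
    $decoded =~ s/\Q$path_info\E$// if defined $path_info;
    return $decoded;
}

my $fixed = strip_path_info($request_uri, $path);
# $fixed is '/script.cgi'
```

Note that this sketch decodes the whole URI, including the script name portion, which may not be what a real fix should do; it is meant only to show why the encoded and decoded forms have to be reconciled before the path info can be stripped.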