Subject: url() incorrectly unescapes request_uri
My CGI script gets a request such as:
GET /~rrt/Software/DarkGlass/Why%20DarkGlass%3F HTTP/1.1
Note the "%3F" at the end of the URL: it's an escaped question mark.
My script then asks for the URL using "url()", and gets:
"Software/DarkGlass/Why DarkGlass"
The question mark has been stripped off, because after unescaping in the
line:
my $request_uri = unescape($self->request_uri) || '';
the code then strips the query off:
my $uri = $rewrite && $request_uri ? $request_uri : $script_name;
$uri =~ s/\?.*$//s;    # remove query string
But this is wrong: a genuine query string is introduced by a literal,
unescaped question mark, so an escaped "%3F" is part of the path and must
not be treated as the start of a query. The "unescape" routine is itself
undocumented, being part of CGI::Util, which is for internal use only.
I'm not an expert, so I don't know whether some unescaping is needed
here, but at the very least question marks should not be unescaped at
that point. However, since the question mark that introduces a query
string must be a literal question mark character in the original request
URI, it seems to me that it should be safe to remove the query string
before unescaping the URI.
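To illustrate why the order matters, here is a minimal standalone sketch. It does not use CGI.pm itself: simple_unescape is a hypothetical, simplified stand-in for CGI::Util's unescape (it decodes %XX bytes only), and the "?page=2" query is added purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for CGI::Util::unescape: decodes %XX bytes only.
sub simple_unescape {
    my ($s) = @_;
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $s;
}

# Request URI from the report, with an illustrative query string appended.
my $raw = '/~rrt/Software/DarkGlass/Why%20DarkGlass%3F?page=2';

# Current order: unescape first, then strip everything from the first "?".
(my $unescape_first = simple_unescape($raw)) =~ s/\?.*$//s;

# Proposed order: strip the query string first, then unescape what is left.
(my $strip_first = $raw) =~ s/\?.*$//s;
$strip_first = simple_unescape($strip_first);

print "unescape first: $unescape_first\n";   # escaped "?" is lost
print "strip first:    $strip_first\n";      # escaped "?" survives
```

With the first order, the unescaped "%3F" is indistinguishable from the real query delimiter, so the path loses its trailing question mark; with the second, only the literal "?" is treated as the delimiter.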
The code in question has not been changed since the start of CGI.pm's
git history, so I can't find any relevant commit logs.
If I'm correct, I'm still not exactly sure what a minimum patch would
be. The obvious thing to do would be to move the call to unescape to
after the stripping of the query string, but that would mean applying
unescape to the value of $script_name in some circumstances, which might
or might not be OK; I don't know. But I guess $script_name shouldn't
contain a query string, so stripping the query string should only be
needed when $uri is set to $request_uri. It should therefore be possible
to rewrite the relevant lines from:
my $request_uri = unescape($self->request_uri) || '';
my $query_str = $self->query_string;
my $rewrite_in_use = $request_uri && $request_uri !~ /^\Q$script_name/;
my $uri = $rewrite && $request_uri ? $request_uri : $script_name;
$uri =~ s/\?.*$//s;    # remove query string
to:
my $request_uri = $self->request_uri || '';
my $query_str = $self->query_string;
$request_uri =~ s/\?.*$//s;    # remove query string
$request_uri = unescape($request_uri);
my $rewrite_in_use = $request_uri && $request_uri !~ /^\Q$script_name/;
my $uri = $rewrite && $request_uri ? $request_uri : $script_name;
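As a rough check of the proposed ordering, the sketch below wraps the same steps in a standalone function. Both choose_uri and simple_unescape are hypothetical helpers written for this illustration, not part of CGI.pm, and the sample URIs and script path are made up. It assumes, as above, that $script_name never contains a query string and so is returned untouched.

```perl
use strict;
use warnings;

# Hypothetical stand-in for CGI::Util::unescape: decodes %XX bytes only.
sub simple_unescape {
    my ($s) = @_;
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $s;
}

# Hypothetical helper mirroring the proposed lines; not CGI.pm code.
sub choose_uri {
    my ($raw_request_uri, $script_name, $rewrite) = @_;
    my $request_uri = defined $raw_request_uri ? $raw_request_uri : '';
    $request_uri =~ s/\?.*$//s;                    # remove query string first
    $request_uri = simple_unescape($request_uri);  # then unescape the path
    return $rewrite && $request_uri ? $request_uri : $script_name;
}

# Rewrite in use: the escaped "?" in the path is preserved.
print choose_uri('/a/Why%3F?x=1', '/cgi-bin/s.pl', 1), "\n";
# Rewrite not requested: $script_name is returned unchanged.
print choose_uri('/a/Why%3F?x=1', '/cgi-bin/s.pl', 0), "\n";
# No request URI available: falls back to $script_name.
print choose_uri(undef, '/cgi-bin/s.pl', 1), "\n";
```

Because the query string is stripped only from $request_uri, $script_name is never passed through unescape, which sidesteps the open question of whether unescaping it would be safe.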