This queue is for tickets about the URI CPAN distribution.

Report information
The Basics
Id: 52707
Status: new
Priority: 0/
Queue: URI

People
Owner: Nobody in particular
Requestors: scop [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: 1.51
Fixed in: (no value)



Subject: URI::host may return tainted data when called for the first time
I'm trying to track down an obscure taint mode issue, probably in URI. In short, for a URI object that had an explicit port in the string it was constructed from, the first call to $uri->host() sometimes (!) returns tainted data, while later calls return untainted data. When it happens, it happens even if the string the URI was constructed from was not tainted.

I've tracked it down to URI::_server::host. The line that causes it is:

    $old =~ s/:\d+$//; # remove the port

If I replace the \d with [[:digit:]], [0-9], \w or . , the problem does not occur.

Longer explanation: This occurs with the W3C Markup Validator when validating some HTML 5 documents and when it is configured to POST the HTML 5 document to an external HTML 5 validator using a URI that has a port. For some retrieved documents, the POST fails with "500 Insecure dependency in connect while running with -T switch".

I am very confused about this, because it happens only for *some* validated documents, not all. I don't see how the contents of the validated documents (which are internally POSTed to the HTML 5 validator URL) would be relevant. They're all POSTed to the same URL/host/port.

I have also failed to create a small reproducer. However, using this version of the markup validator: http://dev.w3.org/cvsweb/validator/httpd/cgi-bin/check?rev=1.749&content-type=text/x-cvsweb-markup , configured to POST the HTML 5 markup to, for example, http://qa-dev.w3.org:8888/html5/ or http://validator.nu:80/ , and validating the document at http://htmlex.met.cz/ , the problem occurs on two different systems I have access to.

As said, it does not happen with all documents; for example, when validating the content at http://validator.nu/ instead of http://htmlex.met.cz/ it does not happen. It also does not happen if the HTML 5 validator the content is POSTed to is configured as http://validator.nu/ (without the :80 in the URL).
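For illustration, here is a minimal sketch of the port-stripping substitution with [0-9] in place of \d, one of the replacements reported above to avoid the problem. The helper name strip_port is hypothetical and is not part of URI; only the substitution itself is taken from the report. This is not a proposed patch, just a way to see the alternative pattern in isolation.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of URI): strips a trailing ":port" from a
# "host[:port]" string, using [0-9] where the reported line uses \d.
sub strip_port {
    my ($hostport) = @_;
    (my $host = $hostport) =~ s/:[0-9]+$//;    # remove the port
    return $host;
}

print strip_port("validator.nu:80"), "\n";       # prints "validator.nu"
print strip_port("qa-dev.w3.org:8888"), "\n";    # prints "qa-dev.w3.org"
print strip_port("validator.nu"), "\n";          # no port: string is unchanged
```

For ASCII input such as these host names, [0-9] and \d match the same characters, so the result of the substitution is the same; the report only observes that the taint behaviour differs.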
I have found a couple of different workarounds. I can't tell exactly why any of them work around the issue, but they do:

a) Using $url->query("out=xml") instead of $url->query_form(out => "xml") in html5_validate() (around line 1167) in the validator code (see above dev.w3.org URL).

b) Placing a throwaway $uri->host() call after the query_form in the validator code in html5_validate() (again, see above dev.w3.org URL).

The string $CFG->{External}->{HTML5} from which the URI object is created is not tainted.
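A diagnostic along the following lines may help when checking whether the first and second host() calls differ in taintedness. It is only a sketch: the helper name host_taint_report is hypothetical, the URL is one of the examples from this report, and since no small reproducer has been found, this script on its own is not expected to show the problem. It needs to be run with perl -T for the taint check to be meaningful (note that -T changes @INC, so URI must be installed in a standard location).

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);
use URI;

# Hypothetical diagnostic helper: calls host() $times times on a URI object
# and records whether each returned value is flagged as tainted.
sub host_taint_report {
    my ($uri, $times) = @_;
    my @report;
    for my $n (1 .. $times) {
        my $host = $uri->host;
        push @report, { call => $n, host => $host, tainted => tainted($host) ? 1 : 0 };
    }
    return @report;
}

my $uri = URI->new("http://validator.nu:80/");
$uri->query_form(out => "xml");
# Workaround (b) would place a throwaway call here:  $uri->host;

for my $r (host_taint_report($uri, 2)) {
    printf "call %d: host=%s tainted=%s\n",
        $r->{call}, $r->{host}, $r->{tainted} ? "yes" : "no";
}
```

Dropping the same report into html5_validate() right after the query_form call, where the problem has actually been observed, is likely to be more informative than running it standalone.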