On Wed Jun 15 01:54:20 2016, SPROUT wrote:
> On Wed Jun 15 01:13:36 2016, SPROUT wrote:
> > $ perl5.24.0 -Ilib -le 'use URI; print new_abs URI "#anchor",
> > "data:text/html,foo"'
> > data:#anchor
> >
> > Expected output:
> > data:text/html,foo#anchor
>
> Here is a patch. It is based on observing how web browsers behave.
Wait. This patch doesn’t fully solve the problem.
Despite what I said in my original post (some of which was wrong, because I had been reading the source of several modules and mixed them up), HTTP::Response::base does this:
    # if $base is undef here, the return value is effectively
    # just a copy of $self->request->uri.
    return $HTTP::URI_CLASS->new_abs($base, $req->uri);
which translates into URI->new_abs(undef, "data:,foo"). But that gives ‘data:’, which is not just a copy of "data:,foo".
URI::new_abs is as follows:
    sub new_abs
    {
        my($class, $uri, $base) = @_;
        $uri = $class->new($uri, $base);
        $uri->abs($base);
    }
So it does URI->new(undef, "data:,foo") in this case, which is interpreted as ->new(undef, "data"), since new uses its second argument only for the scheme.
That translates into data:, since you cannot have an empty data: URL. (It doesn’t make sense.)
So it seems URI->new("data:")->abs("data:,foo") needs to treat "data:" as undef and return "data:,foo". This new patch follows that approach.
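For comparison, the same intent can be sketched as a guard on the caller's side. This is only an illustration, not the attached patch: new_abs_or_base is a hypothetical helper that does not exist in URI, and it assumes that an undefined or empty relative reference should simply yield a copy of the base.

```perl
use strict;
use warnings;

# Hypothetical caller-side guard; not part of URI and not the attached patch.
# $class is any URI-like class that provides new() and new_abs().
sub new_abs_or_base {
    my ($class, $rel, $base) = @_;

    # No usable relative reference: hand back a fresh copy of the base,
    # which is what the comment in HTTP::Response::base expects.
    return $class->new($base) if !defined $rel || $rel eq '';

    # Otherwise defer to the normal resolution logic.
    return $class->new_abs($rel, $base);
}

# Example call (assumes the URI module is installed):
#   require URI;
#   print new_abs_or_base('URI', undef, 'data:,foo'), "\n";
```

A guard like this only helps callers that know to use it, which is why handling the case inside abs itself, as the patch does, seems preferable.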
I can’t say whether it is the best solution (after all, even web browsers don’t behave the same way when a data: URL contains <a href=foo>), but it works for all the cases I can think of that do make sense.