Skip Menu |

This queue is for tickets about the URI CPAN distribution.

Report information
The Basics
Id: 34309
Status: resolved
Priority: 0/
Queue: URI

People
Owner: Nobody in particular
Requestors: andrejohn.mas [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 1.35
Fixed in: (no value)



Subject: URI->host() not returning correct IPv6 host component
If the URI is of the form: http://[::1]/ then the returned host component by URI->host() is: [::1] this is incorrect, since the correct return value should be: ::1 Doing so otherwise will break when passed to other methods expecting a correctly formed IPv6 address.
On Thu Mar 20 22:53:45 2008, ajmas wrote: Show quoted text
> If the URI is of the form: > > http://[::1]/ > > then the returned host component by URI->host() is: > > [::1] > > this is incorrect, since the correct return value should be: > > ::1 > > Doing so otherwise will break when passed to other methods expecting a > correctly formed IPv6 > address.
TECHNICALLY the host is [::1] if you do a strict reading of the grammar host = hostname | IPv4address | IPv6reference ipv6reference = "[" IPv6address "]" but that's not very useful. I ran into this problem as well. Attached is a fix. Note that the tests had been set up specifically to test that host() did not strip the brackets, technically its a regression.
From 8d003881fb78e01d442d6ed1c74d43d97d543c3e Mon Sep 17 00:00:00 2001 From: Michael G. Schwern <schwern@pobox.com> Date: Wed, 21 Oct 2009 18:46:21 -0700 Subject: [PATCH] [rt.cpan.org 34309] Strip brackets off IPv6 hosts. Convert rfc2732 to use Test::More so it has some failure diagnostics. --- URI/_server.pm | 4 +++- t/rfc2732.t | 35 ++++++++++++++++------------------- 2 files changed, 19 insertions(+), 20 deletions(-) diff --git a/URI/_server.pm b/URI/_server.pm index 10059f0..beeb9ca 100644 --- a/URI/_server.pm +++ b/URI/_server.pm @@ -44,7 +44,8 @@ sub host } return undef unless defined $old; $old =~ s/.*@//; - $old =~ s/:\d+$//; + $old =~ s/:\d+$//; # remove the port + $old =~ s{^\[(.*)\]$}{$1}; # remove brackets around IPv6 (RFC 3986 3.2.2) return uri_unescape($old); } @@ -79,6 +80,7 @@ sub host_port return undef unless defined $old; $old =~ s/.*@//; # zap userinfo $old =~ s/:$//; # empty port does not could + $old =~ s{^\[(.*)\]}{$1}; # strip brackets around IPv6 address $old .= ":" . $self->port unless $old =~ /:/; $old; } diff --git a/t/rfc2732.t b/t/rfc2732.t index 7e462a3..54e07d4 100644 --- a/t/rfc2732.t +++ b/t/rfc2732.t @@ -1,39 +1,36 @@ #!perl -w -print "1..9\n"; +# Test URIs containing IPv6 addresses use strict; +use Test::More tests => 11; + use URI; my $uri = URI->new("http://[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]:80/index.html"); -print "not " unless $uri->as_string eq "http://[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]:80/index.html"; -print "ok 1\n"; +is $uri->as_string, "http://[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]:80/index.html"; -print "not " unless $uri->host eq "[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]"; -print "ok 2\n"; +is $uri->host, "FEDC:BA98:7654:3210:FEDC:BA98:7654:3210"; -print "not " unless $uri->host_port eq "[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]:80"; -print "ok 3\n"; +is $uri->host_port, "FEDC:BA98:7654:3210:FEDC:BA98:7654:3210:80"; -print "not " unless $uri->port eq "80"; -print "ok 4\n"; +is $uri->port, "80"; $uri->host("host"); -print "not " unless $uri->as_string eq "http://host:80/index.html"; -print "ok 5\n"; +is $uri->as_string, "http://host:80/index.html"; $uri = URI->new("ftp://ftp:@[3ffe:2a00:100:7031::1]"); -print "not " unless $uri->as_string eq "ftp://ftp:@[3ffe:2a00:100:7031::1]"; -print "ok 6\n"; +is $uri->as_string, "ftp://ftp:@[3ffe:2a00:100:7031::1]"; + +is $uri->port, "21"; +ok !$uri->_port; -print "not " unless $uri->port eq "21" && !$uri->_port; -print "ok 7\n"; +is $uri->host("ftp"), "3ffe:2a00:100:7031::1"; -print "not " unless $uri->host("ftp") eq "[3ffe:2a00:100:7031::1]"; -print "ok 8\n"; +is $uri, "ftp://ftp:\@ftp"; -print "not " unless $uri eq "ftp://ftp:\@ftp"; -print "ok 9\n"; +$uri = URI->new("http://[::1]"); +is $uri->host, "::1"; __END__ -- 1.6.5
On Thu Mar 20 22:53:45 2008, ajmas wrote: Show quoted text
> If the URI is of the form: > > http://[::1]/ > > then the returned host component by URI->host() is: > > [::1] > > this is incorrect, since the correct return value should be: > > ::1 > > Doing so otherwise will break when passed to other methods expecting a > correctly formed IPv6 address.
If we change this we also need to change the rules for how "assignment" to $uri->host works. We still want: $uri->host($uri->host) to always be a noop. For historical reasons $uri->host($new) has always picked up the port setting from the string passed in. This is also the documented behaviour. If URI->new("http://[::1:80]")->host returns "::1:80" and ->host_port returns "::1:80:80" then the reverse needs to happen when setting values. Currently setting via host and host_port have behaved exactly the same. I'm in favor of changing $uri->host to return a plain IPv6 address. To allow this we would not allow the port to be picked up when passing an IPv6 address as arg. Since IPv6 addresses always contains at least 2 colons the rule could simply be that the port is only picked up if $new matches =~ /^[^:]*:\d+\z/ or /]:\d\z/. Both $uri->host("[::1]") and $uri->host("::1") should have the same effect and we should allow setting both post and port with $uri- Show quoted text
>host("[::1]:80").
I think it's best to leave $uri->host_port alone and just document that this value isn't always the same as join(":", $uri->host, $uri->port). I would also make setting the values via $uri- Show quoted text
>host_port() stricter to make sure $uri->host_port($uri->host_port) never changes the
$uri->host value.
I'm mostly happy after this patch on top of the previous one.
From ce3c0b1e9e73828fb3052f4107381eb68b1d6ecd Mon Sep 17 00:00:00 2001 From: Gisle Aas <gisle@aas.no> Date: Sat, 24 Oct 2009 00:03:58 +0200 Subject: [PATCH] Added more tests for setting IPv6 addresses using the host method Revert to not stripping the brackets from the return value of $uri->host_port since that gives surprising behavious when trying to set the value. --- URI/_server.pm | 11 +++++++---- t/rfc2732.t | 26 +++++++++++++++++++++----- 2 files changed, 28 insertions(+), 9 deletions(-) diff --git a/URI/_server.pm b/URI/_server.pm index beeb9ca..efed014 100644 --- a/URI/_server.pm +++ b/URI/_server.pm @@ -38,7 +38,11 @@ sub host $new = "" unless defined $new; if (length $new) { $new =~ s/[@]/%40/g; # protect @ - $port = $1 if $new =~ s/(:\d+)$//; + if ($new =~ /^[^:]*:\d*\z/ || $new =~ /]:\d*\z/) { + $new =~ s/(:\d*)\z// || die "Assert"; + $port = $1; + } + $new = "[$new]" if $new =~ /:/ && $new !~ /^\[/; # IPv6 address } $self->authority("$ui$new$port"); } @@ -79,9 +83,8 @@ sub host_port $self->host(shift) if @_; return undef unless defined $old; $old =~ s/.*@//; # zap userinfo - $old =~ s/:$//; # empty port does not could - $old =~ s{^\[(.*)\]}{$1}; # strip brackets around IPv6 address - $old .= ":" . $self->port unless $old =~ /:/; + $old =~ s/:$//; # empty port should be treated the same a no port + $old .= ":" . $self->port unless $old =~ /:\d+$/; $old; } diff --git a/t/rfc2732.t b/t/rfc2732.t index 54e07d4..0787e30 100644 --- a/t/rfc2732.t +++ b/t/rfc2732.t @@ -3,22 +3,38 @@ # Test URIs containing IPv6 addresses use strict; -use Test::More tests => 11; +use Test::More tests => 19; use URI; my $uri = URI->new("http://[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]:80/index.html"); is $uri->as_string, "http://[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]:80/index.html"; - is $uri->host, "FEDC:BA98:7654:3210:FEDC:BA98:7654:3210"; - -is $uri->host_port, "FEDC:BA98:7654:3210:FEDC:BA98:7654:3210:80"; - +is $uri->host_port, "[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]:80"; is $uri->port, "80"; +$uri->port(undef); +is $uri->as_string, "http://[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]/index.html"; +is $uri->host_port, "[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]:80"; +$uri->port(80); + $uri->host("host"); is $uri->as_string, "http://host:80/index.html"; +$uri->host("FEDC:BA98:7654:3210:FEDC:BA98:7654:3210"); +is $uri->as_string, "http://[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]:80/index.html"; +$uri->host_port("[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]:88"); +is $uri->as_string, "http://[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]:88/index.html"; +$uri->host_port("[::1]:80"); +is $uri->as_string, "http://[::1]:80/index.html"; +$uri->host("::1:80"); +is $uri->as_string, "http://[::1:80]:80/index.html"; +$uri->host("[::1:80]"); +is $uri->as_string, "http://[::1:80]:80/index.html"; +$uri->host("[::1]:88"); +is $uri->as_string, "http://[::1]:88/index.html"; + + $uri = URI->new("ftp://ftp:@[3ffe:2a00:100:7031::1]"); is $uri->as_string, "ftp://ftp:@[3ffe:2a00:100:7031::1]"; -- 1.6.2.95.g934f7
Both these patches are now commited to the repo.
From: andrejohn.mas [...] gmail.com
On Sun Oct 25 06:48:38 2009, GAAS wrote: Show quoted text
> On Thu Mar 20 22:53:45 2008, ajmas wrote:
> > If the URI is of the form: > > > > http://[::1]/ > > > > then the returned host component by URI->host() is: > > > > [::1] > > > > this is incorrect, since the correct return value should be: > > > > ::1 > > > > Doing so otherwise will break when passed to other methods expecting
> a
> > correctly formed IPv6 address.
> > If we change this we also need to change the rules for how > "assignment" to $uri->host > works. We still want: > > $uri->host($uri->host) > > to always be a noop. For historical reasons $uri->host($new) has > always picked up the port > setting from the string passed in. This is also the documented > behaviour. > > If URI->new("http://[::1:80]")->host returns "::1:80" and ->host_port > returns "::1:80:80" > then the reverse needs to happen when setting values. Currently > setting via host and > host_port have behaved exactly the same.
Creating a new URI should be: URI->new("http://[::1]:80") I am curious as to the use of host_port? Though from initial glance it makes sense for host_port to return [::1]:80, while host returns ::1.
On Sun Oct 25 21:29:00 2009, ajmas wrote: Show quoted text
> > If URI->new("http://[::1:80]")->host returns "::1:80" and ->host_port > > returns "::1:80:80" > > then the reverse needs to happen when setting values. Currently > > setting via host and > > host_port have behaved exactly the same.
> > Creating a new URI should be: > > URI->new("http://[::1]:80")
It depends on what address you wanted to capture :-) In the example above the value I was after was actually [::1:80]:80 just to demonstrate that we have a problem. Show quoted text
> I am curious as to the use of host_port?
The reason we have a host_port method is that LWP has various places where it wants to use that as a key into various databases, for instance it uses (host_port, realm) as a key for holding on to username:password combinations.