Skip Menu |

This queue is for tickets about the Net-IDN-Encode CPAN distribution.

Report information
The Basics
Id: 127056
Status: patched
Priority: 0/
Queue: Net-IDN-Encode

People
Owner: CFAERBER [...] cpan.org
Requestors: shin [...] kojima.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



CC: Shin Kojima <shin [...] kojima.org>
Subject: [PATCH 1/2] Fix domain_to_ascii AllowUnassigned param
Date: Fri, 7 Sep 2018 11:58:02 +0900
To: bug-Net-IDN-Encode [...] rt.cpan.org
From: Shin Kojima <shin [...] kojima.org>
While using old perl (e.g. v5.8.8), `Net::IDN::Encode::domain_to_ascii` can't handle Emoji domains even if `AllowUnassigned` param is on. print domain_to_ascii("\x{1f985}.example", AllowUnassigned => 1); Above code gave me an error as following: disallowed character U+1F985 But when I add some ascii characters along with the emoji, it convert as I expected: print domain_to_ascii("AAAAA\x{1f985}.example", AllowUnassigned => 1); Result: xn--aaaaa-ki74d.example The original code has an extra closing bracket, and it seems something is going very wrong here for me. m/^(\P{IsDisallowed}}|\P{Assigned})*$/ ^ here https://en.wikipedia.org/wiki/Emoji_domain Signed-off-by: Shin Kojima <shin@kojima.org> --- lib/Net/IDN/UTS46.pm | 7 +++---- t/domain_to_ascii.t | 4 +++- 2 files changed, 6 insertions(+), 5 deletions(-) diff --git a/lib/Net/IDN/UTS46.pm b/lib/Net/IDN/UTS46.pm index 23a363e..11f5702 100644 --- a/lib/Net/IDN/UTS46.pm +++ b/lib/Net/IDN/UTS46.pm @@ -62,10 +62,9 @@ sub _process { # 1. Map # - disallowed # - if($param{'AllowUnassigned'}) { - $label =~ m/^(\P{IsDisallowed}}|\P{Assigned})*$/ and croak sprintf('disallowed character U+%04X', ord($1)); - } else { - $label =~ m/(\p{IsDisallowed})/ and croak sprintf('disallowed character U+%04X', ord($1)); + $label =~ m/(\p{IsDisallowed})/ and croak sprintf('disallowed character U+%04X', ord($1)); + + unless($param{'AllowUnassigned'}) { $label =~ m/(\P{Assigned})/ and croak sprintf('unassigned character U+%04X (in this version of perl)', ord($1)); } diff --git a/t/domain_to_ascii.t b/t/domain_to_ascii.t index 90a025f..747bddd 100755 --- a/t/domain_to_ascii.t +++ b/t/domain_to_ascii.t @@ -5,7 +5,7 @@ BEGIN { binmode STDOUT, ':utf8'; binmode STDERR, ':utf8'; } use Net::IDN::Encode qw(:all); -use Test::More tests => 1 + 12; +use Test::More tests => 1 + 13; use Test::NoWarnings; use Net::IDN::Encode qw(:all); @@ -16,6 +16,8 @@ is(eval{domain_to_ascii('XN--MLLER-KVA')} || $@, 'XN--MLLER-KVA', 'single upperc is(eval{domain_to_ascii('www.jürg.xn--mller-kva.com', )} || $@, 'www.xn--jrg-hoa.xn--mller-kva.com', 'mixed utf8/ace/ascii'); is(eval{domain_to_ascii('www.a.b。c.d。com', )} || $@, 'www.a.b.c.d.com', 'mixed dots'); +is(eval{domain_to_ascii("www.\x{1F985}.example", AllowUnassigned => 1)} || $@, 'www.xn--4s9h.example', 'Unicode 9.0 emoji'); + is(eval{domain_to_ascii('www.ä ö ü ß.example', 'UseSTD3ASCIIRules' => 0)}, 'www.xn-- -7kav3ivb.example', 'blank (without STD3 rules) (to_unicode)') or diag $@; is(eval{domain_to_ascii('www.ä ö ü ß.example', 'UseSTD3ASCIIRules' => 1)}, undef, 'blank (with STD3 rules) (to_unicode)') or diag $@; is(eval{domain_to_ascii('www.xn-- -7kav3ivb.example', 'UseSTD3ASCIIRules' => 0)}, 'www.xn-- -7kav3ivb.example', 'blank (without STD3 rules) (to_unicode pass-through)') or diag $@; -- 2.18.0
Thank you for reporting. I've just uploaded a development version that fixes this and will release it as 2.401 once the CPAN testers results are fine.