Subject: Issue converting a unicode domain to ascii and that same domain from ascii to unicode
Date: Tue, 7 Apr 2015 13:35:38 -0600
To: bug-Net-IDN-Encode [...] rt.cpan.org
From: Matthew Unwin <matthew.unwin [...] returnpath.com>
We have a client who has registered the following two domains (these are in
the Tamil language):
ெசாசியதெ-ெஜனரால.com (xn----oweaj2b6a1bms6ihf1ggb.com)
ெசாசியதெ-ெஜனரால.net (xn----oweaj2b6a1bms6ihf1ggb.net)
These domains fail to convert when using Net::IDN::Encode version 2.201 with
Perl 5.18 on CentOS 6.5.
When I try to convert the two domains above using domain_to_ascii(), I get
the following error:
    begins with General_Category=Mark [V5] at
    .../lib/perl5/x86_64-linux/Net/IDN/Encode.pm line 46.
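
The [V5] in the message presumably refers to the UTS #46 validity rule that a
label must not begin with a combining mark (General_Category=Mark). As a rough
check that does not depend on Net::IDN::Encode, the general category of the
first code point of the label can be inspected. This is only a sketch using
Python's standard unicodedata module; the helper name leading_mark_info is
made up for illustration.

```python
import unicodedata

# Label copied from the report above, without the TLD.
LABEL = "ெசாசியதெ-ெஜனரால"


def leading_mark_info(label):
    """Return (code point, general category, starts_with_mark) for label[0]."""
    first = label[0]
    category = unicodedata.category(first)
    # The categories Mn, Mc and Me together make up General_Category=Mark.
    return "U+%04X" % ord(first), category, category.startswith("M")


if __name__ == "__main__":
    print(leading_mark_info(LABEL))
```

If the reported category starts with "M", the label begins with a combining
mark, which would be consistent with the V5 error above. In Tamil text stored
in logical order a vowel sign normally follows its consonant, so a leading
vowel sign may mean the name was entered in visual order.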
The reverse, domain_to_unicode(), also fails when given the converted (xn--)
values noted above.
I have tried all combinations of the optional parameters (AllowUnassigned,
UseSTD3ASCIIRules, TransitionalProcessing) without success.
I have also tried:
uts46_to_ascii() / uts46_to_unicode() -- fails
idna2003_to_ascii() -- succeeds, results in: xn----oweaj2b6a1bms6ihf1ggb
encode_punycode() [tested without the .com and .net] -- succeeds,
  results in: --oweaj2b6a1bms6ihf1ggb
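
That encode_punycode() succeeds is not surprising if it performs only the raw
Punycode transformation (RFC 3492) and applies no label validation. The sketch
below illustrates that distinction with Python's built-in punycode codec as a
stand-in. It is an assumption that the two behave alike here; this is not the
Perl function, and the helper name punycode_round_trip is illustrative.

```python
# Label copied from the report above, without the TLD.
LABEL = "ெசாசியதெ-ெஜனரால"


def punycode_round_trip(label):
    """Encode label with the raw Punycode codec, then decode it again."""
    encoded = label.encode("punycode").decode("ascii")
    decoded = encoded.encode("ascii").decode("punycode")
    return encoded, decoded


if __name__ == "__main__":
    encoded, decoded = punycode_round_trip(LABEL)
    print("xn--" + encoded)   # ACE form of the label
    print(decoded == LABEL)   # the raw transformation is reversible
```

A successful round trip here says only that the string can be encoded, not
that it passes the UTS #46 or IDNA2008 validity checks.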
I have also run the domain names through a variety of online tools to check
whether they are valid:
http://mct.verisign-grs.com/ -- fails
http://xn--domain.net/ -- succeeds (works in both IDNA2003 and IDNA2008
  modes and prints out code points)
http://punycode.phlymail.de/ -- succeeds (works in both IDNA2003 and
  IDNA2008 modes)
http://www.motobit.com/util/punycode-decoder-encoder.asp -- succeeds (used
  "To IDN")
https://iwantmyname.com/domain-tools/idns/idn-punycode-converter -- succeeds
http://www.punycoder.com/ -- succeeds
https://mothereff.in/punycode -- succeeds
http://idn-encoding.online-domain-tools.com/ -- succeeds
http://www.idnconverter.se/ -- succeeds
So, other than Verisign's online tool, I haven't found another
Unicode-to-Punycode (IDN) converter that has problems converting the two
domains above. This leads me to believe there is a bug somewhere in
Net::IDN::Encode.
Thanks!