This queue is for tickets about the Net-IDN-Encode CPAN distribution.

Report information
The Basics
Id: 91059
Status: resolved
Worked: 30 min
Priority: 0/
Queue: Net-IDN-Encode

People
Owner: CFAERBER [...] cpan.org
Requestors: dmuey [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: 2.100



Subject: case not preserved
Howdy and thank you for Net::IDN::Encode!
That case is not preserved may be part of the spec (if so perhaps a note in the POD?) but I noticed that at least one converter does preserve it, e.g. using: I.♥.perl
1. Net::IDN::Encode and charset.org agree when punycoding something:
   a. perl -MNet::IDN::Encode=:all -le 'print domain_to_ascii("I.\x{2665}.perl");'   # I.xn--g6h.perl
   b. http://www.charset.org/punycode.php?decoded=I.%E2%99%A5.perl&encode=Normal+text+to+Punycode
2. Net::IDN::Encode and charset.org do not agree when un-punycoding something, namely the case difference:
   a. perl -MNet::IDN::Encode=:all -C -le 'print domain_to_unicode("I.xn--g6h.perl");'   # i.♥.perl
   b. http://www.charset.org/punycode.php?encoded=I.xn--g6h.perl&decode=Punycode+to+normal+text   (I.♥.perl)
Problem can be seen by noting that in 2.a the I is lowercased but in 2.b the I remains uppercased.
So I guess the question is: 2.b or not 2.b (sorry couldn't resist ;p)?
Hi and thank you for your report. I'm not sure whether the IDNA specs mandate any specific behaviour here. The string “I.♥.perl” actually consists of three different labels: “I”, “♥” and “perl” (the string is supposed to be a domain name, after all). While IDNA more or less mandates (or rather: strongly suggests) that all labels are converted to lower case before conversion, the issue at hand here is the handling of labels that don't need any conversion because they are purely ASCII. I agree that it's surprising that the case for ASCII labels is only preserved in domain_to_ascii but not in domain_to_unicode and I'm going to change that behaviour in the next version.
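The label-by-label behaviour described above can be sketched as follows. This is a rough illustration in Python using only the standard library's punycode codec. It is not Net::IDN::Encode's implementation, and the function names are hypothetical. It assumes that pure-ASCII labels are passed through untouched in both directions, and it deliberately skips the case mapping, normalisation and validation that real IDNA processing applies to non-ASCII labels.

```python
ACE_PREFIX = "xn--"  # marks a Punycode-encoded (ACE) label


def label_to_ascii(label: str) -> str:
    """Encode one label; pure-ASCII labels pass through with case intact."""
    if label.isascii():
        return label
    # Simplification: no case mapping or normalisation before encoding.
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def label_to_unicode(label: str) -> str:
    """Decode one label; labels without the ACE prefix pass through unchanged."""
    if label.lower().startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


def domain_to_ascii_sketch(domain: str) -> str:
    # A domain name is a dot-separated sequence of labels.
    return ".".join(label_to_ascii(part) for part in domain.split("."))


def domain_to_unicode_sketch(domain: str) -> str:
    return ".".join(label_to_unicode(part) for part in domain.split("."))


if __name__ == "__main__":
    print(domain_to_ascii_sketch("I.\u2665.perl"))
    print(domain_to_unicode_sketch("I.xn--g6h.perl"))
```

With this approach the "I" label is never rewritten, so the round trip between "I.♥.perl" and "I.xn--g6h.perl" keeps its case in both directions.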
sounds good, thanks for the update!