
This queue is for tickets about the URI CPAN distribution.

Report information
The Basics
Id: 24934
Status: new
Priority: 0/
Queue: URI

People
Owner: Nobody in particular
Requestors: paulo.matos [...] fct.unl.pt
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: (no value)
Fixed in: (no value)



CC: paulo.matos [...] fct.unl.pt
Subject: URI mailto not correctly encoded according to rfc2368
Date: Tue, 13 Feb 2007 21:15:15 +0000 (WET)
To: bug-URI [...] rt.cpan.org
From: Paulo Matos <paulo.matos [...] fct.unl.pt>
RFC 2368 (http://www.ietf.org/rfc/rfc2368), section 2, says:

  "(...) 8-bit characters in mailto URLs are forbidden. MIME encoded words (as defined in [RFC2047]) are permitted in header values, but not for any part of a "body" hname."

When using headers with non-ASCII characters, e.g.:

  To: João Góis <joao.gois@example.com>

URI behaves like this:

  # perl -MURI -e '$u=URI->new("João Góis <joao.gois\@example.com>", "mailto"); print $u->as_string."\n";'
  mailto:Jo%E3o%20G%F3is%20%3Cjoao.gois@example.com%3E

This is "URL-encoded" (aka "%-encoded"), which is correct for HTML interpretation, but according to RFC 2368 the value should first be MIME encoded and only then, if needed, URL-encoded.

Why? Because otherwise you lose the charset information! %-encoding alone will probably work only when the sender and the receiver happen to assume the same charset.

I also noticed that "," is not encoded as %2C, but this seems to be only a suggestion, not something mandatory.

Regards,
-- 
Paulo Matos
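
To illustrate the encoding order requested above (MIME-encode first, then %-encode), here is a minimal sketch. It is not how URI currently behaves and not a proposed patch: the helper names mime_word, pct and mailto_for are hypothetical, UTF-8 with the RFC 2047 "B" encoding is an arbitrary choice, and the address part is assumed to be plain ASCII.

```perl
use strict;
use warnings;
use Encode qw(encode);
use MIME::Base64 qw(encode_base64);

# Hypothetical helper: wrap a display name in an RFC 2047 "B" encoded-word.
# The charset (UTF-8) travels inside the encoded-word, so it is not lost.
sub mime_word {
    my ($text) = @_;
    return $text unless $text =~ /[^\x00-\x7F]/;    # pure ASCII: leave as-is
    my $b64 = encode_base64(encode('UTF-8', $text), '');
    return "=?UTF-8?B?$b64?=";
}

# Hypothetical helper: %-encode everything except unreserved characters and "@".
# Assumes its input is already ASCII (true after mime_word, assumed for the address).
sub pct {
    my ($s) = @_;
    $s =~ s/([^A-Za-z0-9\-._~@])/sprintf('%%%02X', ord($1))/ge;
    return $s;
}

# Hypothetical helper: build a mailto URL from a display name and an address.
sub mailto_for {
    my ($name, $addr) = @_;
    return 'mailto:' . pct(mime_word($name) . " <$addr>");
}

# "Jo\x{e3}o G\x{f3}is" is the name from the report, written with escapes so
# the script does not depend on the encoding of the source file.
print mailto_for("Jo\x{e3}o G\x{f3}is", 'joao.gois@example.com'), "\n";
```

The resulting URL contains only 7-bit characters, and a receiver that undoes the %-encoding gets back an encoded-word that names its own charset, rather than raw bytes of unknown encoding.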