Bug #30940 for CGI: Needs Documentation: CGI::Util's escape and unescape methods

Fri Nov 23 12:02:29 2007 dsteinbrunner [...] pobox.com - Ticket created

Subject: Document CGI::Util's escape and unescape methods

CGI::Util's escape and unescape methods are useful outside of just the internals of CGI.pm. Because of this, it would be nice if these methods had exposure in the documentation so that they would be more accessible to developers.

Fri Jul 24 21:32:10 2009 MARKSTOS [...] cpan.org - Subject changed from 'Document CGI::Util's escape and unescape methods' to 'Needs Documentation: CGI::Util's escape and unescape methods'

Fri Jul 24 21:32:50 2009 MARKSTOS [...] cpan.org - Correspondence added

On Fri Nov 23 12:02:29 2007, dsteinbrunner@pobox.com wrote:

> CGI::Util's escape and unescape methods are useful outside of just the
> internals of CGI.pm.
> Because of this, it would be nice if these methods had exposure in the
> documentation so that
> they would be more accessible to developers.

I agree. To move this bug report forward, next we'll need a patch which includes the requested documentation.

Fri Jul 24 21:32:50 2009 The RT System itself - Status changed from 'new' to 'open'

Sat Aug 15 16:50:44 2009 MARKSTOS [...] cpan.org - Correspondence added

In theory, I think this documentation should be able to be taken almost verbatim from URI::Escape: http://search.cpan.org/~gaas/URI-1.40/URI/Escape.pm

In reality, I think there are a number of subtle but important differences between the approaches, and I'm no encoding expert to understand them. I tried porting a few of the tests from URI::Escape to CGI.pm to see if they passed. Here are the first few:

    use Test::More 'no_plan';
    use CGI::Util qw(escape unescape);
    use utf8;
    is(escape(undef), undef);
    is(escape("|abcå"), "%7Cabc%E5");
    is(unescape("%7Cabc%e5"),"|abcå");

The first one, the "undef" case, always passes. The second and third fail if the "use utf8;" line is commented out, and the second test fails even with "use utf8;" turned on. I don't know if that's a bug, or an intended difference in how they are implemented.
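For readers following the thread, the two candidate results for "å" can be reproduced without either module. This is a minimal sketch, not the CGI::Util or URI::Escape implementation: `pct_encode_bytes` is a hypothetical helper that percent-encodes every byte outside the RFC 3986 unreserved set. It only shows that "%E5" is the Latin-1 byte of "å" and "%C3%A5" is its UTF-8 byte sequence, which may be the difference the failing second test is running into.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper for illustration only: percent-encode each byte
# that is not an RFC 3986 unreserved character (A-Z a-z 0-9 - . _ ~).
sub pct_encode_bytes {
    my ($bytes) = @_;
    $bytes =~ s/([^A-Za-z0-9\-._~])/sprintf("%%%02X", ord($1))/ge;
    return $bytes;
}

# "|abcå" written as a character string, independent of source encoding.
my $text = "|abc\x{E5}";

# Same characters, two different byte representations.
my $as_latin1 = pct_encode_bytes(encode("ISO-8859-1", $text));
my $as_utf8   = pct_encode_bytes(encode("UTF-8", $text));

print "$as_latin1\n";    # %7Cabc%E5
print "$as_utf8\n";      # %7Cabc%C3%A5
```

Which of the two a given escape function produces depends on whether it encodes the string to UTF-8 before escaping, so a test expecting "%E5" would fail against an implementation that does.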

Sat Jul 03 04:14:31 2010 peter [...] morch.com - Ticket #59077: Ticket created

perlfaq9 has this question: "How do I decode or create those %-encodings on the web?", and the answer begins:

> If you are writing a CGI script, you should be using the CGI.pm module
> that comes with perl, or some other equivalent module.

And yes, CGI imports escape and unescape from CGI::Util, but perldoc CGI::Util says:

> DESCRIPTION: no public subroutines

So I guess using CGI::escape and CGI::unescape amounts to using undocumented features of CGI. In light of that, I find the FAQ answer misleading. At least I have no idea how to "decode or create those %-encodings on the web" from reading perldoc CGI.

I therefore suggest documenting escape and unescape in perldoc CGI. I've created a simple patch for that (attached and at http://pastebin.com/bPRq0Nsj). I hope this can be included in a future CGI release. In any event, the FAQ answer is not currently useful as far as CGI goes.

This post was made after consulting perl.perlfaq.workers. See e.g. http://groups.google.com/group/perl.perlfaq.workers/browse_thread/thread/8f6d04a0ae6f5f4f/

(Also, perldoc CGI has:

> Address bug reports and comments to: lstein@cshl.org.

If that is no longer appropriate, perhaps CGI.pm should change on that account too. Feel free to let me know if you want me to open a separate ticket for that.)

Subject: CGI-document-un_escape.patch

--- orig.CGI.pm 2010-01-29 15:41:54.000000000 +0100
+++ CGI.pm 2010-06-28 08:56:49.000000000 +0200
@@ -255,6 +255,7 @@
     ':html' => [qw/:html2 :html3 :html4 :netscape/],
     ':standard' => [qw/:html2 :html3 :html4 :form :cgi/],
     ':push' => [qw/multipart_init multipart_start multipart_end multipart_final/],
+    ':escape' => [qw/escape unescape/],
     ':all' => [qw/:html2 :html3 :netscape :form :cgi :internal :html4/]
     );
 
@@ -4878,6 +4879,10 @@
 Import all HTML-generating shortcuts (i.e. 'html2', 'html3', 'html4' and 'netscape')
 
+=item B<:escape>
+
+Import the escape and unescape methods
+
 =item B<:standard>
 
 Import "standard" features, 'html2', 'html3', 'html4', 'form' and 'cgi'.
@@ -5698,6 +5703,20 @@
     ),
     hr;
 
+=head2 ESCAPING/ENCODING AND UNESCAPING/DECODING URL STRINGS
+
+URL parameters are encoded in what is also known as Percent-encoding
+L<http://en.wikipedia.org/wiki/Percent-encoding>
+
+    use CGI ':escape';
+    my $string = "Hello World";
+    my $escaped = escape($string);
+    print $escaped, "\n";
+    die "How could unescape(escape($string)) ne $string"
+        unless unescape($escaped) eq $string;
+
+prints out "Hello%20World"
+
 =head2 PROVIDING ARGUMENTS TO HTML SHORTCUTS
 
 The HTML methods will accept zero, one or multiple arguments. If you

Tue Jul 06 04:21:32 2010 mark [...] summersault.com - Ticket #59077: Correspondence added

Subject:	Re: [rt.cpan.org #59077] escape / unescape docs
Date:	Mon, 5 Jul 2010 09:30:08 -0400
To:	bug-CGI [...] rt.cpan.org
From:	Mark Stosberg <mark [...] summersault.com>

Thanks for the contribution! I agree that "escape" and "unescape" should be documented. I would like to see the docs in the main CGI.pm documentation along with nearly everything else. I thought there was already an existing bug report about this. The bug tracker appears to be down for a bit so I can't check on it now.

Mark

Tue Jul 06 04:21:34 2010 The RT System itself - Ticket #59077: Status changed from 'new' to 'open'

Tue Jul 06 16:41:32 2010 MARKSTOS [...] cpan.org - Ticket #59077: Subject changed from (no value) to 'escape() and unescape() should be documented'

Tue Jul 06 16:41:33 2010 MARKSTOS [...] cpan.org - Ticket #59077: Broken in 3.49 added

Tue Jul 06 16:42:45 2010 MARKSTOS [...] cpan.org - Ticket #59077: Merged into ticket #30940

Tue Jul 06 16:42:45 2010 MARKSTOS [...] cpan.org - Merged into ticket #30940

Wed Jul 07 09:39:06 2010 MARKSTOS [...] cpan.org - Correspondence added

Peter, Would you care to comment on how we might account for the differences between CGI's implementation and that of URI::Escape? We could use some help determining if the difference implies that one has a bug with regard to the encoding spec or UTF-8 handling.

Thu May 22 08:03:04 2014 LEEJO [...] cpan.org - Correspondence added

This issue has been copied to: https://github.com/leejo/CGI.pm/issues/52 please take all future correspondence there. This ticket will remain open but please do not reply here. This ticket will be closed when the github issue is dealt with.

Thu May 22 11:55:57 2014 LEEJO [...] cpan.org - Correspondence added

perlfaq9 now mentions the use of URI::Escape, so this entire bug can be closed. The answer could be: use URI::Escape instead (and I'm ignoring the UTF-8 issues here).
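A minimal sketch of that suggestion, assuming the URI distribution is installed from CPAN (it is not a core module). It uses only URI::Escape's documented `uri_escape`, `uri_escape_utf8` and `uri_unescape` functions, and the sample strings are the ones already used in this ticket.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape uri_escape_utf8 uri_unescape);

# Plain ASCII round trip, mirroring the example in the attached patch.
my $string  = "Hello World";
my $escaped = uri_escape($string);
print "$escaped\n";    # Hello%20World
die "round trip failed" unless uri_unescape($escaped) eq $string;

# "|abcå" as a character string: uri_escape works on the character's
# byte value, while uri_escape_utf8 encodes to UTF-8 before escaping.
my $text = "|abc\x{E5}";
print uri_escape($text), "\n";         # %7Cabc%E5
print uri_escape_utf8($text), "\n";    # %7Cabc%C3%A5
```

Choosing between `uri_escape` and `uri_escape_utf8` is the caller's decision here, which makes the encoding question raised earlier in the ticket explicit rather than implicit.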

Fri May 23 14:28:02 2014 The RT System itself - Queue changed from CGI.pm to CGI

Sat May 24 04:12:35 2014 LEEJO [...] cpan.org - Correspondence added

Resolved as per previous comment

Sat May 24 04:12:36 2014 LEEJO [...] cpan.org - Status changed from 'open' to 'resolved'
