
This queue is for tickets about the CGI CPAN distribution.

Report information
The Basics
Id: 32122
Status: resolved
Priority: 0/
Queue: CGI

People
Owner: LDS [...] cpan.org
Requestors: dietrich.streifert [...] googlemail.com
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: (no value)
Fixed in: (no value)



Subject: Changes to CGI::Util method escape breaks compatibility to CGI::Compress::Gzip
We are using CGI and CGI::Compress::Gzip to automatically compress HTML output (indirectly, through CGI::Application and CGI::Application::Plugin::CompressGzip).

After updating from CGI 3.29 to 3.33, the output appeared to be generated as UTF-8 regardless of the charset settings in the file or header. This happened only for pages that add cookies to the header.

After examining the code, it turned out that the problem was caused by a change in the escape method from

    $toencode = eval { pack("C*", unpack("U0C*", $toencode))} || pack("C*", unpack("C*", $toencode));

to (change from "C*" to "U*" in the first pack call)

    $toencode = eval { pack("U*", unpack("U0C*", $toencode))} || pack("C*", unpack("C*", $toencode));

I don't know whether this can be solved in this module, but this change is what triggered the problem. I'll also report this to the CGI::Compress::Gzip RT queue.

Thank you for your help and your great module. Happy new year.
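To make the reported one-token change concrete, here is a minimal sketch of what the two pack templates produce. It deliberately avoids the `U0C*` unpack template from the ticket, whose behaviour differs between Perl versions, and packs a hypothetical cookie value (`sid=abc123`, made up for this illustration) both ways. It is an illustration only, not the code path inside CGI::Util.

```perl
use strict;
use warnings;

# Hypothetical cookie data; any ASCII string shows the same effect.
my @octets   = unpack("C*", "sid=abc123");

# "C*" yields a plain byte string; "U*" yields a character string
# that carries Perl's internal UTF8 flag.
my $as_bytes = pack("C*", @octets);
my $as_chars = pack("U*", @octets);

printf "equal as strings: %s\n", ($as_bytes eq $as_chars      ? "yes" : "no");
printf "C* result flagged: %s\n", (utf8::is_utf8($as_bytes)   ? "yes" : "no");
printf "U* result flagged: %s\n", (utf8::is_utf8($as_chars)   ? "yes" : "no");
```

The two results compare equal, so the difference is invisible to ordinary string code; it only matters once the flagged string is combined with non-ASCII byte data.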
Ouch! Can you give me any more detail on why the Gzip compression is not working? I don't see an obvious dependency between the charset and the gzip module.

Lincoln

On Mon Jan 07 04:17:23 2008, level420 wrote:
> We are using CGI and CGI::Compress::Gzip to automatically compress
> output of html (Indirectly by using CGI::Application and
> CGI::Application::Plugin::CompressGzip).
>
> After updating to from CGI V 3.29 to V 3.33 the output seemed to be
> created as UTF-8 independent from the charset settings in the file or
> header. This was for pages adding cookies to the header.
>
> After examining the code it turned out that changes in the method escape
> from
>
> $toencode = eval { pack("C*", unpack("U0C*", $toencode))} || pack("C*",
> unpack("C*", $toencode));
>
> to (change from "C*" to "U*" in the first pack call)
>
> $toencode = eval { pack("U*", unpack("U0C*", $toencode))} || pack("C*",
> unpack("C*", $toencode));
>
> Caused the problem.
>
> I don't know if this problem can be solved in this module but it caused
> the problem. I'll report this also to the CGI::Compress::Gzip module RT.
>
> Thank you for your help and your great module. Happy new year.
From: dietrich.streifert [...] googlemail.com
On Mon, 07 Jan 2008, 10:06:34, LDS wrote:
> Ouch! Can you give me any more detail on why the Gzip compression is not
> working? I don't see an obvious dependency between the charset and the
> gzip module.
>
> Lincoln
Thank you for your quick answer, Lincoln!

Sorry, but I can't give you much information on why this is happening. I found that everything worked with CGI.pm 3.29, and that the problem appeared after an upgrade to CGI.pm 3.33.

My pages are encoded in ISO-8859-1, and the accented characters are inserted as is, not as entities. On the first page visit (tested in IE7 and FF2), the typical two-byte encoding hieroglyphs showed up in the page instead of the accented characters. But only when I was setting a cookie!

So I spent some time testing other CGI.pm versions and looking into the cookie handling, and found that the escape method from CGI::Util is used to encode the cookie data. After spotting the change, I simply restored the commented-out pack/unpack line and commented out the new one. And voilà! It worked again.

Maybe CGI::Compress::Gzip has to be updated to work with the new escape implementation, but I have too little knowledge to do this. So I ended up just informing you that there is a problem, and I'm totally dependent on your help here.
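The symptom described above (each accented ISO-8859-1 character turning into two bytes, and only when a cookie is set) is consistent with one plausible mechanism, which this ticket does not confirm: a UTF8-flagged cookie string gets concatenated with the Latin-1 page, and some later step reads Perl's internal buffer instead of the logical characters. The sketch below only illustrates that assumption. The cookie value and page text are invented, and `utf8::encode` on a copy stands in for "whatever reads the internal bytes".

```perl
use strict;
use warnings;

# Assumption: escape() hands back a UTF8-flagged string for the cookie.
my $cookie = pack("U*", unpack("C*", "sid=abc123"));

# ISO-8859-1 page content: "caf" followed by e-acute as the single octet 0xE9.
my $page = "caf\xE9";

# Concatenating with a flagged string upgrades the whole result.
my $out = "Set-Cookie: " . $cookie . "\r\n\r\n" . $page;

# What a consumer of the internal (UTF-8) representation would see.
my $internal = $out;
utf8::encode($internal);

# Helper for this illustration: space-separated hex dump of a byte string.
sub hex_dump { return join ' ', map { sprintf '%02X', ord } split //, $_[0] }

print "page octets:     ", hex_dump($page), "\n";
print "internal octets: ", hex_dump(substr($internal, -5)), "\n";
```

Under this assumption the last character of the page goes out as `C3 A9` instead of `E9`, which a browser told to expect ISO-8859-1 renders as two unrelated glyphs. Pages without a cookie never meet a flagged string, which would match the observation that they were unaffected.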
From: dietrich.streifert [...] googlemail.com
Any news on this subject? Regards...
Try applying this patch.
? CGI-diff
? CGI.diff
? CGI.patch
? CGI.pm-3.29.tar.gz
? CGI.pm-3.30.tar.gz
? CGI.pm-3.31.tar.gz
? CGI.pm-3.32.tar.gz
? CGI.pm-3.33.tar.gz
? CGI.pm-3.34.tar.gz
? CGI.pm.diff
? Carp.pm.patch
? META.yml
? PUT.patch
? TODO
? attributes.patch
? backout.patch
? d.txt
? post_max_bug.txt
? proposed_diff.patch
? tar.gz
? t/.cvsignore
? t/uploadInfo.t
Index: CGI.pm
===================================================================
RCS file: /usr/local/cvs_repository/CGI.pm/CGI.pm,v
retrieving revision 1.242
retrieving revision 1.247
diff -u -r1.242 -r1.247
--- CGI.pm 27 Dec 2007 18:39:38 -0000 1.242
+++ CGI.pm 14 Mar 2008 14:29:36 -0000 1.247
@@ -18,8 +18,8 @@
 # The most recent version and complete docs are available at:
 #   http://stein.cshl.org/WWW/software/CGI/
 
-$CGI::revision = '$Id: CGI.pm,v 1.242 2007/12/27 18:39:38 lstein Exp $';
-$CGI::VERSION='3.32';
+$CGI::revision = '$Id: CGI.pm,v 1.247 2008/03/14 14:29:36 lstein Exp $';
+$CGI::VERSION='3.34';
 
 # HARD-CODED LOCATION FOR FILE UPLOAD TEMPORARY FILES.
 # UNCOMMENT THIS ONLY IF YOU KNOW WHAT YOU'RE DOING.
@@ -1835,7 +1835,7 @@
     my($method,$action,$enctype,@other) =
         rearrange([METHOD,ACTION,ENCTYPE],@p);
 
-    $method = $self->escapeHTML(lc($method) || 'post');
+    $method = $self->escapeHTML(lc($method || 'post'));
     $enctype = $self->escapeHTML($enctype || &URL_ENCODED);
     if (defined $action) {
        $action = $self->escapeHTML($action);
@@ -2198,9 +2198,11 @@
          else {
              $toencode =~ s{"}{&quot;}gso;
          }
-         my $latin = uc $self->{'.charset'} eq 'ISO-8859-1' ||
-                     uc $self->{'.charset'} eq 'WINDOWS-1252';
-         if ($latin) { # bug in some browsers
+         # Handle bug in some browsers with Latin charsets
+         if ($self->{'.charset'} &&
+             (uc($self->{'.charset'}) eq 'ISO-8859-1' ||
+              uc($self->{'.charset'}) eq 'WINDOWS-1252'))
+         {
                 $toencode =~ s{'}{&#39;}gso;
                 $toencode =~ s{\x8b}{&#8249;}gso;
                 $toencode =~ s{\x9b}{&#8250;}gso;
@@ -2730,6 +2732,7 @@
 
     $url .= $path if $path_info and defined $path;
     $url .= "?$query_str" if $query and $query_str ne '';
+    $url ||= '';
     $url =~ s/([^a-zA-Z0-9_.%;&?\/\\:+=~-])/sprintf("%%%02X",ord($1))/eg;
     return $url;
 }
@@ -4039,7 +4042,7 @@
     my $filename;
     find_tempdir() unless -w $TMPDIRECTORY;
     for (my $i = 0; $i < $MAXTRIES; $i++) {
-        last if ! -f ($filename = sprintf("${TMPDIRECTORY}${SL}CGItemp%d",$sequence++));
+        last if ! -f ($filename = sprintf("\%s${SL}CGItemp%d", $TMPDIRECTORY, $sequence++));
     }
     # check that it is a more-or-less valid filename
     return unless $filename =~ m!^([a-zA-Z0-9_\+ \'\":/.\$\\-]+)$!;
@@ -7685,10 +7688,8 @@
 
 =head1 AUTHOR INFORMATION
 
-Copyright 1995-1998, Lincoln D. Stein. All rights reserved.
-
-This library is free software; you can redistribute it and/or modify
-it under the same terms as Perl itself.
+The GD.pm interface is copyright 1995-2007, Lincoln D. Stein. It is
+distributed under GPL and the Artistic License 2.0.
 
 Address bug reports and comments to: lstein@cshl.org. When sending
 bug reports, please provide the version of CGI.pm, the version of
Index: Changes
===================================================================
RCS file: /usr/local/cvs_repository/CGI.pm/Changes,v
retrieving revision 1.64
retrieving revision 1.68
diff -u -r1.64 -r1.68
--- Changes 27 Dec 2007 18:39:38 -0000 1.64
+++ Changes 14 Mar 2008 14:29:36 -0000 1.68
@@ -1,3 +1,11 @@
+  Version 3.34
+  1. Handle Unicode %uXXXX escapes properly -- patch from DANKOGAI@cpan.org
+
+  Version 3.33
+  1. Remove uninit variable warning when calling url(-relative=>1)
+  2. Fix uninit variable warnings for two lc calls
+  3. Fixed failure of tempfile upload due to sprintf() taint failure in perl 5.10
+
   Version 3.32
   1. Patch from Miguel Santinho to prevent sending premature headers under mod_perl 2.0
 
Index: CGI/Util.pm
===================================================================
RCS file: /usr/local/cvs_repository/CGI.pm/CGI/Util.pm,v
retrieving revision 1.26
retrieving revision 1.27
diff -u -r1.26 -r1.27
--- CGI/Util.pm 30 Nov 2007 19:04:04 -0000 1.26
+++ CGI/Util.pm 14 Mar 2008 14:29:37 -0000 1.27
@@ -7,7 +7,7 @@
 @EXPORT_OK = qw(rearrange make_attributes unescape escape
                 expires ebcdic2ascii ascii2ebcdic);
 
-$VERSION = '1.5';
+$VERSION = '1.5_01';
 
 $EBCDIC = "\t" ne "\011";
 # (ord('^') == 95) for codepage 1047 as on os390, vmesa
@@ -141,8 +141,12 @@
 
 sub utf8_chr {
         my $c = shift(@_);
-        return chr($c) if $] >= 5.006;
-
+        if ($] >= 5.006){
+            require utf8;
+            my $u = chr($c);
+            utf8::encode($u); # drop utf8 flag
+            return $u;
+        }
         if ($c < 0x80) {
                 return sprintf("%c", $c);
         } elsif ($c < 0x800) {
@@ -189,6 +193,17 @@
     if ($EBCDIC) {
       $todecode =~ s/%([0-9a-fA-F]{2})/chr $A2E[hex($1)]/ge;
     } else {
+        # handle surrogate pairs first -- dankogai
+        $todecode =~ s{
+                        %u([Dd][89a-bA-B][0-9a-fA-F]{2}) # hi
+                        %u([Dd][c-fC-F][0-9a-fA-F]{2})   # lo
+                      }{
+                          utf8_chr(
+                                   0x10000
+                                   + (hex($1) - 0xD800) * 0x400
+                                   + (hex($2) - 0xDC00)
+                                  )
+                      }gex;
       $todecode =~ s/%(?:([0-9a-fA-F]{2})|u([0-9a-fA-F]{4}))/
         defined($1)? chr hex($1) : utf8_chr(hex($2))/ge;
     }
@@ -200,9 +215,12 @@
   shift() if @_ > 1 and ( ref($_[0]) || (defined $_[1] && $_[0] eq $CGI::DefaultClass));
   my $toencode = shift;
   return undef unless defined($toencode);
+  $toencode = eval { pack("C*", unpack("U0C*", $toencode))} || pack("C*", unpack("C*", $toencode));
+
   # force bytes while preserving backward compatibility -- dankogai
-#  $toencode = eval { pack("C*", unpack("U0C*", $toencode))} || pack("C*", unpack("C*", $toencode));
-  $toencode = eval { pack("U*", unpack("U0C*", $toencode))} || pack("C*", unpack("C*", $toencode));
+  # but commented out because it was breaking CGI::Compress -- lstein
+  # $toencode = eval { pack("U*", unpack("U0C*", $toencode))} || pack("C*", unpack("C*", $toencode));
+
     if ($EBCDIC) {
       $toencode=~s/([^a-zA-Z0-9_.~-])/uc sprintf("%%%02x",$E2A[ord($1)])/eg;
     } else {
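The CGI/Util.pm hunks above do two things for `%uXXXX` escapes: they combine a high/low surrogate pair into one code point, and they return that code point as UTF-8 octets with the flag dropped. The following standalone sketch reproduces just that arithmetic. The helper names `combine_surrogates` and `code_point_to_octets` are made up for this illustration and are not part of CGI::Util; the sample pair `%uD83D%uDE00` is likewise only an example input.

```perl
use strict;
use warnings;

# Same formula as in the patch: 0x10000 + (hi - 0xD800) * 0x400 + (lo - 0xDC00).
sub combine_surrogates {
    my ($hi, $lo) = @_;
    return 0x10000 + ($hi - 0xD800) * 0x400 + ($lo - 0xDC00);
}

# Mirrors the patched utf8_chr for modern perls: build the character,
# then encode it in place so the result is a flag-free byte string.
sub code_point_to_octets {
    my ($code_point) = @_;
    my $u = chr($code_point);
    utf8::encode($u);
    return $u;
}

my $code_point = combine_surrogates(hex("D83D"), hex("DE00"));
my $octets     = code_point_to_octets($code_point);

printf "code point: U+%X\n", $code_point;
printf "octets:     %s\n", join ' ', map { sprintf '%02X', ord } split //, $octets;
printf "flagged:    %s\n", (utf8::is_utf8($octets) ? "yes" : "no");
```

Returning octets rather than a flagged character keeps the decoded value in the same byte-oriented form as the rest of the unescaped string, which is the same concern that led to the escape change being backed out in the last hunk.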
From: dietrich.streifert [...] googlemail.com
On Fri, 14 Mar 2008, 10:33:15, LDS wrote:
> Try applying this patch.
Yes! This solves the bug. I stumbled first on the fact that the patch is for CGI 3.32, but after patching the right version the bug is solved. Please publish a new version ASAP. Thank you for your support. Best regards.
Fixed in 3.34.