This queue is for tickets about the Encode CPAN distribution.

Report information
The Basics
Id: 84493
Status: open
Priority: 0/
Queue: Encode

People
Owner: Nobody in particular
Requestors: victor [...] vsespb.ru
Cc: SREZIC [...] cpan.org
AdminCc:

Bug Information
Severity: Normal
Broken in: 2.42
Fixed in: (no value)



Subject: Sometimes hang under FreeBSD when encoding layer used and encodings are incorrect.
perl -MEncode -e 'print Encode->VERSION'
2.42_01

The following commands hang with 50% CPU on FreeBSD 9.1 (This is perl 5, version 14, subversion 2 (v5.14.2) built for amd64-freebsd):

perl -MEncode -MCarp -e 'binmode STDERR, ":encoding(koi8-r)"; confess encode("utf-8", "\x{410}\x{432}\x{442}\x{43e}\x{43f}\x{430}\x{440}\x{43a}")'

perl -MEncode -MCarp -e 'binmode STDERR, ":encoding(koi8-r)"; confess encode("utf-8", "\x{410}\x{432}\x{442}\x{43e}\x{43f}\x{430}\x{440}")'

and the following work fine:

perl -MEncode -MCarp -e 'binmode STDERR, ":encoding(koi8-r)"; confess encode("utf-8", "\x{410}\x{432}\x{442}\x{43e}\x{43f}\x{430}")'

perl -MEncode -MCarp -e 'binmode STDERR, ":encoding(koi8-r)"; confess encode("utf-8", "\x{440}\x{43a}")'

$ locale
LANG=
LC_CTYPE="C"
LC_COLLATE="C"
LC_TIME="C"
LC_NUMERIC="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_ALL=

also, cannot reproduce under linux

possibly related to #84282
UPD:

perl -MEncode -MCarp -e 'binmode STDERR, ":encoding(koi8-r)"; croak "\320\220\320\262\321\202\320\276\320\277\320\260\321\200\320\272"'

hangs too. If I replace 'croak' with 'die', it works. If I remove the "encoding" layer, it works too.

Also, all of these PoCs work without bugs on FreeBSD 8 with a KOI8-R locale.

On Mon Apr 08 00:42:59 2013, vsespb wrote:
> perl -MEncode -e 'print Encode->VERSION'
> 2.42_01
> [...]
You should not use PerlIO on STDERR because PerlIO modules may carp on STDERR, resulting in deep recursion and such.

Dan the Maintainer Thereof

On Sun Apr 07 16:42:59 2013, vsespb wrote:
> perl -MEncode -e 'print Encode->VERSION'
> 2.42_01
> [...]
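A minimal sketch of why mismatched input makes the encoder warn at all, assuming (the thread does not confirm the exact mechanism) that the hang involves the encoder's own warnings being written back to the encoded STDERR. It calls Encode::encode directly rather than installing a layer on STDERR, so it cannot recurse; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Encode ();

# UTF-8 *bytes* for U+0410, as encode("utf-8", ...) produces in the report.
my $bytes = "\xD0\x90";

# Capture warnings in an array instead of letting them reach STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Perl treats those bytes as the characters U+00D0 and U+0090, which have
# no KOI8-R mapping. FB_WARN makes Encode warn and stop at the first one.
my $copy = $bytes;    # encode() may modify its argument when CHECK is set
my $out  = Encode::encode('koi8-r', $copy, Encode::FB_WARN);

print "warnings captured: ", scalar(@warnings), "\n";
print $_ for @warnings;
```

With an ':encoding(koi8-r)' layer on STDERR, a warning of this kind would itself have to pass through the same layer, which is the recursion risk described above.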
Hm.

Is it documented?

How, then, is a stack trace (croak/carp/die) with non-ASCII data supposed to work?

There are some publications, such as http://www.perl.com/pub/2012/04/perlunicook-decode-standard-filehandles-as-utf-8.html, which advise using PerlIO on STDERR.

Also, there is the pragma 'use open :std' in the core.

On Mon Apr 08 17:38:20 2013, DANKOGAI wrote:
> You should not use PerlIO on STDERR because PerlIO modules may carp on
> STDERR, resulting deep recursion and such.
>
> Dan the Maintainer Thereof
> [...]
PerlIO layers like ':utf8' are okay because they do not modify data. Some are not, since they do modify data (and, more importantly, its length); ':encoding()' is exactly that.

I agree we should document that, but it belongs more in the PerlIO documentation...

Dan

On Mon Apr 08 09:50:56 2013, vsespb wrote:
> Hm.
>
> Is it documented ?
>
> How then stacktrace (croak/carp/die) with non-ASCII data supposed to
> work ?
>
> There are some publications
> http://www.perl.com/pub/2012/04/perlunicook-decode-standard-filehandles-as-utf-8.html
> which advice to use PerlIO on stderr.
>
> Also there is a pragma 'use open :std' in the CORE.
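To illustrate the point about length, here is a small sketch (not taken from the ticket; variable names are illustrative) that encodes the 8-character string from the original report and compares byte counts. It only shows that an encoding step changes the data length; it does not reproduce the hang.

```perl
use strict;
use warnings;
use Encode ();

# The 8-character Cyrillic string from the original report.
my $chars = "\x{410}\x{432}\x{442}\x{43e}\x{43f}\x{430}\x{440}\x{43a}";

# Encode the same characters into two different byte encodings.
my $utf8_bytes = Encode::encode('UTF-8',  $chars);
my $koi8_bytes = Encode::encode('koi8-r', $chars);

printf "characters:   %d\n", length $chars;
printf "UTF-8 bytes:  %d\n", length $utf8_bytes;
printf "KOI8-R bytes: %d\n", length $koi8_bytes;
```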
Ok, clear.

Well, :utf8 can modify data sometimes, I believe (when binary data is treated as Latin-1):

$ perl -e 'binmode STDOUT, ":utf8"; print "\xDC"' | wc
0 1 2
$ perl -e 'print "\xDC"' | wc
0 0 1

OK, I will submit a perlbug for PerlIO then.

On Mon Apr 08 18:03:42 2013, DANKOGAI wrote:
> PerlIO handlers like ':utf8' are okay because they do not modify data.
> Some are not since they do modify data (and more importantly, its
> length) ':encoding()' is exactly that.
>
> I agree we should document that but it should be more documented in
> PerlIO part...
>
> Dan
> [...]
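The same byte-count observation can be sketched without a pipe to wc by writing to in-memory filehandles. This is an illustrative rewrite of the one-liners, and the helper name bytes_written is hypothetical.

```perl
use strict;
use warnings;

# Write the single byte 0xDC through a handle opened with the given mode
# and return how many bytes ended up in the in-memory buffer.
sub bytes_written {
    my ($mode) = @_;
    my $buf = '';
    open(my $fh, $mode, \$buf) or die "open failed: $!";
    print {$fh} "\xDC";
    close($fh) or die "close failed: $!";
    return length $buf;
}

my $raw  = bytes_written('>');         # no layer
my $utf8 = bytes_written('>:utf8');    # :utf8 layer upgrades the byte

print "no layer: $raw byte(s), :utf8 layer: $utf8 byte(s)\n";
```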
From: victor [...] vsespb.ru
Hello.

I've submitted a perlbug: https://rt.perl.org/rt3//Public/Bug/Display.html?id=117537

One possible workaround suggested for now is to use

$PerlIO::encoding::fallback = Encode::FB_QUIET

On Mon Apr 08 18:03:42 2013, DANKOGAI wrote:
> PerlIO handlers like ':utf8' are okay because they do not modify data.
> [...]
On 2013-04-07 16:42:59, vsespb wrote:
> perl -MEncode -e 'print Encode->VERSION'
> 2.42_01
> [...]
> also, cannot reproduce under linux
>
> possibly related to #84282

I have a similar problem (see attached script), but I can reproduce it on a Linux system (Mint 18). Don't make the filename of the test script shorter; that makes the warning message shorter, and the script may then no longer exhibit the problem (alternatively, put some more X's into the test string).

It does not seem to be related to a particular perl version (I see the problem with 5.12.3 and 5.28.0), but a bisect in Encode shows that the problem started with this commit:

31d16654f4cb56f88dc7a01ee1897f4ec8318c1a is the first bad commit
commit 31d16654f4cb56f88dc7a01ee1897f4ec8318c1a
Author: dankogai <dankogai@d0d07461-0603-4401-acd4-de1884942a52>
Date:   Fri Dec 31 22:59:40 2010 +0000

    VERSION 2.42

Trying the

    $PerlIO::encoding::fallback = Encode::FB_QUIET;

workaround did not work for me. What did work is to replace utf-8 by utf8.

Probably a fix is hard, but it would be nice if there was a paragraph about this problem (and the possible workaround using "utf8") in the BUGS or CAVEATS section of encoding.pod.
Subject: broken-unicode-print-hangs-minimal.t
#!/usr/bin/perl

use strict;
use warnings;
use utf8;
use POSIX;
use Test::More 'no_plan';

my $x = <<'EOF';
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXШᎾᘅ꜄௟҈ᎾᘅꝄ௟ӨᎾᘅꞄ௟ՈᎾᘅꟄ௟֨Ꮎᘅꡄ௟؈Ꮎᘅ꣄௟٨Ꮎᘅ꤄௟ۈᎾᘅꥄ௟ܨᎾᘅꦄ௟ވᎾpᘧ꧄௟ߨᎾŐᘧꨄ௟ࡈᎾȰᘧꨤ௟ࢨᎾ̐ᘧꩤ௟ईᎾϰᘧꪄ௟२ᎾӐᘧ꫄௟ৈᎾְᘧꬄ௟ਨᎾڐᘧꭄ௟ઈᎾݰᘧꮄ௟૨Ꮎࡐᘧꯄ௟ୈᎾरᘧ간௟நᎾਐᘧ겄௟ఈᎾཚȢ㺨௤ݰȬȢ㶈௤ὐᗤȢ㮨௤懀᫈Ȣ㨨௤懀ᑭȢ㘈௤怀ᑭȢ㨨ȟ汀᫈Ȣ㏈௤柠᫈Ȣ㌈௤映᫈Ȣ糈ଢꨐᎭȢ౨௛ꤰᎭ俄ȟ菈ोᑠॏ侄ȟ耈ोჹ佄ȟ㡈ȟ글ಂ伄ȟ㜨ȟꭠಂ仄ȟ㖨ȟ옠ᙁ亄ȟ㒈ȟ㊠ᙁ乄ȟ㍨ȟ쨐ᙁ丄ȟᶈȟ윀ᙁ䷄ȟ᪈ȟ츀ᙁ䶄ȟᖨȟ촠ᙁ䵄ȟᏈȟ챀ᙁ䴄ȟ綈ȁ鏰ᙀ䳄ȟ质ǽ죀ᑳ䲄ȟﮨ௤退ᚗ䱄ȟ﷨
EOF

use Encode;
$PerlIO::encoding::fallback = Encode::FB_QUIET;

my $sigset = POSIX::SigSet->new(SIGALRM);
# normal signal handler cannot be used because of safe signals
my $sigaction = POSIX::SigAction->new(sub { die "Timeout" }, $sigset);
alarm(3);

binmode STDERR, ':encoding(utf-8)'; # does not hang if :encoding(utf8) is used
print STDERR "<$x>\n";
alarm(0);
pass "Does not hang";

__END__
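For context on the "utf8" versus "utf-8" workaround, this short sketch shows that Encode resolves the two names to different encoding objects. Whether that difference is what avoids the hang is an assumption; the ticket does not establish the cause.

```perl
use strict;
use warnings;
use Encode ();

# Look up both spellings and print the canonical name Encode assigns.
for my $label ('utf8', 'utf-8') {
    my $enc = Encode::find_encoding($label)
        or die "unknown encoding: $label";
    printf "%-6s resolves to %s\n", $label, $enc->name;
}
```

Encode documents "utf8" as Perl's lax internal variant and "UTF-8" (matched case-insensitively) as the strict one, so the two layers take different code paths when validating data.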