This queue is for tickets about the PPI CPAN distribution.

Report information
The Basics
Id: 35917
Status: resolved
Priority: 0/
Queue: PPI

People
Owner: Nobody in particular
Requestors: claco [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: charsets.t eats all available VM
t/14_charsets.t eats virtual memory until an out-of-memory error is reached.

    t/14_charsets......................perl(797) malloc: *** vm_allocate(size=8421376) failed (error code=3)
    perl(797) malloc: *** error: can't allocate region
    perl(797) malloc: *** set a breakpoint in szone_error to debug
    Out of memory!
    perl(797) malloc: *** vm_allocate(size=8421376) failed (error code=3)
    perl(797) malloc: *** error: can't allocate region
    perl(797) malloc: *** set a breakpoint in szone_error to debug
    Out of memory!
    perl(797) malloc: *** vm_allocate(size=8421376) failed (error code=3)
    perl(797) malloc: *** error: can't allocate region
    perl(797) malloc: *** set a breakpoint in szone_error to debug
    Out of memory!
    Callback called exit.
    END failed--call queue aborted.

The machine has 4GB of memory installed.
Here is the output from running 14_charsets.t directly:

    mbp:~/.cpan/build/PPI-1.203-s5lYEU claco$ perl -Ilib t/14_charsets.t
    1..11
    ok 1 - Parsed code without accented chars
    ok 2 - Function with umlaut

Here's my perl -V
--
    claco@mbp ~ $ perl -V
    Summary of my perl5 (revision 5 version 8 subversion 6) configuration:
      Platform:
        osname=darwin, osvers=8.0, archname=darwin-thread-multi-2level
        uname='darwin b48.apple.com 8.0 darwin kernel version 8.3.0: mon oct 3 20:04:04 pdt 2005; root:xnu-792.6.22.obj~2release_ppc power macintosh powerpc '
        config_args='-ds -e -Dprefix=/usr -Dccflags=-g -pipe -Dldflags=-Dman3ext=3pm -Duseithreads -Duseshrplib'
        hint=recommended, useposix=true, d_sigaction=define
        usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
        useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
        use64bitint=undef use64bitall=undef uselongdouble=undef
        usemymalloc=n, bincompat5005=undef
      Compiler:
        cc='cc', ccflags ='-g -pipe -fno-common -DPERL_DARWIN -no-cpp-precomp -fno-strict-aliasing -I/usr/local/include',
        optimize='-O3',
        cppflags='-no-cpp-precomp -g -pipe -fno-common -DPERL_DARWIN -no-cpp-precomp -fno-strict-aliasing -I/usr/local/include'
        ccversion='', gccversion='4.0.1 (Apple Computer, Inc. build 5363) (+4864187)', gccosandvers=''
        intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
        d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
        ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
        alignbytes=8, prototype=define
      Linker and Libraries:
        ld='env MACOSX_DEPLOYMENT_TARGET=10.3 cc', ldflags ='-L/usr/local/lib'
        libpth=/usr/local/lib /usr/lib
        libs=-ldbm -ldl -lm -lc
        perllibs=-ldl -lm -lc
        libc=/usr/lib/libc.dylib, so=dylib, useshrplib=true, libperl=libperl.dylib
        gnulibc_version=''
      Dynamic Linking:
        dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef, ccdlflags=' '
        cccdlflags=' ', lddlflags='-bundle -undefined dynamic_lookup -L/usr/local/lib'

    Characteristics of this binary (from libperl):
      Compile-time options: MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
      Locally applied patches:
        23953 - fix for File::Path::rmtree CAN-2004-0452 security issue
        33990 - fix for setuid perl security issues
        fix for regcomp CVE-2007-5116 security vulnerability
        SPRINTF0 - fixes for sprintf formatting issues - CVE-2005-3962
      Built under darwin
      Compiled at Nov 26 2007 09:16:22
      %ENV:
        PERL5LIB="/sw/lib/perl5:/sw/lib/perl5/darwin"
      @INC:
        /sw/lib/perl5/5.8.6/darwin-thread-multi-2level
        /sw/lib/perl5/5.8.6
        /sw/lib/perl5/darwin-thread-multi-2level
        /sw/lib/perl5
        /sw/lib/perl5/darwin
        /System/Library/Perl/5.8.6/darwin-thread-multi-2level
        /System/Library/Perl/5.8.6
        /Library/Perl/5.8.6/darwin-thread-multi-2level
        /Library/Perl/5.8.6
        /Library/Perl
        /Network/Library/Perl/5.8.6/darwin-thread-multi-2level
        /Network/Library/Perl/5.8.6
        /Network/Library/Perl
        /System/Library/Perl/Extras/5.8.6/darwin-thread-multi-2level
        /System/Library/Perl/Extras/5.8.6
        /Library/Perl/5.8.1
Subject: Re: [rt.cpan.org #35917] charsets.t eats all available VM
Date: Thu, 15 May 2008 20:46:55 -0500
To: bug-PPI [...] rt.cpan.org
From: Chris Dolan <chris [...] chrisdolan.net>
Hi Chris,

Reproduced on my Mac. I'm investigating...

Chris

On May 15, 2008, at 5:58 PM, Christopher H. Laco via RT wrote:
> Thu May 15 18:58:45 2008: Request 35917 was acted upon.
> Transaction: Ticket created by CLACO
>        Queue: PPI
>      Subject: charsets.t eats all available VM
>    Broken in: (no value)
>     Severity: (no value)
>        Owner: Nobody
>   Requestors: claco@cpan.org
>       Status: new
>  Ticket <URL: http://rt.cpan.org/Ticket/Display.html?id=35917 >
CC: bug-PPI [...] rt.cpan.org
Subject: Re: [rt.cpan.org #35917] charsets.t eats all available VM
Date: Thu, 15 May 2008 21:43:23 -0500
To: Chris Dolan <chris [...] chrisdolan.net>
From: Chris Dolan <chris [...] chrisdolan.net>
OK, this is perhaps one for Adam to tackle. I've found the triggering symptom, but not the cause. This is a very interesting case.

We start with this test in 14_charsets.t:

    good_ok( '一();', "Function with Chinese characters" );

which at the very end of tokenization looks like this (right after "the perfect crime!"):

    $self = PPI::Tokenizer=HASH(0x394c4)
       'class' => 'PPI::Token::Whitespace'
       'line' => '\x{4E00}(); '
       'line_count' => 1
       'line_cursor' => 5
       'line_length' => 5
       'source' => ARRAY(0x1ab72d8)
            empty array
       'source_bytes' => 4
       'source_eof_chop' => ''
       'token' => undef
       'token_cursor' => 1
       'token_eof' => 0
       'tokens' => ARRAY(0x392b4)
          0  PPI::Token::Word=HASH(0x1aefae0)
             'content' => '\x{4E00}'
          1  PPI::Token::Structure=HASH(0x1afb724)
             'content' => '(); '
          2  PPI::Token::Structure=HASH(0x1afb7e4)
             'content' => ')'
          3  PPI::Token::Structure=HASH(0x1afb824)
             'content' => ';'
       'v6' => ARRAY(0x3a274)
            empty array
       'zone' => 'PPI::Token::Whitespace'

The key point is that the content field of tokens[1] is '(); ' when it should be '('. As a consequence, that token answers true to both __LEXER__opens and __LEXER__closes, which triggers an infinite loop as the lexer repeatedly calls rollback() and pushes more and more empty statements onto the Document instance, thus using all RAM.

So, the question is: why does the PPI::Token::Structure go awry and pull in too many characters? I've only tested on Mac OS X 10.4 (the same as claco), so maybe it's specific to 5.8.6. Dunno yet, input appreciated.

Chris
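The double match described in the message above can be illustrated with a small sketch. It assumes the open and close checks behave like unanchored character matches; `opens` and `closes` here are hypothetical stand-ins, not PPI's actual __LEXER__opens and __LEXER__closes methods, whose implementation may differ.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the lexer's open/close checks (not PPI's own code).
# Each one only asks whether a brace character appears anywhere in the content.
sub opens  { return $_[0] =~ /[(\[{]/ ? 1 : 0 }
sub closes { return $_[0] =~ /[)\]}]/ ? 1 : 0 }

# A well-formed structure token holds one character; the corrupted one holds four.
for my $content ( '(', ')', '(); ' ) {
    printf "%-6s opens=%d closes=%d\n", "'$content'", opens($content), closes($content);
}
```

Under that assumption, only the over-long content '(); ' reports true for both checks, which is the state the lexer apparently never expects to see.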
Subject: Re: [rt.cpan.org #35917] charsets.t eats all available VM
Date: Thu, 15 May 2008 22:20:34 -0500
To: bug-PPI [...] rt.cpan.org
From: Chris Dolan <chris [...] chrisdolan.net>
After a little more digging, this looks like a bug in Perl 5.8.6. The following evidence is pretty damning:

Debugging while in PPI::Tokenizer::_process_next_char(), just after the first token is done and we're about to start on the second one. The tokenizer looks like this:

    $self = PPI::Tokenizer=HASH(0x394c4)
       'class' => 'PPI::Token::Whitespace'
       'line' => '\x{4E00}(); '
       'line_count' => 1
       'line_cursor' => 1
       'line_length' => 5
       'source' => ARRAY(0x1ab72d8)
            empty array
       'source_bytes' => 4
       'source_eof_chop' => 1
       'token' => undef
       'token_cursor' => 0
       'token_eof' => 0
       'tokens' => ARRAY(0x392b4)
          0  PPI::Token::Word=HASH(0x1aefae0)
             'content' => '\x{4E00}'
       'v6' => ARRAY(0x3a274)
            empty array
       'zone' => 'PPI::Token::Whitespace'

Then we execute this line of code:

    639:        my $char = substr( $self->{line}, $self->{line_cursor}, 1 );

and get this result:

    $char = '(); '

WTF??? How does an invocation of substr() with a length of 1 result in a string of length 4?

Attached is the simplest failure case I could write to reproduce this bug. In the process I discovered that the error only happens with PPI v1.203. The same test case (and t/14_charsets.t!) works fine with PPI v1.201. Furthermore, when I run the same test under Perl 5.8.8 on the same Mac, the test passes under 1.201 and 1.203.

So, it's definitely a newly exposed bug in Mac OS X 10.4's Perl 5.8.6. I don't have another 5.8.6 handy to test to see if it's Apple's (slightly patched) build or 5.8.6 in general that's flawed.

Chris

Message body is not shown because sender requested not to inline it.
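The attachment itself is not shown in this ticket, so the following is only a minimal sketch of how the symptom in the message above could be detected, not the attached test case. The helper name `char_at` is hypothetical; the line content mirrors the 5-character line in the dump, assuming its last character is a newline.

```perl
use strict;
use warnings;

# Hypothetical guard: fetch one character and verify that substr() really
# returned no more than one character, dying loudly if it did not.
sub char_at {
    my ( $line, $cursor ) = @_;
    my $char = substr( $line, $cursor, 1 );
    die sprintf( "substr returned %d characters at offset %d\n", length($char), $cursor )
        if length($char) > 1;
    return $char;
}

# U+4E00 followed by "();" and an assumed trailing newline.
my $line = "\x{4E00}();\n";
print char_at( $line, 1 ), "\n";
```

On a Perl without the bug this prints the open parenthesis. A guard of this kind would turn the runaway memory growth into an immediate error, at the cost of an extra length() call per character.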

Subject: Re: [rt.cpan.org #35917] charsets.t eats all available VM
Date: Fri, 16 May 2008 13:23:22 +1000
To: bug-PPI [...] rt.cpan.org
From: Adam Kennedy <adamkennedybackup [...] gmail.com>
I'll have a look over the weekend if possible.

I note that while PPI does ostensibly support non-ASCII inside of strings, it doesn't necessarily support it everywhere.

So this might be more of a problem of not aborting the parse when we should...

Chris, I also want to take another look at adding at least ::Token::BOM soon, although I would rather special-case the parsing than introduce it as a separate parse class/zone.

Adam K
Subject: Re: [rt.cpan.org #35917] charsets.t eats all available VM
Date: Thu, 15 May 2008 23:20:34 -0500
To: bug-PPI [...] rt.cpan.org
From: Chris Dolan <chris [...] chrisdolan.net>
I've got it nailed. It's only triggered in a very, very specific case (Unicode in PPI::Token::Word on the last line of the source file) and only affects pre-5.8.8 (not sure about 5.8.7 -- I don't have a build of that version to test). I just committed a SKIP for the offending test if "require 5.008007" fails. It's a small miracle that we even had a test that would trigger this bug. The only reason it started manifesting in PPI v1.203 is because of this change in PPI::Token::Word:

   http://search.cpan.org/diff?from=PPI-1.201&to=PPI-1.203&w=1#lib/PPI/Token/Word.pm

This is a nearly-minimal program that triggers the bug.

    #!/usr/bin/perl
    use utf8;
    my $s = '一();';
    my $line = $s . '!';
    substr($line, 1);
    print substr($line, 1, 1), "\n";

Notes:
  * The BOM is not relevant. The bug manifests with or without one.
  * There must be at least three characters after the Chinese glyph. With fewer than three, the bug does not manifest.
  * There must be at least one additional character concatenated to the string.
  * The open-ended substr must precede the one-character substr.

Wow.

Chris
Subject: Re: [rt.cpan.org #35917] charsets.t eats all available VM
Date: Fri, 16 May 2008 14:29:43 +1000
To: bug-PPI [...] rt.cpan.org
From: Adam Kennedy <adamkennedybackup [...] gmail.com>
That is completely nuts...

Fortunately (from an avoidance point of view) we don't allow unicode in words, so at least we can play the "not supported" card.

I wonder if we could actually fix this by using a regex with a preset pos() instead of an open-ended substr.

Adam K

Chris Dolan via RT wrote:
> Queue: PPI
> Ticket <URL: http://rt.cpan.org/Ticket/Display.html?id=35917 >
>
> I've got it nailed. It's only triggered in a very, very specific case (Unicode in PPI::Token::Word on last line of the source file) and only affects pre-5.8.8 (not sure about 5.8.7 -- I don't have a build of that version to test). I just committed a SKIP for the offending test if "require 5.008007" fails. It's a small miracle that we even had a test that would trigger this bug. The only reason it started manifesting in PPI v1.203 is because of this change in PPI::Token::Word:
> http://search.cpan.org/diff?from=PPI-1.201&to=PPI-1.203&w=1#lib/PPI/Token/Word.pm
>
> This is a nearly-minimal program that triggers the bug.
>
> #!/usr/bin/perl
> use utf8;
> my $s = '一();';
> my $line = $s . '!';
> substr($line, 1);
> print substr($line, 1, 1), "\n";
>
> Notes:
> * The BOM is not relevant. The bug manifests with or without
> * There must be at least three characters after the Chinese glyph. If less than three, the bug does not manifest
> * There must be at least one additional character concatenated to the string
> * The open ended substr must precede the 1 character substr.
>
> Wow.
>
> Chris
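For readers unfamiliar with the "regex with a preset pos()" idea, here is a minimal sketch of the general technique next to the open-ended substr it would replace. The input line and the `$cursor` offset are made up for illustration and are not taken from PPI's tokenizer. The sketch only shows that both forms find the same word. It makes no claim about whether either form avoids the bug on older Perls.

```perl
use strict;
use warnings;

# Hypothetical input: a line of source and a cursor offset into it.
my $line   = 'foo = bar();';
my $cursor = 6;    # offset of the "b" in "bar"

# Approach 1: an open-ended substr copies the rest of the line, and the
# regex then matches at the start of that copy.
my $rest = substr($line, $cursor);
my ($via_substr) = $rest =~ /^(\w+)/;

# Approach 2: preset pos() on the original string and anchor the match
# at that position with \G, so no copy of the tail is made.
my $via_pos;
pos($line) = $cursor;
if ($line =~ /\G(\w+)/gc) {
    $via_pos = $1;
}

print "substr: $via_substr\n";
print "pos():  $via_pos\n";
```

The `/c` modifier keeps pos() from being reset when a match fails, which matters if several patterns are tried in turn at the same offset.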
Subject: Re: [rt.cpan.org #35917] charsets.t eats all available VM
Date: Thu, 15 May 2008 23:59:40 -0500
To: bug-PPI [...] rt.cpan.org
From: Chris Dolan <chris [...] chrisdolan.net>
Adam,

I thought of something like that (although I thought of a less elegant solution of inserting N "." characters at the start of the regex). But, neither solution solves the problem (I just tested now). We could hack the test case by removing the trailing ";" character -- that does work and buries the problem successfully. :-/ Unless we can find a better workaround in Word.pm, I prefer to leave it broken and just skip the test on known-bad Perl versions.

I wrote about this in my journal:
http://use.perl.org/~ChrisDolan/journal/36438

Thank you, Chris L., for providing such an entertaining distraction for this evening. To think, I had planned on getting actual work done. :-)

Chris
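Skipping a test on known-bad Perl versions can be sketched with Test::More's SKIP block. This is an illustration under assumptions, not the change that was committed to t/14_charsets.t: the 5.008007 threshold comes from the "require 5.008007" check mentioned in this thread, and the assertion inside the block is a placeholder for the real charset test.

```perl
use strict;
use warnings;
use Test::More tests => 1;

# True when the running Perl is at least 5.8.7. The require dies on
# older Perls, and the eval turns that into a false value.
my $perl_ok = eval { require 5.008007; 1 };

SKIP: {
    skip 'Unicode substr bug on Perl older than 5.8.7', 1 unless $perl_ok;

    # Placeholder assertion standing in for the real charset test.
    my $line = 'abc();' . '!';
    is( substr($line, 1, 1), 'b', 'one-character substr returns the expected character' );
}
```

A skipped test is still counted against the plan, so the test count stays the same on every Perl version.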
Subject: Re: [rt.cpan.org #35917] charsets.t eats all available VM
Date: Fri, 16 May 2008 16:24:03 +1000
To: bug-PPI [...] rt.cpan.org
From: Adam Kennedy <adamkennedybackup [...] gmail.com>
The other alternative, if the unicode ends up in a ::Word, is to add a check to the Word parser somewhere that checks to make sure that there are no non-ascii elements (since there shouldn't be...).

Or we have the tokenizer, during its pre-parse check for invalid characters, refuse to parse anything with non-ascii unless the Perl version is 5.8.7 or newer (I like this option more, as there's no per-word cpu penalty).

Adam K
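The tokenizer option described above amounts to a single up-front check per document. A rough sketch follows, with the caveat that `refuse_reason` is a hypothetical helper written for this illustration and is not part of PPI's API. The version threshold is the 5.8.7 figure from the message.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PPI): returns a reason string when the
# source should be refused, or undef when it is safe to parse.
# $perl_version defaults to the running interpreter's version ($]).
sub refuse_reason {
    my ($source, $perl_version) = @_;
    $perl_version = $] unless defined $perl_version;

    # Perl 5.8.7 and newer are taken to be unaffected.
    return if $perl_version >= 5.008007;

    # Pure-ASCII source cannot trigger the problem.
    return unless $source =~ /[^\x00-\x7F]/;

    return 'non-ASCII source is not supported on Perl older than 5.8.7';
}

# Usage: the second argument simulates an older interpreter.
my $reason = refuse_reason("my \$s = '\x{4E00}();';", 5.008006);
print defined $reason ? "refused: $reason\n" : "ok to parse\n";
```

Because the check runs once over the whole source, it adds no cost per word.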
Subject: Still not fixed [rt.cpan.org #35917]
Date: Sat, 10 Jan 2009 14:18:58 +0500
To: bug-PPI [...] rt.cpan.org
From: Alexander Krasnorutsky <krasnoroot [...] mail.ru>
Hello!

This bug is still not fixed (PPI-1.203). Should the workaround (increasing the minimum version of perl required for running this test) be applied to the version of PPI which is available for download from CPAN?

Best regards
--
Alexander Krasnorutsky.
Hello!

I'm new to CPAN, but was testing the bug prog above and I think I have a fix. It involves utf8.pm:

After package utf8;
add use Encode;

After sub AUTOLOAD...
add

    sub print {
        BEGIN { utf8::import() }
        return CORE::print (encode('utf-8', @_[0]));
    }

I still get the wide character warnings, but substr works.

Michael
Oops, spoke a little soon. Those mods failed but this works:

    #!/usr/bin/perl
    use utf8;
    use Encode; # encode is the fix!
    my $s = '一();';
    my $line = $s . '!';
    print encode('utf-8', $line)."\n";
    substr($line, 1);
    $line = substr($line, 1, 0);
    print encode('utf-8',$line)."\n";
    print encode('utf-8',substr($line, 0, 1))."\n";
    print encode('utf-8',substr($line, 1, 1))."\n";

result:

    一();!
    ();
    (
    )

I'm on perl 5.8.5.

Adding this sub to utf8.pm also works, removing the need to have encode in the prog above:

    sub print {
        BEGIN { utf8::import() }
        return CORE::print (Encode::encode('utf-8', $_[0]));
    }

Although I am sure I am not implementing print correctly.

Anyway, I hope this helps.

Michael

On Mon May 25 13:10:35 2009, MIKEWHOO wrote:
> Hello!
>
> I'm new to CPAN, but was testing the bug prog above and I think I have a fix. It involves utf8.pm:
>
> After package utf8;
> add use Encode;
>
> After sub AUTOLOAD...
> add
>
> sub print {
>     BEGIN { utf8::import() }
>     return CORE::print (encode('utf-8', @_[0]));
> }
>
> I still get the wide character warnings, but substr works.
>
> Michael
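The "wide character" warnings mentioned in these comments come from printing a character string to a filehandle that has no encoding layer. The sketch below shows, for a current Perl, the two usual ways to handle that without editing utf8.pm: encoding explicitly with Encode, or putting an encoding layer on STDOUT. It is an illustration only and does not address the substr bug on the older Perls discussed in this ticket.

```perl
use strict;
use warnings;
use utf8;                  # the source below contains a literal non-ASCII character
use Encode qw(encode);

my $line = '一();' . '!';

# Option 1: encode explicitly. The result is a byte string.
my $bytes = encode('UTF-8', $line);
printf "%d characters, %d bytes\n", length($line), length($bytes);

# Option 2: let the output handle do the encoding, so plain print works
# on character strings without a "Wide character" warning.
binmode(STDOUT, ':encoding(UTF-8)');
print substr($line, 1, 1), "\n";
```

The two options should not be combined on the same handle, since printing already-encoded bytes through an encoding layer encodes them a second time.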
This was fixed by a patch to t/14_charsets.t