Bug #53197 for Unicode-Normalize: NFKC("\x{2000}") produces "\x20\x05" on some perls >= 5.11.2

Mon Dec 28 18:52:06 2009 CFAERBER [...] cpan.org - Ticket created

Subject:

NFKC("\x{2000}") produces "\x20\x05" on some perls >= 5.11.2

Hi. Sometimes, NFKC("\x{2000}") produces an extra "\x05" in the output.

The problem seems to be limited to some platforms. It has been observed with Unicode::Normalize (U::N) 1.03
on an amd64 operating system; I'm not sure whether it also occurs with U::N 1.05.

Please find a test case attached.

Subject:

perl-5.11.2.t

use strict;
use utf8;
no warnings 'utf8';

use Test::More tests => 1;
use Unicode::Normalize ();

# U+2000 (EN QUAD) should normalize to a single U+0020 SPACE under NFKC.
is( Unicode::Normalize::NFKC("\x{2000}"), " ", 'NFKC of U+2000' );

Mon Dec 28 18:54:15 2009 CFAERBER [...] cpan.org - Correspondence added

I discovered the problem through the test vectors for Net::IDN::Encode/Unicode::Stringprep, so some CPAN Testers results are available here:

http://matrix.cpantesters.org/?dist=Unicode-Stringprep%201.09_70091230
http://matrix.cpantesters.org/?dist=Unicode-Stringprep%201.02

(These two are the most interesting versions; please ignore the experimental versions 1.09_2009????.)

In these tests, the problem shows up as follows (N.B. the ^E, i.e. "\x05", is not visible in the output):

#   Failed test 'Non-ASCII multibyte space character U+2000'
#   at t/nameprep_st.t line 258.
#          got: ' '
#     expected: ' '

#   Failed test 'Larger test (shrinking)'
#   at t/nameprep_st.t line 258.
#          got: 'xssi̇telǰ aΰ '
#     expected: 'xssi̇telǰ aΰ '
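Because the stray control character does not show up in the "got"/"expected" lines, it can help to print strings as code points when comparing them. A minimal sketch, assuming a hypothetical helper named codepoints (not part of Test::More or Unicode::Normalize):

```perl
use strict;
use warnings;

# Hypothetical helper: render each character of a string as U+XXXX,
# so that control characters such as ^E (U+0005) become visible.
sub codepoints {
    my ($str) = @_;
    return join ' ', map { sprintf 'U+%04X', ord($_) } split //, $str;
}

my $expected = "\x{20}";        # correct NFKC result for U+2000
my $got      = "\x{20}\x{05}";  # faulty result reported in this ticket

print "expected: ", codepoints($expected), "\n";  # expected: U+0020
print "got:      ", codepoints($got),      "\n";  # got:      U+0020 U+0005
```

Passing codepoints($got) and codepoints($expected) to is() instead of the raw strings would make the difference readable in the test report.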

Mon Dec 28 18:54:16 2009 CFAERBER [...] cpan.org - Status changed from 'new' to 'open'

Tue Dec 29 04:48:58 2009 CFAERBER [...] cpan.org - Correspondence added

The prime suspect is now the generated file lib/unicore/Decomposition.pl in bleadperl:

2000		2002
2001		2003
2002	2006	 0020 # [5]
2007		 0020
2008	200A	 0020 # [3]

Unicode::Normalize itself probably needs no fix, so I'll write a patch for perl instead.
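The ticket does not say how the table is consumed, so the following is only an illustration of one way the trailing "# [5]" annotation could end up in the output as "\x05". It assumes a hypothetical parser (parse_mapping, not the actual code in Unicode::Normalize or perl) that collects every run of hex digits from the mapping column:

```perl
use strict;
use warnings;

# HYPOTHETICAL parser, for illustration only.
# Assumed line layout: START <tab> END (may be empty) <tab> MAPPING [# comment]
sub parse_mapping {
    my ($line, $strip_comment) = @_;
    my (undef, undef, $mapping) = split /\t/, $line, 3;

    # A careful parser drops the trailing "# ..." annotation first.
    $mapping =~ s/\s*#.*//s if $strip_comment;

    # Collect every run of hex digits as a code point.
    return map { hex($_) } $mapping =~ /([0-9A-Fa-f]+)/g;
}

my $line = "2002\t2006\t 0020 # [5]";   # line taken from the excerpt above

my @naive   = parse_mapping($line, 0);
my @careful = parse_mapping($line, 1);

printf "naive:   %s\n", join ' ', map { sprintf '%04X', $_ } @naive;    # naive:   0020 0005
printf "careful: %s\n", join ' ', map { sprintf '%04X', $_ } @careful;  # careful: 0020
```

Under that assumption, the "5" inside the range-count comment is read as a second code point, which would match the observed "\x20\x05".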

Wed Jan 20 20:02:09 2010 CFAERBER [...] cpan.org - Correspondence added

This is fixed in bleadperl/5.11.4.

Wed Jan 20 20:02:10 2010 CFAERBER [...] cpan.org - Status changed from 'open' to 'resolved'