Bug #31862 for Unicode-EastAsianWidth: Makefile.PL generates invalid EastAsianWidth table

Mon Dec 24 22:04:06 2007 banb [...] cpan.org - Ticket created

Subject:

Makefile.PL generates invalid EastAsianWidth table

Dear maintainer,

Makefile.PL has two bugs, so the bundled Unicode::EastAsianWidth contains invalid East Asian Width tables.

1) A HEREDOC with <<'END' does not recognize the \t (tab) escape.
You changed the HEREDOC terminator from <<END to <<'END', so \p{...} does not get the correct character ranges.

2) The parsing regexes do not contain a line-beginning anchor.
Character ranges (e.g. 4E00..9FBB) are not recognized correctly.

The attached patch works fine for me (you will have to re-generate lib/Unicode/EastAsianWidth.pm). One test script is also attached. (I'm not familiar with all East Asian characters, so this test script contains only Japanese characters.)

Regards.
--
BANB: ITO Nobuaki
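The first point can be reproduced outside Makefile.PL. The following is a minimal sketch, not code from the distribution: the variable names and the sample range line are made up for illustration. It only shows that a single-quoted heredoc keeps \t as a backslash followed by "t", while a double-quoted heredoc turns it into a real tab.

```perl
use strict;
use warnings;

# Single-quoted heredoc: backslash escapes are left exactly as written.
my $single = <<'END';
4E00\t9FBB
END

# Double-quoted heredoc: \t is converted into a real tab character.
my $double = <<"END";
4E00\t9FBB
END

# Report whether each string contains an actual tab character.
print "single-quoted has a tab: ", ($single =~ /\t/ ? "yes" : "no"), "\n";
print "double-quoted has a tab: ", ($double =~ /\t/ ? "yes" : "no"), "\n";
```

Only the double-quoted variant reports a tab, which is presumably why the generated property subs need <<"END" for their tab-separated ranges to be read as ranges.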

Subject:

UnicodeEAW.patch

--- Makefile.PL.orig 2007-10-14 17:02:39.000000000 +0900
+++ Makefile.PL 2007-12-25 11:47:35.000000000 +0900
@@ -66,7 +66,7 @@
 my %categ;
 while (<EAW>) {
-    if (/(\w+);(\w+)/) {
+    if (/^(\w+);(\w+)/) {
         my ($code, $categ) = ($1, $2);
         if ($prev_categ ne $categ) {
             $categ{$ToFullName{$prev_categ}} .= "$prev_code\\t$prev_code_end\n" if $prev_categ;
@@ -75,7 +75,7 @@
         }
         $prev_code_end = $code;
     }
-    elsif (/(\w+)\.\.(\w+);(\w+)/) {
+    elsif (/^(\w+)\.\.(\w+);(\w+)/) {
         $categ{$ToFullName{$prev_categ}} .= "$prev_code\\t$prev_code_end\n" if $prev_categ;
         $categ{$ToFullName{$3}} .= "$1\\t$2\n";
         $prev_categ = '';
@@ -97,7 +97,7 @@
 for my $name (sort values %ToFullName) {
     $out .= << ".";
 sub $name {
-    return <<'END';
+    return <<"END";
 $categ{$name}END
 }
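The effect of the added ^ anchors can be checked in isolation. This is a small sketch under assumptions: parse_loose and parse_strict are hypothetical helpers (they do not exist in Makefile.PL), and the two sample lines only imitate the code;category layout of the data file. Because the single-code-point pattern is tried first, an unanchored version also matches the tail of a range line.

```perl
use strict;
use warnings;

# Hypothetical helper: the original, unanchored single-code-point pattern.
sub parse_loose {
    my ($line) = @_;
    return $line =~ /(\w+);(\w+)/ ? "$1 $2" : undef;
}

# Hypothetical helper: the patched pattern, anchored at the start of the line.
sub parse_strict {
    my ($line) = @_;
    return $line =~ /^(\w+);(\w+)/ ? "$1 $2" : undef;
}

# Sample lines imitating the data file: one single code point, one range.
for my $line ("0020;Na", "4E00..9FBB;W") {
    my $loose  = parse_loose($line);
    my $strict = parse_strict($line);
    printf "%-14s loose=%-10s strict=%s\n",
        $line,
        defined $loose  ? $loose  : "(no match)",
        defined $strict ? $strict : "(no match)";
}
```

For the range line, the unanchored pattern matches "9FBB;W" and treats it as a single code point, so the range branch is never reached. The anchored pattern rejects the line, which lets the elsif branch handle it as a range.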

Subject:

2-chars.t

#!/usr/bin/perl -w
use strict;
use warnings;
use Test::Simple tests => 7;
use Unicode::EastAsianWidth;

# LATIN CAPITAL LETTER B
ok("B" =~ m/\p{InEastAsianNarrow}/, "East Asian Narrow");

# FULLWIDTH LATIN CAPITAL LETTER B
ok("\x{ff22}" =~ m/\p{InEastAsianFullwidth}/, "East Asian Full-width");

# HALFWIDTH KATAKANA LETTER I
ok("\x{ff72}" =~ m/\p{InEastAsianHalfwidth}/, "East Asian Half-width");

# KATAKANA LETTER I
ok("\x{30a4}" =~ m/\p{InEastAsianWide}/, "East Asian Wide");

# KANJI EI
ok("\x{6c38}" =~ m/\p{InEastAsianWide}/, "East Asian Wide");

# ROMAN NUMERAL FOUR
ok("\x{2163}" =~ m/\p{InEastAsianAmbiguous}/, "East Asian Ambiguous");

# THAI CHARACTER PHO SAMPHAO
ok("\x{0e20}" !~ m/\p{InEastAsianHalfwidth}/, "Not East Asian");

__END__

Fri Feb 08 17:26:48 2008 cpan [...] audreyt.org - Correspondence added

Thanks, resolved in 1.30 (I missed your ticket before -- sorry!).

Fri Feb 08 17:26:50 2008 The RT System itself - Status changed from 'new' to 'open'

Fri Feb 08 17:26:51 2008 cpan [...] audreyt.org - Status changed from 'open' to 'resolved'