
This queue is for tickets about the Unicode-EastAsianWidth CPAN distribution.

Report information
The Basics
Id: 31862
Status: resolved
Priority: 0/
Queue: Unicode-EastAsianWidth

People
Owner: Nobody in particular
Requestors: banb [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Critical
Broken in: 1.10
Fixed in: (no value)



Subject: Makefile.PL generates invalid EastAsianWidth table
Dear maintainer,

Makefile.PL has two bugs, so the bundled Unicode::EastAsianWidth contains invalid East Asian Width tables.

1) HEREDOC <<'END' does not recognize the \t (tab) character
   You changed the heredoc terminator from <<END to <<'END', so the \t escapes in the generated tables are no longer turned into tabs and \p{...} does not get the correct character ranges.

2) The parsing regex does not anchor at the beginning of the line
   A character range (e.g. 4E00..9FBB) is not recognized correctly, because the unanchored single-code-point pattern matches the tail of the line first.

The attached patch works fine for me (and you have to re-generate lib/Unicode/EastAsianWidth.pm). One test script is also attached. (I'm not familiar with all East Asian characters; this test script contains only Japanese characters.)

Regards.
--
BANB: ITO Nobuaki
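The second bug can be reproduced in isolation. The following is a minimal sketch, not the actual Makefile.PL code: it applies the unanchored and anchored patterns to a single range line written in the style of EastAsianWidth.txt. The sample line reuses the 4E00..9FBB range mentioned in the report; the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Sample range line (assumed format: "START..END;CATEGORY").
my $line = "4E00..9FBB;W";

# Unanchored single-code-point pattern: it matches the tail "9FBB;W",
# so the range is mistaken for the single code point 9FBB.
my @unanchored = $line =~ /(\w+);(\w+)/;

# Anchored single-code-point pattern: "4E00" is followed by "." rather
# than ";", so the match fails and the range branch gets its turn.
my $single_matched = ($line =~ /^(\w+);(\w+)/) ? 1 : 0;

# Anchored range pattern: captures start, end and category.
my @range = $line =~ /^(\w+)\.\.(\w+);(\w+)/;

print "unanchored captures: @unanchored\n";   # 9FBB W
print "anchored single matched: $single_matched\n";   # 0
print "range captures: @range\n";   # 4E00 9FBB W
```

Because the if branch is tried before the elsif branch, the unanchored version never reaches the range pattern for lines like this one.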
Subject: UnicodeEAW.patch
--- Makefile.PL.orig	2007-10-14 17:02:39.000000000 +0900
+++ Makefile.PL	2007-12-25 11:47:35.000000000 +0900
@@ -66,7 +66,7 @@
 my %categ;
 
 while (<EAW>) {
-    if (/(\w+);(\w+)/) {
+    if (/^(\w+);(\w+)/) {
         my ($code, $categ) = ($1, $2);
         if ($prev_categ ne $categ) {
             $categ{$ToFullName{$prev_categ}} .= "$prev_code\\t$prev_code_end\n" if $prev_categ;
@@ -75,7 +75,7 @@
         }
         $prev_code_end = $code;
     }
-    elsif (/(\w+)\.\.(\w+);(\w+)/) {
+    elsif (/^(\w+)\.\.(\w+);(\w+)/) {
         $categ{$ToFullName{$prev_categ}} .= "$prev_code\\t$prev_code_end\n" if $prev_categ;
         $categ{$ToFullName{$3}} .= "$1\\t$2\n";
         $prev_categ = '';
@@ -97,7 +97,7 @@
 for my $name (sort values %ToFullName) {
     $out .= << ".";
 sub $name {
-    return <<'END';
+    return <<"END";
 $categ{$name}END
 }
 
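The last hunk of the patch concerns heredoc quoting in the generated module. As a hedged illustration of the first bug, the sketch below compares the two quoting styles on one table row and then feeds a tab-separated row to a user-defined property. The property name InDemoWide and the single-row table are hypothetical and are not part of Unicode::EastAsianWidth.

```perl
use strict;
use warnings;

# Single-quoted heredoc: escapes are not processed, so the row keeps
# a literal backslash followed by "t".
my $single = <<'END';
4E00\t9FBB
END

# Double-quoted heredoc: \t becomes a real tab character.
my $double = <<"END";
4E00\t9FBB
END

# Hypothetical user-defined property returning one tab-separated range,
# the shape of table that \p{...} expects.
sub InDemoWide {
    return <<"END";
4E00\t9FBB
END
}

printf "single-quoted contains a tab: %s\n", ($single =~ /\t/ ? "yes" : "no");
printf "double-quoted contains a tab: %s\n", ($double =~ /\t/ ? "yes" : "no");
printf "U+6C38 matches InDemoWide: %s\n",
    ("\x{6c38}" =~ /\p{InDemoWide}/ ? "yes" : "no");
```

Only the double-quoted form yields a tab between the two hex values, which is why the patch switches the generated subs back to an interpolating heredoc.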
Subject: 2-chars.t
#!/usr/bin/perl -w

use strict;
use warnings;
use Test::Simple tests => 7;
use Unicode::EastAsianWidth;

# LATIN CAPITAL LETTER B
ok("B" =~ m/\p{InEastAsianNarrow}/, "East Asian Narrow");

# FULLWIDTH LATIN CAPITAL LETTER B
ok("\x{ff22}" =~ m/\p{InEastAsianFullwidth}/, "East Asian Full-width");

# HALFWIDTH KATAKANA LETTER I
ok("\x{ff72}" =~ m/\p{InEastAsianHalfwidth}/, "East Asian Half-width");

# KATAKANA LETTER I
ok("\x{30a4}" =~ m/\p{InEastAsianWide}/, "East Asian Wide");

# KANJI EI
ok("\x{6c38}" =~ m/\p{InEastAsianWide}/, "East Asian Wide");

# ROMAN NUMERAL FOUR
ok("\x{2163}" =~ m/\p{InEastAsianAmbiguous}/, "East Asian Ambiguous");

# THAI CHARACTER PHO SAMPHAO
ok("\x{0e20}" !~ m/\p{InEastAsianHalfwidth}/, "Not East Asian");

__END__
Thanks, resolved in 1.30 (I missed your ticket before -- sorry!).