
This queue is for tickets about the Text-CharWidth CPAN distribution.

Report information
The Basics
Id: 85091
Status: new
Priority: 0/
Queue: Text-CharWidth

People
Owner: Nobody in particular
Requestors: daxim [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: 0.04
Fixed in: (no value)



Subject: mbwidth broken for many Unicode characters
Demo program:

    #!perl
    use warnings FATAL => 'all';
    use Data::Dump qw(pp);
    use Text::CharWidth qw(mbwidth);
    use Unicode::GCString qw();
    use Unicode::UCD qw(charinfo);

    for (0 .. 0x1ffff) {
        my $c = eval sprintf '"\\x{%x}"', $_;
        my $u = eval { Unicode::GCString->new($c) };
        printf "%s <http://codepoints.net/U+%s> \"%s\" mbw %d col %d charinfo %s\n",
            (mbwidth($c) == $u->columns) ? 'ok' : 'nok',
            sprintf('%x', $_),
            charinfo($_)->{name},    # hash key is missing from the ticket text; {name} is assumed
            mbwidth($c),
            $u->columns,
            pp(charinfo($_))
            if defined($u) && charinfo($_); # skip non-character codepoints
    }
    __END__

Invoked as

    perl width.pl | ack ^nok | wc -l

it shows 7623 disagreements on my installation of Perl 5.16.3 on glibc 2.17.

As mbwidth tends to return -1 for many letter characters, I trust Unicode::GCString->columns to provide the correct answer.

The documentation should at least mention this large disagreement, though in the long term it is probably better to deprecate Text::CharWidth altogether and redirect users to Unicode::GCString instead.