This queue is for tickets about the Text-ASCIITable CPAN distribution.

Report information
The Basics
Id: 64382
Status: resolved
Priority: 0/
Queue: Text-ASCIITable

People
Owner: haakon [...] _NOSPAM_loopback.no
Requestors: mathias [...] mathias-ewald.de
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Bug with Umlauts in Text::ASCIITable
Date: Sat, 1 Jan 2011 18:15:36 +0100
To: bug-Text-ASCIITable [...] rt.cpan.org
From: Mathias Ewald <mathias [...] mathias-ewald.de>
Hi,

I found a bug in Text::ASCIITable concerning German umlauts (ä, ö, ü). It seems that umlauts are not accounted for when the column width is calculated, which results in displaced column lines. See this example:

+---------+------------+-------------------------+-----+------------+------+------+
| 2009-01 | 2009-05-01 | *****, *******          |   0 | 2009-05-15 |    1 |    1 |
| 2009-02 | 2008-04-18 | ******** ****           |   0 | 2008-06-02 |    1 |    1 |
| 2009-03 | 2009-06-01 | *****, *******          |   0 | 2009-06-15 |    1 |    1 |
| 2009-04 | 2009-06-30 | ******** ****           |   0 | 2009-08-14 |    1 |    1 |
| 2009-06 | 2009-07-31 | ******** ****           |   0 | 2009-09-14 |    1 |    1 |
| 2009-07 | 2009-08-06 | *****, *******          |   0 | 2009-08-20 |    1 |    1 |
| 2009-08 | 2009-09-05 | *****, *******          |   0 | 2009-09-19 |    1 |    1 |
| 2009-09 | 2009-09-05 | *****, *******          |   0 | 2009-09-19 |    1 |    1 |
| 2009-10 | 2009-10-13 | ******** ****           |   0 | 2009-11-26 |    1 |    1 |
| 2009-11 | 2009-11-06 | ******** ****           |   0 | 2009-12-21 |    1 |    1 |
| 2009-12 | 2009-11-06 | *****, *******          |   0 | 2009-11-20 |    1 |    1 |
| 2009-13 | 2009-11-19 | ******** ****           |   0 | 2010-01-03 |    1 |    1 |
| 2009-14 | 2009-12-05 | *****, *******          |   0 | 2009-12-19 |    1 |    1 |
| 2010-01 | 2010-01-01 | *****, *******          |  19 | 2010-01-15 |    1 |    1 |
| 2010-02 | 2010-02-03 | *****, *******          |  19 | 2010-02-17 |    1 |    1 |
| 2010-03 | 2010-03-16 | *********, ******       |  19 | 2010-03-30 |    1 |    1 |
| 2010-04 | 2010-03-26 | *ö*****, ****          |  19 | 2010-05-10 |    1 |    1 |
| 2010-05 | 2010-04-24 | *ö*****, ****          |  19 | 2010-06-08 |    1 |    1 |
| 2010-06 | 2010-05-01 | *****, *******          |  19 | 2010-05-15 |    1 |    1 |
| 2010-07 | 2010-05-30 | *ö*****, ****          |  19 | 2010-07-14 |    1 |    1 |
| 2010-08 | 2010-06-10 | *ö*****, ****          |  19 | 2010-07-25 |    1 |    1 |
| 2010-09 | 2010-07-01 | *ö*****, ****          |  19 | 2010-08-15 |    1 |    1 |
| 2010-10 | 2010-06-01 | *****, *******          |  19 | 2010-06-15 |    1 |    1 |
| 2010-11 | 2010-07-01 | *****, *******          |  19 | 2010-07-15 |    1 |    1 |
| 2010-12 | 2010-07-26 | *ö***** ******* ****   |  19 | 2010-09-09 |    1 |    1 |
| 2010-13 | 2010-07-26 | *****, *******          |  19 | 2010-08-09 |    1 |    1 |
| 2010-14 | 2010-07-30 | *ö***** ******* ****   |  19 | 2010-09-13 |    1 |    1 |
| 2010-15 | 2010-08-09 | *ö***** ******* ****   |  19 | 2010-09-23 |    1 |    1 |
| 2010-16 | 2010-08-20 | *ö***** ******* ****   |  19 | 2010-10-04 |    1 |    1 |
| 2010-17 | 2010-08-26 | *ö***** ******* ****   |  19 | 2010-10-10 |    1 |    1 |
| 2010-18 | 2010-08-22 | *****, *******          |  19 | 2010-09-05 |    1 |    1 |
| 2010-19 | 2010-09-10 | *ö***** ******* ****   |  19 | 2010-10-25 |    1 |    1 |
| 2010-20 | 2010-09-24 | *ö***** ******* ****   |  19 | 2010-11-07 |    1 |    1 |
| 2010-21 | 2010-09-29 | ********* *ü* ******** |  19 | 2010-11-12 |    1 |    1 |
| 2010-22 | 2010-10-15 | *ö***** ******* ****   |  19 | 2010-11-28 |    0 |    1 |
| 2010-23 | 2010-10-25 | *****, *******          |  19 | 2010-11-07 |    1 |    1 |
| 2010-24 | 2010-10-25 | *****, *******          |  19 | 2010-11-07 |    1 |    1 |
| 2010-25 | 2010-11-05 | *** *ü******           |   0 | 2010-11-19 |    1 |    1 |
| 2010-26 | 2010-11-09 | *****, *******          |  19 | 2010-11-23 |    1 |    1 |
| 2010-27 | 2010-11-26 | *ö***** ******* ****   |  19 | 2011-01-10 |    0 |    1 |
| 2010-28 | 2010-12-03 | *ö***** ******* ****   |  19 | 2011-01-17 |    0 |    1 |
| 2010-29 | 2010-12-17 | *ö***** ******* ****   |  19 | 2011-01-31 |    0 |    0 |
'---------+------------+-------------------------+-----+------------+------+------'

I anonymized the content of this table; every * stands for a non-umlaut letter. You can see that whenever there is an umlaut in the 3rd column, the column is one character narrower than it should be.

Looking forward to your response!

cheers
Mathias Ewald

--
----------------------------------------------------------------
Mathias Ewald        | Landline: +49 911 495208941
Heerwagenstraße 29   | Mobile:   +49 151 17317864
90489 Nuremberg      | Email:    mathias@mathias-ewald.de
Germany              | Website:  http://mathias-ewald.de
                     | Skype:    mathias.ewald
----------------------------------------------------------------
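
The one-column shift is consistent with the cell text being measured in bytes rather than characters: in UTF-8, ö and ü each occupy two bytes. The following is a minimal Perl sketch of that effect, assuming the table data reaches the module as undecoded UTF-8 bytes. The sample string and the pad_to helper are hypothetical and are not part of Text::ASCIITable.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Undecoded UTF-8 bytes for "Köln": the ö is the two bytes 0xC3 0xB6.
my $bytes = "K\xC3\xB6ln";
my $chars = decode('UTF-8', $bytes);

# Hypothetical helper: left-align $text in a field of $width columns,
# trusting the caller-supplied measurement of the text.
sub pad_to {
    my ($text, $width, $measured) = @_;
    return $text . (' ' x ($width - $measured));
}

my $byte_len = length $bytes;   # length of the byte string
my $char_len = length $chars;   # length of the decoded string

# Padding derived from the byte count comes out one space short per umlaut.
my $short = pad_to($chars, 8, $byte_len);
my $right = pad_to($chars, 8, $char_len);

binmode STDOUT, ':encoding(UTF-8)';
print "|$short|\n";
print "|$right|\n";
```

The first printed cell is one column narrower than the second, which matches the displaced lines in the rows that contain an umlaut.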

Attached a patch, which makes this module utf8-aware.
Subject: patch.diff
diff --git a/lib/site_perl/5.12.3/Text/ASCIITable.pm b/lib/site_perl/5.12.3/Text/ASCIITable.pm
index 96e9bc8..7338ade 100644
--- a/lib/site_perl/5.12.3/Text/ASCIITable.pm
+++ b/lib/site_perl/5.12.3/Text/ASCIITable.pm
@@ -4,13 +4,15 @@ package Text::ASCIITable;
 @ISA=qw(Exporter);
 @EXPORT = qw();
 @EXPORT_OK = qw();
-$VERSION = '0.18';
+$VERSION = '0.18_1';
 use Exporter;
 use strict;
 use Carp;
 use Text::ASCIITable::Wrap qw{ wrap };
 use overload '@{}' => 'addrow_overload', '""' => 'drawit';

+use Encode qw( decode encode );
+
 =head1 NAME

 Text::ASCIITable - Create a nice formatted table using ASCII characters.
@@ -86,6 +88,7 @@ sub new {
   $self->{options}{alignHeadRow} = $self->{options}{alignHeadRow} || 'auto'; # default setting
   $self->{options}{undef_as} = $self->{options}{undef_as} || ''; # default setting
   $self->{options}{chaining} = $self->{options}{chaining} || 0; # default setting
+  $self->{options}{utf8} = $self->{options}{utf8} || 1; # default setting

   bless $self;

@@ -615,7 +618,9 @@ sub drawRow {
   my $contents = $start;
   for (my $i=0;$i<scalar(@{$row});$i++) {
     my $colwidth = $self->getColWidth(@{$self->{tbl_cols}}[$i]);
+
     my $text = @{$row}[$i];
+

     if ($isheader != 1 && defined($self->{tbl_align}{@{$self->{tbl_cols}}[$i]})) {
       $contents .= ' '.$self->align(
@@ -899,16 +904,17 @@ sub count {
   $str =~ s/<.+?>//g if $self->{options}{allowHTML};
   $str =~ s/\33\[(\d+(;\d+)?)?[musfwhojBCDHRJK]//g if $self->{options}{allowANSI}; # maybe i should only have allowed ESC[#;#m and not things not related to
   $str =~ s/\33\([0B]//g if $self->{options}{allowANSI}; # color/bold/underline.. But I want to give people as much room as they need.
-
+  $str = decode("utf8", $str) if $self->{options}{utf8};
+  #print "DEBUG: $str / " . length($str) . "\n" if $str =~ m/ü/;
   return length($str);
 }


 sub align {
   my ($self,$text,$dir,$length,$strict) = @_;
-
+  $text = decode("utf8", $text) if $self->{options}{utf8};
   if ($dir =~ /auto/i) {
-    if ($text =~ /^-?\d+(\.\d+)*[%\w]?$/) {
+    if ($text =~ /^-?\d+(\.\d+)*[%\w]?$/) { # TODO: Allow , instead of .
       $dir = 'right';
     } else {
       $dir = 'left';
@@ -921,23 +927,25 @@ sub align {
     return $ret;
   } elsif ($dir =~ /right/i) {
     $text = (" " x ($length - $self->count($text))).$text;
-    return substr($text,0,$length) if ($strict);
+    $text = substr($text,0,$length) if ($strict);
+    $text = encode("utf8", $text) if $self->{options}{utf8};
     return $text;
   } elsif ($dir =~ /left/i) {
     $text = $text.(" " x ($length - $self->count($text)));
-    return substr($text,0,$length) if ($strict);
+    $text = substr($text,0,$length) if ($strict);
+    $text = encode("utf8", $text) if $self->{options}{utf8};
     return $text;
   } elsif ($dir =~ /justify/i) {
+
     $text = substr($text,0,$length) if ($strict);
-    if (length($text) < $length) {
+    if ($self->_length($text) < $length) {
       $text =~ s/^\s+//; # trailing whitespace
       $text =~ s/\s+$//; # tailing whitespace

-      my @tmp = split(/\s+/,$text); # split them words
+      my @tmp = split(/\s+/, $text); # split them words

       if (scalar(@tmp)) {
-        my $extra = $length - length(join('',@tmp)); # Length of text without spaces
-
+        my $extra = $length - $self->_length(join('',@tmp)); # Length of text without spaces
         my $modulus = $extra % (scalar(@tmp)); # modulus
         $extra = int($extra / (scalar(@tmp))); # for each word

@@ -965,6 +973,16 @@ sub align {
   }
 }

+sub _length {
+  my $self = shift;
+  my $str = shift;
+  if ( $self->{options}{utf8} ) {
+    return length( decode("utf8", $str) );
+  } else {
+    return length($str);
+  }
+}
+
 sub TIEARRAY {
   my $self = shift;

On Wed Nov 16 08:50:01 2011, LANTI wrote:
> Attached a patch, which makes this module utf8-aware.
Hi, Thanks for your patch, but instead of all the work in your patch, the only function that needs to be changed is the count() function. I will upload 0.19 today, which will have UTF-8 awareness added.
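
In the left and right alignment branches shown in the patch context, the padding is computed from count(), so measuring characters instead of bytes in that one place is enough to fix the column width there. The following is a rough Perl sketch of such a measurement. It is not the code shipped in 0.19: the count_chars function and its utf8 argument are hypothetical stand-ins for the module's count() method and option.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical stand-in for a UTF-8-aware count(); the name and the
# utf8 argument are assumptions made for this illustration.
sub count_chars {
    my ($str, %opt) = @_;
    return 0 unless defined $str;

    # Decode only strings that Perl does not already hold as characters,
    # so text that was decoded earlier is not decoded a second time.
    if ($opt{utf8} && !utf8::is_utf8($str)) {
        $str = decode('UTF-8', $str);
    }
    return length $str;
}

my $raw = "K\xC3\xB6ln";                            # undecoded UTF-8 bytes
my $n_bytes   = count_chars($raw);                  # measured as bytes
my $n_chars   = count_chars($raw, utf8 => 1);       # measured as characters
my $n_decoded = count_chars(decode('UTF-8', $raw), utf8 => 1);

print "$n_bytes $n_chars $n_decoded\n";
```

Because only the measurement is decoded, the cell text itself passes through unchanged, so no matching encode step is needed on output.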