
This queue is for tickets about the TAP-Formatter-JUnit CPAN distribution.

Report information
The Basics
Id: 112446
Status: open
Priority: 0/
Queue: TAP-Formatter-JUnit

People
Owner: Nobody in particular
Requestors: victor [...] vsespb.ru
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: 0.11
Fixed in: (no value)



Subject: Does not work with non-ascii output
i.e. when tests have non-ascii messages like:

ok 1, "привет"; # Russian message

attaching patch (for version 0.11).
Subject: unicode.patch
diff --git a/lib/TAP/Formatter/JUnit.pm b/lib/TAP/Formatter/JUnit.pm
index 30436a2..be611e3 100644
--- a/lib/TAP/Formatter/JUnit.pm
+++ b/lib/TAP/Formatter/JUnit.pm
@@ -7,6 +7,7 @@ extends qw(
 );
 
 use XML::Generator;
+use Encode;
 use TAP::Formatter::JUnit::Session;
 
 our $VERSION = '0.11';
@@ -62,7 +63,7 @@ sub summary {
     return if $self->silent();
 
     my @suites = @{$self->testsuites};
-    print { $self->stdout } $self->xml->testsuites( @suites );
+    print { $self->stdout } Encode::encode("UTF-8", $self->xml->testsuites( @suites ));
 }
 
 1;
diff --git a/lib/TAP/Formatter/JUnit/Session.pm b/lib/TAP/Formatter/JUnit/Session.pm
index fb7dca3..6ad90ee 100644
--- a/lib/TAP/Formatter/JUnit/Session.pm
+++ b/lib/TAP/Formatter/JUnit/Session.pm
@@ -6,6 +6,7 @@ extends qw(
     TAP::Formatter::Console::Session
 );
 
+use Encode;
 use Storable qw(dclone);
 use File::Path qw(mkpath);
 use IO::File;
@@ -72,6 +73,7 @@ sub result {
             'time' => $self->get_time,
             'result' => $result,
         );
+        $wrapped->{result}{$_} = Encode::decode("UTF-8", $wrapped->{result}{$_}) for (qw/description raw/);
         $self->_queue_add($wrapped);
     }
 }
@@ -361,8 +363,6 @@ sub _squeaky_clean {
     my $string = shift;
     # control characters (except CR and LF)
     $string =~ s/([\x00-\x09\x0b\x0c\x0e-\x1f])/"^".chr(ord($1)+64)/ge;
-    # high-byte characters
-    $string =~ s/([\x7f-\xff])/'[\\x'.sprintf('%02x',ord($1)).']'/ge;
     return $string;
 }
 
On 2016-02-26 03:05:24, vsespb wrote:
> i.e. when tests have non-ascii messages like:
>
> ok 1, "привет"; # Russian message
>
> attaching patch (for version 0.11).
I wonder if a better solution would be to use an encoding layer on stdout.

Also, what if the current terminal is not a utf8 one? Probably one should look at the current locale to decide whether to output utf8 or something different.

Also this removal does not look right:

- $string =~ s/([\x7f-\xff])/'[\\x'.sprintf('%02x',ord($1)).']'/ge;

There are control characters in the range \x80-\x9f which should be handled.
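To make the point about \x80-\x9f concrete, here is a minimal sketch of a cleaner that keeps the C0 handling from the patch context, still escapes DEL and the C1 control range, and leaves other non-ASCII characters alone. It assumes the string has already been decoded to characters (as the patch does with Encode::decode); the name squeaky_clean_chars is hypothetical and not part of TAP::Formatter::JUnit.

```perl
use strict;
use warnings;

# Hypothetical variant of _squeaky_clean for decoded (character) strings.
sub squeaky_clean_chars {
    my $string = shift;
    # C0 control characters (except CR and LF), same rule as the original
    $string =~ s/([\x00-\x09\x0b\x0c\x0e-\x1f])/"^".chr(ord($1)+64)/ge;
    # DEL and the C1 control range only; Cyrillic etc. pass through untouched
    $string =~ s/([\x7f-\x9f])/'[\\x'.sprintf('%02x',ord($1)).']'/ge;
    return $string;
}
```

On raw bytes this would be wrong, because the bytes \x80-\x9f also occur inside valid UTF-8 sequences; that is why the sketch assumes decoding happens first.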
> I wonder if a better solution would be to use an encoding layer on stdout.
Okay. The bug persists (and the fix fixes it) for any valid test which prints UTF-8 to the console, i.e. for this too:

===
use utf8;
use Test::More;
use Test::Builder ();
binmode $_, ':encoding(UTF-8)' for map { Test::Builder->new->$_ } qw(output failure_output);
ok 1, "привет"; # Russian message
done_testing;
===
> Also, what if the current terminal is not a utf8 one?
Yes, I agree here. This is only for UTF-8 (unless, maybe, one considers that XML with a UTF-8 header is not text but binary data, and should not be printed as text at all, i.e. should be printed to a UTF-8 console only).
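For the locale question, a rough sketch of how the output encoding could be chosen is below. It uses the core modules POSIX, I18N::Langinfo and Encode; the helper name output_encoding_name and the fallback to UTF-8 are assumptions made for illustration, not anything the formatter does today.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);
use I18N::Langinfo qw(langinfo CODESET);
use Encode ();

# Hypothetical helper: name of an Encode encoding matching the current
# locale, falling back to strict UTF-8 when the codeset is not recognised.
sub output_encoding_name {
    setlocale(LC_CTYPE, '');              # adopt the locale from the environment
    my $codeset = langinfo(CODESET);      # e.g. "UTF-8" or "ISO-8859-1"
    my $enc = (defined $codeset && length $codeset)
        ? Encode::find_encoding($codeset)
        : undef;
    return $enc ? $enc->name : 'utf-8-strict';
}
```

If something like this were used, the encoding declared in the XML header would have to match whatever name is returned.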
> There are control characters in the range \x80-\x9f which should be handled.
Indeed! OK, let's think of this patch as a proof of the bug and an example of a fix.

On Sun Feb 28 21:50:13 2016, SREZIC wrote:
> On 2016-02-26 03:05:24, vsespb wrote:
> > i.e. when tests have non-ascii messages like:
> >
> > ok 1, "привет"; # Russian message
> >
> > attaching patch (for version 0.11).
>
> I wonder if a better solution would be to use an encoding layer on
> stdout.
>
> Also, what if the current terminal is not a utf8 one? Probably one
> should look at the current locale to decide whether to output utf8 or
> something different.
>
> Also this removal does not look right:
>
> - $string =~ s/([\x7f-\xff])/'[\\x'.sprintf('%02x',ord($1)).']'/ge;
>
> There are control characters in the range \x80-\x9f which should be
> handled.
> I wonder if a better solution would be to use an encoding layer on stdout.
If you meant using binmode STDOUT, ":utf8" instead of

print { $self->stdout } Encode::encode("UTF-8", $self->xml->testsuites( @suites ));

then I think not: it's better when _only_ the main program (the .pl script, not a .pm) changes STDOUT layers, otherwise it's a mess.

And btw, even if we do (as in the patch)

===
print { $self->stdout } Encode::encode("UTF-8", $self->xml->testsuites( @suites ));
===

this won't work if the main program has already done

===
binmode STDOUT, ":utf8";
===

If the main program is "prove", it does not do that, so we assume no one does. And we should not change the layer, as that could break "prove".
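The double-encoding concern could be avoided without touching the handle's layers by checking them first. The following is only a sketch: handle_encodes and print_xml are hypothetical names, and it relies on the built-in PerlIO::get_layers reporting "utf8" or "encoding(...)" for handles that already encode on output.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: true if the handle already has an encoding layer.
sub handle_encodes {
    my ($fh) = @_;
    return scalar grep { /^(?:utf8|encoding)/ }
        PerlIO::get_layers($fh, output => 1);
}

# Hypothetical helper: print character data as UTF-8 exactly once,
# whether or not the caller has already put a layer on the handle.
sub print_xml {
    my ($fh, $xml) = @_;
    print {$fh} handle_encodes($fh) ? $xml : Encode::encode('UTF-8', $xml);
}
```

This only reads the layers and never changes them, so a main program such as "prove" keeps whatever it set up. It does not cover handles whose layer encodes to something other than UTF-8.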
On 2016-02-28 14:02:27, vsespb wrote:
> > I wonder if a better solution would be to use an encoding layer on
> > stdout.
>
> Okay. The bug persists (and fix fixes it) for any valid test, which
> prints UTF-8 to console.
>
> i.e. for this too:
>
> ===
> use utf8;
> use Test::More;
> use Test::Builder ();
> binmode $_, ':encoding(UTF-8)' for map { Test::Builder->new->$_ }
> qw(output failure_output);
> ok 1, "привет"; # Russian message
> done_testing;
> ===
Maybe this kind of problem will be addressed in the refactored Test::More/Test::Builder --- see https://metacpan.org/release/Test2