This queue is for tickets about the Text-Diff CPAN distribution.

Report information
The Basics
Id: 54214
Status: resolved
Priority: 0/
Queue: Text-Diff

People
Owner: Nobody in particular
Requestors: SHLOMIF [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 1.37
Fixed in: (no value)



Subject: Test-Differences Always Outputs UTF-8 text as ugly \x{....} escapes
When running the following test script (also attached):

<<<
#!/usr/bin/perl

use strict;
use warnings;

use utf8;

use Test::More tests => 1;
use Test::Differences;

# So we can output the text from the tests as UTF-8
binmode(STDOUT, ":encoding(utf-8)");
binmode(STDERR, ":encoding(utf-8)");

# TEST
eq_or_diff(
    <<"EOF",
Hello
שלוש
EOF
    <<"EOF",
Hello
שלום
EOF
);
>>>
I'm getting the following output on the console:

<<<<<<<<<<<<
test-diff.t .. 1/1
#   Failed test at test-diff.t line 16.
# +---+----------------------------------+----------------------------------+
# | Ln|Got                               |Expected                          |
# +---+----------------------------------+----------------------------------+
# |  1|Hello                             |Hello                             |
# *  2|\x{05e9}\x{05dc}\x{05d5}\x{05e9}  |\x{05e9}\x{05dc}\x{05d5}\x{05dd}  *
# +---+----------------------------------+----------------------------------+
# Looks like you failed 1 test of 1.
test-diff.t .. Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/1 subtests

Test Summary Report
-------------------
test-diff.t (Wstat: 256 Tests: 1 Failed: 1)
  Failed test:  1
  Non-zero exit status: 1
Files=1, Tests=1,  1 wallclock secs ( 0.07 usr  0.02 sys +  0.13 cusr  0.02 csys =  0.24 CPU)
Result: FAIL
>>>>>>>>>>>>
I'd like to get rid of this ugly \x{...} and see the actual Hebrew characters. Please fix it appropriately.

Regards,

-- Shlomi Fish
Subject: test-diff.t
#!/usr/bin/perl

use strict;
use warnings;

use utf8;

use Test::More tests => 1;
use Test::Differences;

# So we can output the text from the tests as UTF-8
binmode(STDOUT, ":encoding(utf-8)");
binmode(STDERR, ":encoding(utf-8)");

# TEST
eq_or_diff(
    <<"EOF",
Hello
שלוש
EOF
    <<"EOF",
Hello
שלום
EOF
);
Hi Shlomi,

Thank you for the bug report. I hate to ask, but is there any chance you can supply a patch? I just don't know enough about utf8 to be sure of the best way to resolve this.

Plus, displaying as utf8 characters needs to be optional. Some utf8 characters look very similar on various fonts, so people need to be able to see both the utf8 character and the \x{} version.

Cheers,
Ovid
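To illustrate the point about look-alike characters, here is a minimal sketch in plain Perl. The helper name codepoints() is hypothetical and is not part of Test::Differences or Text::Diff; it only shows why the \x{} form can disambiguate two strings that render the same in many fonts.

```perl
use strict;
use warnings;

# Hypothetical helper: render each character of a string as a \x{HHHH} escape.
sub codepoints {
    my ($text) = @_;
    return join '', map { sprintf( '\x{%04x}', ord($_) ) } split //, $text;
}

my $latin    = "a";           # U+0061 LATIN SMALL LETTER A
my $cyrillic = "\x{0430}";    # U+0430 CYRILLIC SMALL LETTER A

# Print each character next to its escaped form, encoded as UTF-8.
binmode( STDOUT, ":encoding(UTF-8)" );
printf "%s => %s\n", $_, codepoints($_) for $latin, $cyrillic;
```

The two characters look alike on screen, but their escaped forms differ, which is the case for keeping the escaped output available.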
On Wed Feb 03 03:29:51 2010, OVID wrote:
> Hi Shlomi,
>
> Thank you for the bug report. I hate to ask, but is there any chance you
> can supply a patch? I just don't know enough about utf8 to be sure of the
> best way to resolve this.

I can try to supply a patch, but I need some guidance - where exactly does Test-Differences encode the special characters as \x{...}? I couldn't find it from a cursory glance.

> Plus, displaying as utf8 characters needs to be optional. Some utf8
> characters look very similar on various fonts, so people need to be able to
> see both the utf8 character and the \x{} version.

Sure, I will base it on an environment variable.

Regards,

-- Shlomi Fish

> Cheers,
> Ovid
Hi!

Any news about it? That was over a month ago.

Regards,

-- Shlomi Fish

On Wed Feb 03 06:48:14 2010, SHLOMIF wrote:
> On Wed Feb 03 03:29:51 2010, OVID wrote:
> > Hi Shlomi,
> >
> > Thank you for the bug report. I hate to ask, but is there any chance you
> > can supply a patch? I just don't know enough about utf8 to be sure of the
> > best way to resolve this.
>
> I can try to supply a patch, but I need some guidance - where exactly
> does Test-Differences encode the special characters as \x{...}? I
> couldn't find it from a cursory glance.
>
> > Plus, displaying as utf8 characters needs to be optional. Some utf8
> > characters look very similar on various fonts, so people need to be able to
> > see both the utf8 character and the \x{} version.
>
> Sure, I will base it on an environment variable.
>
> Regards,
>
> -- Shlomi Fish
>
> > Cheers,
> > Ovid
Subject: Re: [rt.cpan.org #54214] Test-Differences Always Outputs UTF-8 text as ugly \x{....} escapes
Date: Sat, 27 Mar 2010 04:45:51 -0700 (PDT)
To: bug-Text-Diff [...] rt.cpan.org
From: Ovid <curtis_ovid_poe [...] yahoo.com>
--- On Sat, 27/3/10, Shlomi Fish via RT <bug-Text-Diff@rt.cpan.org> wrote:
> From: Shlomi Fish via RT <bug-Text-Diff@rt.cpan.org>
>
> Hi!
>
> Any news about it? That was over a month ago.
Hi Shlomi,

Sorry about this. I've got my hands seriously full with another project and work, so this has been fairly low priority. I'd be happy to apply patches, though.

Cheers,
Ovid
--
Buy the book - http://www.oreilly.com/catalog/perlhks/
Tech blog - http://blogs.perl.org/users/ovid/
Twitter - http://twitter.com/OvidPerl
Official Perl 6 Wiki - http://www.perlfoundation.org/perl6
Hi all!

I found a workaround for this bug. If you add the following after "use Test::Differences;", then everything works fine:

<<<
use Text::Diff::Table;

sub Text::Diff::Table::escape($) {
    return shift;
}
>>>

I could write something that toggles it based on an environment variable and a Perl global variable, but this seems to work for me.

Regards,

-- Shlomi Fish
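The workaround above replaces escape() for the whole program. A scoped variant of the same monkey-patching technique can be sketched with local on the typeglob, so the override only lasts for one call. In this sketch Demo::Table is a hypothetical stand-in package with a deliberately simplified escape(), used so the example runs without Text::Diff installed; it is not the real Text::Diff::Table code, and render_raw() is likewise an illustrative name.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Text::Diff::Table (simplified, for illustration).
package Demo::Table;

sub escape($) {
    my $text = shift;
    # Escape every non-ASCII character as \x{HHHH}.
    $text =~ s/([^\x00-\x7f])/sprintf("\\x{%04x}", ord $1)/ge;
    return $text;
}

sub render {
    my ($text) = @_;
    # escape() is looked up through the symbol table when called,
    # so a temporary override is picked up here.
    return escape($text);
}

package main;

sub render_raw {
    my ($text) = @_;
    no warnings 'redefine';
    # Replace escape() only until this sub returns; local restores it.
    local *Demo::Table::escape = sub ($) { return shift };
    return Demo::Table::render($text);
}

my $word          = "\x{05e9}\x{05dc}\x{05d5}\x{05dd}";
my $escaped       = Demo::Table::render($word);    # default behaviour
my $raw           = render_raw($word);             # override active
my $escaped_again = Demo::Table::render($word);    # override gone again

binmode( STDOUT, ":encoding(UTF-8)" );
print "$escaped\n$raw\n$escaped_again\n";
```

Limiting the override to a dynamic scope keeps the escaped form as the default, which matches the request earlier in the ticket that raw output be optional.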
Hi!

Here is the patch against the svn trunk. It includes some rudimentary tests. Please apply it.

I hereby disclaim any ownership of the changes and/or license them under Public Domain / CC-Zero / MIT/X11.

Regards,

-- Shlomi Fish
Subject: 54214-unicode-output.diff
Index: t/unicode.t
===================================================================
--- t/unicode.t (revision 0)
+++ t/unicode.t (revision 0)
@@ -0,0 +1,58 @@
+#!/usr/bin/perl
+
+use strict;
+
+BEGIN
+{
+    $ENV{'DIFF_OUTPUT_UNICODE'} = 1;
+}
+
+use Test::More;
+use Text::Diff;
+
+eval "use Encode;";
+
+if ($@)
+{
+    plan skip_all => "No utf8.";
+}
+else
+{
+    plan tests => 3;
+}
+
+sub u
+{
+    return decode("utf-8", shift);
+}
+
+sub ind
+{
+    my $s = u(shift(@_));
+    return index( diff( \(u("שלום"), u("שלוש")), { STYLE => "Table" } ), $s );
+}
+
+# TEST
+ok (
+    (ind("ש") >= 0),
+    "Output in unicode."
+);
+
+{
+    local $Text::Diff::Config::Output_Unicode = 0;
+    # To settle use warnings;
+    $Text::Diff::Config::Output_Unicode = 0+0;
+
+    # TEST
+    ok (
+        (ind ("\\x{05e9}" ) >= 0),
+        "Output not in unicode."
+    );
+
+    # TEST
+    ok (
+        (ind( "ש" ) < 0 ),
+        "Output not in unicode - no unicode char found."
+    );
+}
+
Index: lib/Text/Diff/Config.pm
===================================================================
--- lib/Text/Diff/Config.pm (revision 0)
+++ lib/Text/Diff/Config.pm (revision 0)
@@ -0,0 +1,70 @@
+package Text::Diff::Config;
+
+use strict;
+use warnings;
+
+use vars qw($Output_Unicode);
+
+BEGIN
+{
+    $Output_Unicode = $ENV{'DIFF_OUTPUT_UNICODE'};
+}
+
+1;
+
+__END__
+
+=pod
+
+=head1 NAME
+
+Text::Diff::Config - global configuration for Text::Diff (as a
+separate module).
+
+=head1 SYNOPSIS
+
+    use Text::Diff::Config;
+
+    $Text::Diff::Config::Output_Unicode = 1;
+
+=head1 DESCRIPTION
+
+This module configures Text::Diff and its related modules. Currently it contains
+only one global variable $Text::Diff::Config::Output_Unicode which is a boolean
+flag, that if set outputs unicode characters as themselves without escaping them
+as C< \x{HHHH} > first.
+
+It is initialized to the value of C< $ENV{DIFF_OUTPUT_UNICODE} >, but can be
+set to a different value at run-time, including using local.
+
+=head1 AUTHOR
+
+Shlomi Fish, L<http://www.shlomifish.org/> .
+
+=head1 LICENSE
+
+Copyright 2010, Shlomi Fish.
+
+This file is licensed under the MIT/X11 License:
+L<http://www.opensource.org/licenses/mit-license.php>.
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.
+
+=cut
+

Property changes on: lib/Text/Diff/Config.pm
___________________________________________________________________
Added: svn:eol-style
   + native

Index: lib/Text/Diff/Table.pm
===================================================================
--- lib/Text/Diff/Table.pm (revision 11778)
+++ lib/Text/Diff/Table.pm (working copy)
@@ -3,6 +3,8 @@
 use 5.00503;
 use strict;
 use Carp;
+use Text::Diff::Config;
+
 use vars qw{$VERSION @ISA @EXPORT_OK};
 BEGIN {
     $VERSION = '1.37';
@@ -59,10 +61,13 @@
 sub escape($) {
     use utf8;
     join "", map {
+        my $c = $_;
         $_ = ord;
         exists $escapes{$_}
             ? $escapes{$_}
-            : sprintf( "\\x{%04x}", $_ );
+            : $Text::Diff::Config::Output_Unicode
+            ? $c
+            : sprintf( "\\x{%04x}", $_ );
     } split //, shift;
 }
 
@@ -378,6 +383,13 @@
 Whether or not line 3 should have that tab character escaped is a judgement
 call; so far I'm choosing not to.
 
+=head1 UNICODE
+
+To output the raw unicode chracters consult the documentation of
+L<Text::Diff::Config>. You can set the C<DIFF_OUTPUT_UNICODE> environment
+variable to 1 to output it from the command line. For more information,
+consult this bug: L<https://rt.cpan.org/Ticket/Display.html?id=54214> .
+
 =head1 LIMITATIONS
 
 Table formatting requires buffering the entire diff in memory in order to
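The heart of the patch is the extra branch in escape() that consults a global flag. A minimal standalone sketch of that logic follows. It assumes a simplified escape table (printable ASCII only, unlike the real module's table), and Demo::Config and Demo::Escape are hypothetical packages standing in for Text::Diff::Config and Text::Diff::Table so the example runs with core Perl alone.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Text::Diff::Config: the flag defaults to the
# DIFF_OUTPUT_UNICODE environment variable, as in the patch.
package Demo::Config;
our $Output_Unicode = $ENV{DIFF_OUTPUT_UNICODE};

# Hypothetical stand-in for the escape() logic in Text::Diff::Table.
package Demo::Escape;

# Simplified table (an assumption): printable ASCII maps to itself.
my %escapes = map { ( $_ => chr $_ ) } 32 .. 126;

sub escape {
    my ($text) = @_;
    return join '', map {
        my $code = ord $_;
        exists $escapes{$code}            ? $escapes{$code}
          : $Demo::Config::Output_Unicode ? $_    # flag set: keep the character
          : sprintf( '\x{%04x}', $code );         # flag unset: escape it
    } split //, $text;
}

package main;

my $word = "\x{05e9}\x{05dc}\x{05d5}\x{05dd}";

# The flag can be toggled per scope with local, as the patch's POD suggests.
my $escaped = do {
    local $Demo::Config::Output_Unicode = 0;
    Demo::Escape::escape("Hi $word");
};
my $raw = do {
    local $Demo::Config::Output_Unicode = 1;
    Demo::Escape::escape("Hi $word");
};

binmode( STDOUT, ":encoding(UTF-8)" );
print "$escaped\n$raw\n";
```

Because the flag is an ordinary package variable, a test file can set it once in a BEGIN block through the environment, or flip it temporarily with local, which is how t/unicode.t in the patch exercises both modes.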
On Sat Mar 27 09:42:35 2010, SHLOMIF wrote:
> Hi!
>
> Here is the patch against the svn trunk. It includes some rudimentary
> tests. Please apply it.
>
> I hereby disclaim any ownership of the changes and/or license them under
> Public Domain / CC-Zero / MIT/X11.
>

Hi!

Please apply the patch.

Regards,

-- Shlomi Fish

> Regards,
>
> -- Shlomi Fish
Hi Shlomi,

I've applied the patch and uploaded it. It should hit the CPAN soon and I'm adding it to the documentation of Test::Differences. Sorry it took so long to get to.

Cheers,
Curtis