
This queue is for tickets about the Encode CPAN distribution.

Report information
The Basics
Id: 61456
Status: resolved
Priority: 0/
Queue: Encode

People
Owner: Nobody in particular
Requestors: dwheeler [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Encode 2.40's decode() Looses Data
Date: Sun, 19 Sep 2010 00:52:33 -0700
To: bug-encode [...] rt.cpan.org
From: "David E. Wheeler" <dwheeler [...] cpan.org>
With Encode 2.40, this script:

    use Encode;
    my @foo = ('Some UTF-8: ö');
    for (@foo) {
        print STDERR "Before: $_\n";
        Encode::decode_utf8($_, Encode::FB_CROAK());
        print STDERR "After: $_\n";
    }

Outputs:

    Before: Some UTF-8: ö
    After:

Note how decode_utf8() has blown away the value. This did not happen with Encode 2.39. I just downgraded and now the script properly outputs:

    Before: Some UTF-8: ö
    After: Some UTF-8: ö

Thanks,

David
DWHEELER,

That's the way it's supposed to be. When $check is set and $check & Encode::LEAVE_SRC is not true, it modifies its argument so you can track what is causing the error.

    use Encode;
    my @foo = ('Some UTF-8: ö');
    for (@foo) {
        print STDERR qq(Before: $_ eq '$_'\n);
        my $decoded = Encode::decode_utf8( $_, Encode::FB_CROAK | Encode::LEAVE_SRC );
        print STDERR qq(After: $_ eq '$_'\n);
        print STDERR encode_utf8 qq(\$decoded eq '$decoded'\n);
    }

In other words, decode_utf8 was buggy till 2.40.

Dan the Maintainer Thereof

On Sun Sep 19 03:52:43 2010, DWHEELER wrote:
> With Encode 2.40, this script:
>
> use Encode;
> my @foo = ('Some UTF-8: ö');
> for (@foo) {
> print STDERR "Before: $_\n";
> Encode::decode_utf8($_, Encode::FB_CROAK());
> print STDERR "After: $_\n";
> }
>
> Outputs:
>
> Before: Some UTF-8: ö
> After:
>
> Note how decode_utf8() has blown away the value. This did not happen
> with Encode 2.39. I just downgraded and now the script properly
> outputs:
>
> Before: Some UTF-8: ö
> After: Some UTF-8: ö
>
> Thanks,
>
> David
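The behaviour described in the reply above can be sketched in three small cases. This is only an illustration, assuming an Encode release from 2.40 onwards; the variable names are made up for the example, and the byte string "\xC3\xB6" is the UTF-8 encoding of "ö".

```perl
use strict;
use warnings;
use Encode ();

# 1. Valid input, CHECK set, LEAVE_SRC not set: the whole source is
#    consumed, which corresponds to the empty "After:" line in the report.
my $consumed = "Some UTF-8: \xC3\xB6";
my $text     = Encode::decode('UTF-8', $consumed, Encode::FB_CROAK());

# 2. Same input with LEAVE_SRC added: the source bytes are left alone.
my $kept  = "Some UTF-8: \xC3\xB6";
my $again = Encode::decode('UTF-8', $kept,
    Encode::FB_CROAK() | Encode::LEAVE_SRC());

# 3. Malformed input with FB_QUIET: decoding stops at the bad byte (0xFF
#    is never valid in UTF-8) and the source keeps the unprocessed rest,
#    so the caller can see where the problem starts.
my $broken = "ab\xFFcd";
my $prefix = Encode::decode('UTF-8', $broken, Encode::FB_QUIET());

printf STDERR "case 1: %d byte(s) left in source\n", length $consumed;
printf STDERR "case 2: %d byte(s) left in source\n", length $kept;
printf STDERR "case 3: decoded %d char(s), %d byte(s) left in source\n",
    length $prefix, length $broken;
```

Case 3 is the reason the argument is modified in the first place: what remains in the source is the part that could not be decoded.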
Subject: Re: [rt.cpan.org #61456] Encode 2.40's decode() Looses Data
Date: Mon, 20 Sep 2010 09:47:03 -0700
To: bug-Encode [...] rt.cpan.org
From: "David E. Wheeler" <dwheeler [...] cpan.org>
On Sep 19, 2010, at 4:08 AM, Dan Kogai via RT wrote:
> In other words, decode_utf8 was buggy till 2.40.
Okay. Adding the Encode::LEAVE_SRC partially fixes the issue in HTTP::Message, where I first ran into this issue. Here's the pull request I sent to Gisle:

http://github.com/gisle/libwww-perl/pull/2

Any idea what else Gisle might need to change to get back to detecting the proper encoding?

Thanks,

David
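For context, a trial decode is one common way to test whether a byte string is valid UTF-8 without disturbing it. The sketch below is not the HTTP::Message code and not the content of either pull request; `is_valid_utf8` is a hypothetical helper written for this illustration.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: returns 1 if $bytes decodes cleanly as UTF-8,
# 0 otherwise. FB_CROAK makes decode() die on malformed input, and
# LEAVE_SRC keeps decode() from overwriting the string it is given.
sub is_valid_utf8 {
    my ($bytes) = @_;
    my $ok = eval {
        Encode::decode('UTF-8', $bytes,
            Encode::FB_CROAK() | Encode::LEAVE_SRC());
        1;
    };
    return $ok ? 1 : 0;
}

my $utf8_bytes   = "Some UTF-8: \xC3\xB6";   # "ö" encoded as UTF-8
my $latin1_bytes = "Some Latin-1: \xF6";     # "ö" encoded as ISO-8859-1

printf STDERR "utf8 sample valid:   %d\n", is_valid_utf8($utf8_bytes);
printf STDERR "latin1 sample valid: %d\n", is_valid_utf8($latin1_bytes);
```

Because the helper copies its argument into `$bytes`, the caller's data would survive even without LEAVE_SRC; the flag matters when the string is passed to decode() directly, as in the loop over `$_` in the original report.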
Subject: Re: [rt.cpan.org #61456] Encode 2.40's decode() Looses Data
Date: Mon, 20 Sep 2010 10:08:47 -0700
To: bug-Encode [...] rt.cpan.org
From: "David E. Wheeler" <david [...] kineticode.com>
On Sep 20, 2010, at 9:47 AM, David E. Wheeler wrote:
> On Sep 19, 2010, at 4:08 AM, Dan Kogai via RT wrote:
>
>> In other words, decode_utf8 was buggy till 2.40.
>
> Okay. Adding the Encode::LEAVE_SRC partially fixes the issue in HTTP::Message, where I first ran into this issue. Here's the pull request I sent to Gisle:
>
> http://github.com/gisle/libwww-perl/pull/2
>
> Any idea what else Gisle might need to change to get back to detecting the proper encoding?
Oops, my bad. I fixed it and sent a new pull request to Gisle here:

http://github.com/gisle/libwww-perl/pull/3

Thanks!

David
I've applied the patch and uploaded libwww-perl-5.837 to CPAN.
Subject: Re: [rt.cpan.org #61456] Encode 2.40's decode() Looses Data
Date: Mon, 20 Sep 2010 14:34:37 -0700
To: bug-Encode [...] rt.cpan.org
From: "David E. Wheeler" <dwheeler [...] cpan.org>
On Sep 20, 2010, at 2:27 PM, Gisle_Aas via RT wrote:
> I've applied the patch and uploaded libwww-perl-5.837 to CPAN.
Awesome, thanks Gisle!

David