Bug #91569 for Encode: [PATCH] decode_utf8 and non-PVs

Sat Dec 21 20:57:44 2013 $_ = 'spro^^*%*^6ut# [...] &$%*c>#!^!#&!pan.org'; y/a-z.@//cd; print - Ticket created

Subject:

[PATCH] decode_utf8 and non-PVs

$ perl -lMblib -e ' use Encode; print decode_utf8("*main::foo")'
*main::foo
$ perl -lMblib -e ' use Encode; print decode_utf8(*foo)'

The second one-liner prints nothing.

Under a different perl version I get gibberish:

$ ./perl -lIlib -e ' use Encode; print decode_utf8(*foo)'
Wide character in print at -e line 1.
*clone_encoding8�@X��
FB_CONSTS@��X�(��X�8a�X��X�B��AX��c�X�p��X�B��AX��X�HY@X��X�HY@X�P��X�HY@X�|

The input should be forcibly stringified before using SvCUR and SvEND. See the attached patch.

(This also makes it handle copy-on-write correctly, which was how I encountered this. I am currently working on a debug mode for perl that changes COW buffer violations into crashes.)
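A minimal sketch of why a guard of the form "if ref $octets" does not catch typeglobs, using core Perl only. The variable names and the use of *STDOUT are illustrative and not taken from Encode itself:

```perl
use strict;
use warnings;

# A typeglob copied into a scalar is not a reference, so ref() is false
# and a guard such as "$octets .= '' if ref $octets" never fires.
my $octets = *STDOUT;
my $is_ref = ref($octets) ? 1 : 0;   # 0: a glob is not a reference
my $before = ref(\$octets);          # 'GLOB': the scalar still holds a glob

# Unconditional concatenation forces the glob to be stringified.
$octets .= '';
my $still_glob = ref(\$octets) eq 'GLOB' ? 1 : 0;

print "is_ref=$is_ref before=$before still_glob=$still_glob value=$octets\n";
```

After the concatenation the scalar holds the ordinary string "*main::STDOUT", which is the form the report says the code should see before SvCUR and SvEND are used.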

Subject:

patch.txt

diff -rup Encode-2.55-uZLdOK-orig/Encode.pm Encode-2.55-uZLdOK/Encode.pm
--- Encode-2.55-uZLdOK-orig/Encode.pm 2013-09-14 00:52:18.000000000 -0700
+++ Encode-2.55-uZLdOK/Encode.pm 2013-12-21 17:53:01.000000000 -0800
@@ -209,7 +209,7 @@ my $utf8enc;
 sub decode_utf8($;$) {
     my ( $octets, $check ) = @_;
     return undef unless defined $octets;
-    $octets .= '' if ref $octets;
+    $octets .= '';
     $check ||= 0;
     $utf8enc ||= find_encoding('utf8');

diff -rup Encode-2.55-uZLdOK-orig/t/Encode.t Encode-2.55-uZLdOK/t/Encode.t
--- Encode-2.55-uZLdOK-orig/t/Encode.t 2013-12-21 17:51:53.000000000 -0800
+++ Encode-2.55-uZLdOK/t/Encode.t 2013-12-21 17:52:34.000000000 -0800
@@ -25,7 +25,7 @@ my @character_set = ('0'..'9', 'A'..'Z',
 my @source = qw(ascii iso8859-1 cp1250);
 my @destiny = qw(cp1047 cp37 posix-bc);
 my @ebcdic_sets = qw(cp1047 cp37 posix-bc);
-plan test => 38+$n*@encodings + 2*@source*@destiny*@character_set + 2*@ebcdic_sets*256 + 6 + 4;
+plan test => 38+$n*@encodings + 2*@source*@destiny*@character_set + 2*@ebcdic_sets*256 + 6 + 5;
 my $str = join('',map(chr($_),0x20..0x7E));
 my $cpy = $str;
 ok(length($str),from_to($cpy,'iso8859-1','Unicode'),"Length Wrong");
@@ -150,6 +150,9 @@ package main;
 ok(decode(latin1 => Encode::Dummy->new("foobar")), "foobar");
 ok(encode(utf8 => Encode::Dummy->new("foobar")), "foobar");
 
+# decode_utf8 with non-string arguments
+ok(decode_utf8(*1), "*main::1");
+
 # hash keys
 my $key = (keys %{{ "whatever\x{100}" => '' }})[0];
 my $kopy = $key;
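The behaviour the new test expects can be checked with a short script. This is a sketch that assumes the installed Encode is 2.56 or later, so that it includes this patch. The choice of *STDOUT as the glob is illustrative:

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);

# Assumption: Encode >= 2.56, where the input is always stringified.
my $from_string = decode_utf8("*main::STDOUT");
my $from_glob   = decode_utf8(*STDOUT);

print "string arg: $from_string\n";
print "glob arg:   $from_glob\n";
print "match\n" if $from_glob eq $from_string;
```

With an unpatched Encode 2.55 the glob call is the one that printed nothing or gibberish in the report above.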

Sat Dec 21 23:11:01 2013 DANKOGAI [...] cpan.org - Correspondence added

Thanks, merged in 2.56

https://github.com/dankogai/p5-encode
https://travis-ci.org/dankogai/p5-encode

Dan the Maintainer Thereof

On Sat Dec 21 20:57:44 2013, SPROUT wrote:

> $ perl -lMblib -e ' use Encode; print decode_utf8("*main::foo")'
> *main::foo
> $ perl -lMblib -e ' use Encode; print decode_utf8(*foo)'
>
>
> The second one-liner prints nothing.
>
> Under a different perl version I get gibberish:
>
> $ ./perl -lIlib -e ' use Encode; print decode_utf8(*foo)'
> Wide character in print at -e line 1.
> *clone_encoding8�@X��
> FB_CONSTS@��X�(��X�8a�X��X�B��AX��c�X�p��X�B��AX��X�HY@X��X�HY@X�P��X�HY@X�|
>
> The input should be forcibly stringified before using SvCUR and SvEND.
> See the attached patch.
> (This also makes it handle copy-on-write correctly, which was how I
> encountered this. I am currently working on a debug mode for perl
> that changes COW buffer violations into crashes.)

Sat Dec 21 23:11:01 2013 The RT System itself - Status changed from 'new' to 'open'

Sat Dec 21 23:11:02 2013 DANKOGAI [...] cpan.org - Status changed from 'open' to 'resolved'
