CC: | perl5-porters [...] perl.org, bug-Encode [...] rt.cpan.org |
Subject: | [perl #107326] perl's unicode conversion fails when iconv succeeds |
Date: | Fri, 30 Dec 2011 11:00:23 -0800 |
To: | "OtherRecipients of perl Ticket #107326":; |
From: | "Father Chrysostomos via RT" <perlbug-followup [...] perl.org> |
On Fri Dec 30 10:41:46 2011, LAWalsh wrote:
>
> This is a bug report for perl from perl-diddler@tlinx.org,
> generated with the help of perlbug 1.39 running under perl 5.12.3.
>
>
> -----------------------------------------------------------------
> [Please describe your issue here]
>
> Was looking at ways to do upper/lower case compare, and bumped into
> piconv as being a 'drop in replacement for "iconv"'. So I decided to try
> it thinking it would be a 'hoot' if it was faster.
>
> Rather than faster, it choked at the beginning of my 98M test file
> (i.e. I truncated the file to the first few lines, 672 bytes), which
> reproduces the problem just fine .. Très sad...
>
You’re right:
$ piconv5.15.6 -f utf16 -t utf-8 /Users/sprout/Downloads/test.in
UTF-16:Unrecognised BOM d at
/usr/local/lib/perl5/5.15.6/darwin-thread-multi-2level/Encode.pm line
196, <$ifh> line 2.
The file begins with <FF><FE>.
If I use utf-16le explicitly, it does the first line correctly, but
quickly switches to Chinese, which means it’s off by one byte. If I use
utf-16be explicitly, the first line is in Chinese.
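One possible explanation for the off-by-one symptom, offered as an assumption and not checked against the piconv source, is that the input is split on the single newline byte before each piece is decoded. In UTF-16LE a newline is the byte pair 0A 00, so such a split would leave the 00 at the front of the next piece and shift every later code unit by one byte. The Python sketch below uses a made-up two-line sample (not the reporter's file) to show that effect:

```python
import codecs

# Hypothetical sample standing in for the reporter's file, which is not
# reproduced here: two ASCII lines, encoded as UTF-16LE with a BOM.
text = "first line\nsecond line\n"
data = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# Like the file in the report, the buffer begins with FF FE.
bom = data[:2]

# Assumed failure mode: split on the single byte 0x0A. This cuts the
# 16-bit newline (0A 00) in half, so the 00 leads the next chunk.
chunks = data.split(b"\n")
second = chunks[1]

# Decode the second chunk as UTF-16LE, dropping the odd trailing byte.
# Every code unit is now shifted by one byte, so ASCII letters come out
# as code points in the CJK range.
shifted = second[: len(second) // 2 * 2].decode("utf-16-le")

# Decoding the whole buffer in one go keeps the code units aligned;
# the "utf-16" codec consumes the BOM and picks little-endian.
whole = data.decode("utf-16")

print(bom.hex(), repr(shifted[:3]), repr(whole))
```

Under this assumption only the first piece carries the BOM, which would also fit the plain utf16 run complaining about an unrecognised BOM on line 2.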
This is part of the Encode distribution, for which CPAN is upstream, so
I’m forwarding this to the CPAN ticket.
--
Father Chrysostomos
---
via perlbug: queue: perl5 status: new
https://rt.perl.org:443/rt3/Ticket/Display.html?id=107326