This queue is for tickets about the Text-Unidecode CPAN distribution.

Report information
The Basics
Id: 30501
Status: rejected
Priority: 0/
Queue: Text-Unidecode

People
Owner: Nobody in particular
Requestors: labassistant [...] nese.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Incorrect transliteration of \x{8e}
Date: Mon, 5 Nov 2007 14:47:02 -0500
To: bug-Text-Unidecode [...] rt.cpan.org
From: "Gavin Bisesi" <labassistant [...] nese.com>
Download perlv
application/octet-stream 2.7k


In Text::Unidecode v0.04, \x{8e} (é) is transliterated as an empty string rather than as "e". Output of "perl -V" is in the attachment.
From: SBURKE [...] cpan.org
\x{8e} is correctly transliterated as empty-string, because \x{8e} is not "é" in Unicode; it is nothing, thence, nothing. Text::Unidecode requires that the input be in Unicode. You're apparently forgetting to apply the Encoding filter that would translate your non-Unicode encoding, into Unicode. (My spidey sense is tingling and telling me your encoding is the old-timey encoding MacAscii, but that's just a guess.)
Subject: Re: [rt.cpan.org #30501] Incorrect transliteration of \x{8e}
Date: Wed, 7 Nov 2007 12:37:07 -0500
To: bug-Text-Unidecode [...] rt.cpan.org
From: labassistant <labassistant [...] nese.com>
Thanks very much, and you're right, I am on a mac (macperl 5.8.6; OS X 10.4, primarily).

How do I change that string to unicode? Would I use Encode::encode_utf8()?

Thanks for the timely reply.

On Nov 6, 2007, at 7:06 PM, via RT wrote:
> <URL: http://rt.cpan.org/Ticket/Display.html?id=30501 >
>
> \x{8e} is correctly transliterated as empty-string, because \x{8e} is
> not "é" in Unicode; it is nothing, thence, nothing.
> Text::Unidecode requires that the input be in Unicode.
>
> You're apparently forgetting to apply the Encoding filter that would
> translate your non-Unicode encoding, into Unicode.
>
> (My spidey sense is tingling and telling me your encoding is the
> old-timey encoding MacAscii, but that's just a guess.)
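
For reference, here is a minimal sketch of the decoding step discussed in this thread. It assumes the input bytes are in the encoding that the core Encode module calls "MacRoman", in which byte 0x8E is "é"; that encoding name is an assumption and does not come from the ticket. The variable names are illustrative only. Note that Encode::encode_utf8() goes the other way, from Perl's Unicode characters to UTF-8 bytes, so decode() is the call that applies here.

```perl
use strict;
use warnings;
use Encode qw(decode);
use Text::Unidecode;    # exports unidecode() by default

# A single raw byte, 0x8E, as it might be read from a legacy Mac file.
# ASSUMPTION: the source encoding is MacRoman.
my $bytes = "\x8e";

# decode() converts encoded bytes into Perl's internal Unicode characters.
my $text = decode('MacRoman', $bytes);

# After decoding, the character is U+00E9 (é), not U+008E.
printf "U+%04X\n", ord $text;    # prints "U+00E9"

# Text::Unidecode now receives real Unicode and can transliterate it.
print unidecode($text), "\n";    # prints "e"
```

If the data actually uses a different legacy encoding, only the first argument to decode() would need to change.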