
This queue is for tickets about the MARC-Charset CPAN distribution.

Report information
The Basics
Id: 63271
Status: resolved
Priority: 0/
Queue: MARC-Charset

People
Owner: Nobody in particular
Requestors: asko.ohmann [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: 1.35



Subject: MARC::Charset bug with Extended Cyrillic charset
Date: Tue, 23 Nov 2010 15:58:46 +0200
To: bug-MARC-Charset [...] rt.cpan.org
From: Asko Ohmann <asko.ohmann [...] gmail.com>
Hello,

I've been using MARC::Charset module to do some character conversion from marc8 to utf8. I found that the Extended Cyrillic characters gave me an error like:

    no mapping found for [0x44] at position 12 in sEMDNOWA
    g0=EXTENDED_CYRILLIC g1=EXTENDED_LATIN
    at /usr/share/perl5/MARC/Charset.pm line 210.

I got around this by adding 128 to the character value. As I understand that should be the g1 value however as stated in the error message Extended Cyrillic is used as g0.

Here is an example of code to reproduce the error:

    #!/usr/bin/perl -w
    use strict;
    use MARC::Charset 'marc8_to_utf8';
    my $str = chr(0x1B).'(NsEM'.chr(0x1B).'(B'.chr(0x1B).'(QD'.chr(0x1B).'(B'.chr(0x1B).'(NNOWA'.chr(0x1B).'(B';
    $str = marc8_to_utf8($str);

The string after conversion should read: Семёнова

If it should prove relevant I was running the program on Ubuntu Linux 2.6.35-22-generic #35-Ubuntu and the Perl version was v5.10.1

--
Asko Ohmann
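The "adding 128" workaround mentioned in the report could be sketched roughly as follows. This is only an illustration of shifting G0 bytes (0x21-0x7E) into the G1 range (0xA1-0xFE); the helper name shift_extended_cyrillic is hypothetical and not part of MARC::Charset, and the sketch assumes only the ESC ( Q escape form used in the report needs handling.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of MARC::Charset.
# Assumption: Extended Cyrillic is designated as G0 with ESC ( Q, and the
# run it governs ends at the next ESC byte.
sub shift_extended_cyrillic {
    my ($marc8) = @_;
    $marc8 =~ s{(\x1B\(Q)([^\x1B]*)}{
        my ($esc, $run) = ($1, $2);
        # Move graphic bytes from the G0 range to the G1 range (+0x80).
        $run =~ tr/\x21-\x7E/\xA1-\xFE/;
        $esc . $run;
    }ge;
    return $marc8;
}

# The byte string from the report above.
my $str = "\x1B(NsEM\x1B(B\x1B(QD\x1B(B\x1B(NNOWA\x1B(B";
my $shifted = shift_extended_cyrillic($str);

# Show the resulting bytes in hex for inspection.
print join(' ', map { sprintf '%02X', ord } split //, $shifted), "\n";
```

Going by the report, the shifted string would then be passed to marc8_to_utf8 on affected versions. With the fix in 1.35 this step should not be needed.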
On Tue Nov 23 08:59:07 2010, asko.ohmann@gmail.com wrote:
> I've been using MARC::Charset module to do some character conversion
> from marc8 to utf8. I found that the Extended Cyrillic characters gave
> me an error like:
>
> no mapping found for [0x44] at position 12 in sEMDNOWA
> g0=EXTENDED_CYRILLIC g1=EXTENDED_LATIN
> at /usr/share/perl5/MARC/Charset.pm line 210.
Thanks for the report. I have fixed this in version 1.35, which I have just uploaded to CPAN.