Bug #38912 for MARC-Charset: double diacritics

Tue Sep 02 09:00:45 2008 tventimi [...] princeton.edu - Ticket created

Subject:	double diacritics
Date:	Tue, 2 Sep 2008 09:00:16 -0400
To:	bug-MARC-Charset [...] rt.cpan.org
From:	"Thomas P. Ventimiglia" <tventimi [...] princeton.edu>

Greetings:

MARC-Charset is a great package, but I recently noticed a small problem regarding the conversion of diacritics that span two characters. In MARC8, there are two of these, the ligature and the double tilde. Each is implemented as a pair of combining diacritics, a "left half" and a "right half" (0xEB and 0xEC for the ligature, 0xFA and 0xFB for the tilde).

There are two different ways of converting these to Unicode. They may be converted directly to the combining half marks 0xFE20...0xFE23, or the two half marks may be replaced with one of the "double" diacritics, which is placed between the two characters it spans (0x0361 for ligature, 0x0360 for tilde). However, MARC-Charset does not do either of these. Instead, it replaces the left half with the double diacritic mark, and the right half with the Unicode right half mark.

I am using version 1.0 of the module with Perl 5.8.8 on Red Hat Enterprise Linux 2.6.18-92.1.10.el5.

Thank you for your help.

---
Thomas Ventimiglia
Computer Systems Specialist
Princeton University East Asian Library
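The two acceptable conversions described in this report can be sketched in a few lines. This is only an illustration in Python, not MARC-Charset's code: the tables and the `convert` function are hypothetical names, only ASCII base characters are handled, and it assumes the MARC8 convention that a combining mark precedes the character it modifies.

```python
# Hypothetical lookup tables, built from the code points named in the report.
# MARC8 left half -> (Unicode left half mark, Unicode double diacritic)
LEFT_HALVES = {
    0xEB: ("\uFE20", "\u0361"),  # ligature
    0xFA: ("\uFE22", "\u0360"),  # double tilde
}
# MARC8 right half -> Unicode right half mark
RIGHT_HALVES = {
    0xEC: "\uFE21",  # ligature
    0xFB: "\uFE23",  # double tilde
}


def convert(marc8: bytes, use_double: bool) -> str:
    """Convert MARC8 bytes to a Unicode string.

    use_double=False maps each half to its Unicode combining half mark.
    use_double=True emits one double diacritic after the first base
    character and drops the right half.
    """
    out = []
    pending = []  # combining marks waiting for their base character
    for byte in marc8:
        if byte in LEFT_HALVES:
            half, double = LEFT_HALVES[byte]
            pending.append(double if use_double else half)
        elif byte in RIGHT_HALVES:
            if not use_double:
                pending.append(RIGHT_HALVES[byte])
            # With use_double, the double mark already spans both characters.
        elif byte < 0x80:
            # Unicode puts combining marks after the base character.
            out.append(chr(byte))
            out.extend(pending)
            pending.clear()
        else:
            raise ValueError(f"byte 0x{byte:02X} is outside this sketch")
    if pending:
        raise ValueError("combining mark without a base character")
    return "".join(out)


sample = b"\xebt\xecs"  # ligature spanning "ts"
half_marks = convert(sample, use_double=False)   # t U+FE20 s U+FE21
double_mark = convert(sample, use_double=True)   # t U+0361 s
```

The behaviour reported above corresponds to neither result: it would be t U+0361 s U+FE21, mixing the double diacritic with a leftover right half mark.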

Tue Sep 02 09:06:41 2008 tventimi [...] princeton.edu - Correspondence added

Subject:	[rt.cpan.org #38912] double diacritics
Date:	Tue, 2 Sep 2008 09:06:30 -0400
To:	bug-MARC-Charset [...] rt.cpan.org
From:	"Thomas P. Ventimiglia" <tventimi [...] princeton.edu>

Please see the attached files. doublediacritics.txt is a MARC8-encoded file containing the two diacritics in question. doubleresult-marccharset.txt is the UTF8-conversion produced by MARC-Charset, and doubleresult-correct.txt is the correct UTF8-conversion.

Tom

Message body is not shown because sender requested not to inline it.

Sat Aug 06 16:19:13 2011 GMCHARLT [...] cpan.org - Taken

Sat Aug 06 16:19:45 2011 GMCHARLT [...] cpan.org - Status changed from 'new' to 'patched'

Sat Aug 06 16:19:45 2011 GMCHARLT [...] cpan.org - Severity Normal added

Sat Aug 06 16:19:45 2011 GMCHARLT [...] cpan.org - Broken in 1.33 added

Sat Aug 06 16:22:04 2011 GMCHARLT [...] cpan.org - Correspondence added

Thank you for the bug report. This is fixed in the rt38912 branch of the MARC/Perl Git repository (clone git://marcpm.git.sourceforge.net/gitroot/marcpm/marcpm, gitweb http://marcpm.git.sourceforge.net/git/gitweb.cgi?p=marcpm/marcpm;a=shortlog;h=refs/heads/rt38912) if you care to test. I will be making a new MARC::Charset release in the next day or two.

Sat Aug 06 16:22:04 2011 The RT System itself - Status changed from 'patched' to 'open'

Sat Aug 06 16:24:03 2011 GMCHARLT [...] cpan.org - Status changed from 'open' to 'patched'

Mon Aug 08 08:32:48 2011 tventimi [...] princeton.edu - Correspondence added

Subject:	Re: [rt.cpan.org #38912] double diacritics
Date:	Mon, 8 Aug 2011 08:32:19 -0400
To:	bug-MARC-Charset [...] rt.cpan.org
From:	"Thomas P. Ventimiglia" <tventimi [...] princeton.edu>

Thank you.

Tom

Mon Aug 08 08:32:50 2011 The RT System itself - Status changed from 'patched' to 'open'

Tue Aug 13 23:05:11 2013 GMCHARLT [...] cpan.org - Correspondence added

The fix was released in version 1.34.

Tue Aug 13 23:05:12 2013 GMCHARLT [...] cpan.org - Status changed from 'open' to 'resolved'

Tue Aug 13 23:05:12 2013 GMCHARLT [...] cpan.org - Fixed in 1.34 added