
This queue is for tickets about the Unicode-Collate CPAN distribution.

Report information
The Basics
Id: 61550
Status: resolved
Priority: 0/
Queue: Unicode-Collate

People
Owner: Nobody in particular
Requestors: PHILKIME [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 0.59-withoutworldwriteables
Fixed in: (no value)



Subject: Can't select German variants for sorting?
I hope that this is not a stupid question, but is there a way to select the different German variant locales to sort by? There is, for example, a "DIN 5007" variant of German which sorts differently from standard German. If it helps, I have a module which fully implements locale string parsing as per UTS #35.
Subject: Re: [rt.cpan.org #61550] Can't select German variants for sorting?
Date: Wed, 22 Sep 2010 23:33:29 +0900
To: bug-Unicode-Collate [...] rt.cpan.org
From: SADAHIRO Tomoyuki <bqw10602 [...] nifty.com>
On Wed, 22 Sep 2010 02:09:00 -0400 "Philip Kime via RT" <bug-Unicode-Collate@rt.cpan.org> wrote:
> I hope that this is not a stupid question but is there a way to select
> the different German variant locales to sort by? There is, for example,a
> "DIN 5007" variant of German which sorts differently to standard German.
>
> If it helps, I have a module which fully implements locale string
> parsing as per UTS #35.
I found http://de.wikipedia.org/wiki/Alphabetische_Sortierung which states (translated from German):

! DIN 5007-1, titled "Ordnen von Schriftzeichenfolgen (ABC-Regeln)"
! (ordering of character strings, ABC rules), describes sorting.
! DIN 5007 variant 1 (used for words, e.g. in encyclopedias;
! section 6.1.1.4.1)
!   ä and a are equal
!   ö and o are equal
!   ü and u are equal
!   ß and ss are equal
! DIN 5007 variant 2 (special sorting for lists of names,
! e.g. in telephone books; section 6.1.1.4.2)
!   ä and ae are equal
!   ö and oe are equal
!   ü and ue are equal
!   ß and ss are equal

It seems variant 1 is the default and variant 2 is like CLDR's German phonebook locale, but the latter is currently broken (see http://unicode.org/cldr/trac/ticket/2833 German collation).

So I'm waiting for the release of CLDR 1.9.
(note: the Public Review issue on CLDR 1.9 is open http://www.unicode.org/review/pr-175.html )

SADAHIRO Tomoyuki
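The two DIN 5007 variants quoted above differ only in how the umlauts are folded before comparison. A minimal Python sketch of that difference follows. It is a simplified illustration and not how Unicode::Collate works internally: the helper names `din5007_1_key` and `din5007_2_key` are hypothetical, case is simply ignored, and only the four letters named in the quote are handled.

```python
import unicodedata

# Hypothetical folding tables taken from the rules quoted above.
# Variant 1: umlauts sort like their base letters.
_VARIANT1 = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})
# Variant 2 (phonebook): umlauts sort like base letter + "e".
_VARIANT2 = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def _prepare(word):
    # Compose characters first so "a" + combining diaeresis becomes "ä",
    # then lowercase so the comparison ignores case (a simplification).
    return unicodedata.normalize("NFC", word).lower()


def din5007_1_key(word):
    """Sort key approximating DIN 5007 variant 1 (dictionary order)."""
    return _prepare(word).translate(_VARIANT1)


def din5007_2_key(word):
    """Sort key approximating DIN 5007 variant 2 (phonebook order)."""
    return _prepare(word).translate(_VARIANT2)


names = ["Müller", "Mueller", "Muller", "Mufti"]
by_variant1 = sorted(names, key=din5007_1_key)
by_variant2 = sorted(names, key=din5007_2_key)
print(by_variant1)
print(by_variant2)
```

Under variant 1, "Müller" collates with "Muller" and lands after "Mufti"; under variant 2 it collates with "Mueller" and lands before "Mufti". Words with equal keys keep their input order because Python's sort is stable.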
It looks like you implemented Variant 2 in version 0.60 - thank you. Any chance of implementing Variant 1 too? It looks like a very simple variant ...

Maybe it's possible also to define aliases for the locales like
de__DIN5007-1
de__DIN5007-2

? Or perhaps use the official CLDR format for locales with variants from

http://www.unicode.org/reports/tr35/#Unicode_Language_and_Locale_Identifiers

PK

On Wed Sep 22 10:33:54 2010, bqw10602@nifty.com wrote:
> Then I'm waiting for the release of CDLR 1.9.
> (note: the Public Review issue on CDLR 1.9 is open
> http://www.unicode.org/review/pr-175.html )
Subject: Re: [rt.cpan.org #61550] Can't select German variants for sorting?
Date: Tue, 28 Sep 2010 23:37:00 +0900
To: bug-Unicode-Collate [...] rt.cpan.org
From: SADAHIRO Tomoyuki <bqw10602 [...] nifty.com>
As shown in the test file below,

http://cpansearch.perl.org/src/SADAHIRO/Unicode-Collate-0.60/t/loc_de.t

Unicode::Collate::Locale->new(locale => 'DE') should do it, as an alias for the default (no tailoring, only according to DUCET).

Apparently there is no visible difference between a language that needs no tailoring and a language whose tailoring is not supported.

SADAHIRO
> Queue: Unicode-Collate
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=61550 >
>
> It looks like you implemented Variant 2 in version 0.60 - thank you. Any
> chance of implementing Variant 1 too? It looks like a very simple
> variant ...
>
> Maybe it's possible also to define aliases for the locales like
> de__DIN5007-1
> de__DIN5007-2
>
> ? Or perhaps use the official CLDR format for locales with variants from
>
> http://www.unicode.org/reports/tr35/#Unicode_Language_and_Locale_Identifiers
>
> PK
>
> On Wed Sep 22 10:33:54 2010, bqw10602@nifty.com wrote:
>
> > Then I'm waiting for the release of CDLR 1.9.
> > (note: the Public Review issue on CDLR 1.9 is open
> > http://www.unicode.org/review/pr-175.html )
I see, "level => 1" should do this because DIN5007-1 is basically "no diacritics". However, I think that without a special tailoring there is a problem: it is then impossible to do case-sensitive sorting with DIN5007-1, as this would need "level => 3", and at that level the diacritics (a level 2 difference) would count again?
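The concern can be made concrete with a small Python sketch. It is an illustration only and does not use Unicode::Collate: the function names are hypothetical, the "primary" key is just the variant 1 folding, and case is modelled as a per-character tie-breaker with lowercase first (assumed here to mirror the usual default ordering). The point is that a case-sensitive DIN 5007-1 order needs a key that keeps the case difference while skipping the accent difference.

```python
import unicodedata

# Hypothetical variant 1 folding: umlauts equal to their base letters.
_FOLD = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})


def primary(word):
    # Base letters only: case and diacritics are both ignored.
    return unicodedata.normalize("NFC", word).lower().translate(_FOLD)


def case_weights(word):
    # Crude stand-in for a case level: 0 for lowercase, 1 for uppercase,
    # so lowercase sorts first when the base letters are equal.
    composed = unicodedata.normalize("NFC", word)
    return tuple(1 if ch.isupper() else 0 for ch in composed)


def level1_key(word):
    # Comparable to comparing base letters only: "Bar", "bar", "bär" all tie.
    return primary(word)


def case_sensitive_din1_key(word):
    # Base letters first, then case; the accent difference is never consulted,
    # so "ä" and "a" stay equal while "B" and "b" are distinguished.
    return (primary(word), case_weights(word))


words = ["Bar", "bär", "bar", "Bär"]
level1_sorted = sorted(words, key=level1_key)
case_sorted = sorted(words, key=case_sensitive_din1_key)
print(level1_sorted)
print(case_sorted)
```

With the base-letter key alone, all four words tie and stay in input order. With the combined key, the lowercase forms come first and "bär"/"bar" still tie with each other, which is the behaviour a plain three-level comparison would not give because it would separate them by accent before looking at case.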