This queue is for tickets about the Search-Tools CPAN distribution.

Report information
The Basics
Id: 23880
Status: resolved
Priority: 0/
Queue: Search-Tools

People
Owner: Nobody in particular
Requestors: kris [...] koehntopp.de
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: (no value)
Fixed in: (no value)



Subject: Transliteration for German Umlauts
Date: Mon, 11 Dec 2006 11:19:50 +0100
To: bug-Search-Tools [...] rt.cpan.org
From: Kristian Koehntopp <kris [...] koehntopp.de>
I wanted to use Search::Tools::Transliterate to convert the German band name

    Die Ärzte ("Die UPPER CASE A-Umlaut rzte")

into one of its valid transliterations

    Die Aerzte
    Die Arzte

with the first variant preferred. Search::Tools::Transliterate has all the necessary information, but the initialization gets in the way.

This does what I need, but it is not configurable.

white:/usr/lib/perl5/site_perl/5.8.8/Search/Tools # diff -u Transliterate.pm~ Transliterate.pm
--- Transliterate.pm~ 2006-12-11 11:04:01.000000000 +0100
+++ Transliterate.pm 2006-12-11 11:05:41.000000000 +0100
@@ -127,14 +127,15 @@
 {
     chomp;
     my ($from, $to) = (m/^(<U.+?>)\ (.+)$/);
+    ($to, undef ) = split(";", $to) if ($to =~ m/;/);
     $Map{_Utag_to_chr($from)} = _Utag_to_chr($to);
 }

 # add/override latin1
-for (128 .. 255)
-{
-    $Map{chr($_)} = chr($_);
-}
+#for (128 .. 255)
+#{
+#    $Map{chr($_)} = chr($_);
+#}

I suggest that you supply some code that does the above, plus

- it can be selected whether one wants the override or not
- it can be selected whether one wants the first or the second variant in ";"-separated $to variants.

Kris
--
Kristian Köhntopp <kris@xn--khntopp-90a.de>
From: KARMAN [...] cpan.org
Version 0.08 has been uploaded to CPAN. It addresses your issues in two ways:

1. Only the first value in the transliteration map is used by default.
2. You can set the 'ebit' param to '0' in new() to turn off the 8-bit override in the map.

In addition, the map() method now allows you to override character mappings per S::T::T instance.

thanks.
pek
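
The behaviour discussed in this ticket (first variant by default, an optional 8-bit override, a selectable variant) can be illustrated with a small standalone sketch. It does not use Search::Tools::Transliterate itself: the %RAW table, build_map() and transliterate() are hypothetical names invented for this example, and only the 'ebit' option name is borrowed from the reply above. The 'variant' option and the default of ebit => 1 are assumptions, not the module's documented interface.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the module's map data: each Latin-1 character
# maps to one or more ";"-separated transliterations.
my %RAW = (
    "\x{C4}" => "Ae;A",    # A-umlaut
    "\x{E4}" => "ae;a",    # a-umlaut
    "\x{D6}" => "Oe;O",    # O-umlaut
    "\x{F6}" => "oe;o",    # o-umlaut
    "\x{DC}" => "Ue;U",    # U-umlaut
    "\x{FC}" => "ue;u",    # u-umlaut
    "\x{DF}" => "ss",      # sharp s (single variant only)
);

# Build a character map (assumed options, for illustration only):
#   ebit    => 1 maps characters 128..255 to themselves (the override),
#              0 leaves the transliterations in place
#   variant => zero-based index into the ";"-separated alternatives
sub build_map {
    my %opt = (ebit => 1, variant => 0, @_);
    my %map;
    for my $from (keys %RAW) {
        my @alts = split /;/, $RAW{$from};
        # Fall back to the first alternative if the requested one is missing.
        $map{$from} = defined $alts[ $opt{variant} ]
            ? $alts[ $opt{variant} ]
            : $alts[0];
    }
    if ($opt{ebit}) {
        $map{ chr($_) } = chr($_) for 128 .. 255;
    }
    return \%map;
}

# Replace every non-ASCII character that has a mapping; leave others alone.
sub transliterate {
    my ($map, $text) = @_;
    $text =~ s/([^\x00-\x7F])/exists $map->{$1} ? $map->{$1} : $1/ge;
    return $text;
}

my $name = "Die \x{C4}rzte";
print transliterate(build_map(ebit => 0), $name), "\n";                  # Die Aerzte
print transliterate(build_map(ebit => 0, variant => 1), $name), "\n";    # Die Arzte
```

With ebit left at 1 in this sketch, the identity mapping for 128..255 overwrites the umlaut entries, so the name comes back unchanged, which is the situation the original report describes.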