Subject: Numbers prefixed with s, es, th are matching when they shouldn't
Date: Thu, 15 Feb 2018 11:44:26 -0500
To: bug-Lingua-EN-Words2Nums [...] rt.cpan.org
From: Jeff Hooper <jdhooper [...] vianet.ca>

I've been using Lingua::EN::Words2Nums to convert street names like
"First", "Second", "Third" to their numerical values and it has worked
very well for this purpose. However, I noticed that it was also
converting a street named "Esten" to the number 10. It turns out the
regular expression constructed from the %nametosub hash currently
matches 'es', 's' and 'th' at the beginning of the string as well as
the end: once "ten" has been stripped from the end of "Esten", the
leftover "es" is consumed as if it were a plural suffix. As the
comments suggest, these patterns were intended to match only at the end
of the string, to remove pluralization. The patch below should fix this
issue.
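
For anyone who wants to see the mechanism in isolation, here is a
reduced sketch. It is not code from the module: the %value table, the
convert() helper and its $guard flag are made up for illustration, and
only the shape of the loop follows the patch below.

```perl
use strict;
use warnings;

# Hypothetical, much-reduced stand-in for the module's %nametosub table.
my %value  = (ten => 10, first => 1, second => 2, third => 3);
my @plural = qw(es s th);    # suffixes meant only to strip pluralization

# Longest alternatives first, so "es" is tried before "s".
my $alt = join '|', sort { length $b <=> length $a } keys(%value), @plural;
my $numregexp = qr/($alt)/;

# Returns the number found, or undef if the word is not a number name.
# With $guard true, a remainder that is exactly "s", "es" or "th" is
# left alone instead of being stripped as a suffix.
sub convert {
    my ($word, $guard) = @_;
    local $_ = lc $word;
    my $total;
    # Work backwards up the string, one token at a time.
    while ((!$guard || !/^(?:s|es|th)$/) && s/$numregexp$//) {
        $total += $value{$1} if exists $value{$1};
    }
    # Anything left over means the word was not fully understood.
    return length($_) ? undef : $total;
}

printf "%-18s %s\n", 'Esten, no guard:', convert('Esten', 0) // 'undef';
printf "%-18s %s\n", 'Esten, guarded:',  convert('Esten', 1) // 'undef';
printf "%-18s %s\n", 'Tens, guarded:',   convert('Tens',  1) // 'undef';
```

Without the guard, "Esten" comes back as 10; with it, the leftover "es"
is not consumed and the word is rejected, while a genuine plural such
as "Tens" still converts to 10.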
@@ -251,9 +251,9 @@
$total=$oldpre=$suffix=$newmult=0;
$mult=1;
- # Work backwards up the string.
+ # Work backwards up the string, but make sure that s, es, th do not match at the beginning of the word
while (length $_) {
- $nametosub{$1}[0]->($nametosub{$1}[1]) while s/$numregexp$//;
+ $nametosub{$1}[0]->($nametosub{$1}[1]) while ! /^((s)|(es)|(th))$/ && s/$numregexp$// ;
if (length $_) {
if (s/(\d+)(?:st|nd|rd|th)?$//) {
num($1);