This queue is for tickets about the Lingua-EN-Numbers CPAN distribution.

Report information
The Basics
Id: 108521
Status: resolved
Priority: 0/
Queue: Lingua-EN-Numbers

People
Owner: NEILB [...] cpan.org
Requestors: bkb [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: 2.02
Fixed in: 2.03



Subject: I suggest replacing all instances of \d with [0-9]
Here is a program which illustrates the issue:

    use warnings;
    use strict;
    use utf8;
    use Lingua::EN::Numbers 'num2en';
    my $wide = '１.２３';
    my $out = num2en ($wide);
    print $out;

The above program produces the following errors with the current github version of Lingua::EN::Numbers (commit id b7f9c31b1b8e599c7ef2a122c02410316a984341):

    Argument "\x{ff11}" isn't numeric in addition (+) at lib/Lingua/EN/Numbers.pm line 159.
    Use of uninitialized value in join or string at lib/Lingua/EN/Numbers.pm line 117.
    Use of uninitialized value in join or string at lib/Lingua/EN/Numbers.pm line 117.
    zero point

The module, in several places, uses \d to validate numbers and then tries to perform arithmetic on them. However, \d does not validate numbers for use in arithmetic. This is because \d matches a variety of Unicode characters which Perl cannot convert to numbers. For example the above "wide ascii" numbers match \d but they cannot be converted to numbers.

This may seem like an unlikely case to you, but for example for Japanese users it is extremely easy to accidentally type in "wide ascii" since the keyboard produces it by default when in "input Japanese" mode. On my Japanese-enabled keyboard I just press one button and it looks like this: Ｌｉｎｇｕａ：：ＥＮ：：Ｎｕｍｂｅｒｓ　２．０２.

The simplest solution to this is to remove all instances of \d and replace them with [0-9].

Alternatively you can scan the characters validated against \d and replace them with the numerical digits which Perl recognizes. I don't have a general solution but the method wide2ascii in Lingua::JA::Moji shows how to do this using tr:

https://metacpan.org/source/BKB/Lingua-JA-Moji-0.40/lib/Lingua/JA/Moji.pm#L905

Code is this:

    $input =~ tr/\x{3000}\x{FF01}-\x{FF5E}/ -~/;

This converts the wide space U+3000 and wide ascii U+ff01 to U+ff5e into normal ascii.

Thanks.
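To make the difference concrete, here is a minimal sketch of the two suggested approaches side by side. The helper names (looks_numeric_unicode, looks_numeric_ascii, looks_numeric_a_flag, fold_wide_to_ascii) are hypothetical and are not part of Lingua::EN::Numbers or the 2.03 fix. The patterns are simplified stand-ins for whatever the module actually uses. The /a variant is an additional option not mentioned in the report, and it assumes Perl 5.14 or later.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only; not the module's real code.

# Plain \d matches any Unicode decimal digit, including fullwidth forms.
sub looks_numeric_unicode {
    my ($s) = @_;
    return $s =~ /\A\d+(?:\.\d+)?\z/ ? 1 : 0;
}

# [0-9] matches only the ten ASCII digits that Perl can do arithmetic on.
sub looks_numeric_ascii {
    my ($s) = @_;
    return $s =~ /\A[0-9]+(?:\.[0-9]+)?\z/ ? 1 : 0;
}

# The /a modifier (Perl 5.14+) restricts \d to ASCII without rewriting it.
sub looks_numeric_a_flag {
    my ($s) = @_;
    return $s =~ /\A\d+(?:\.\d+)?\z/a ? 1 : 0;
}

# Fold U+3000 and U+FF01..U+FF5E onto ASCII space..tilde, using the same
# tr quoted in the report from Lingua::JA::Moji.
sub fold_wide_to_ascii {
    my ($input) = @_;
    $input =~ tr/\x{3000}\x{FF01}-\x{FF5E}/ -~/;
    return $input;
}

# Fullwidth 1, ASCII point, fullwidth 2 and 3 (assumed test input).
my $wide = "\x{FF11}.\x{FF12}\x{FF13}";

printf "plain \\d accepts it: %d\n", looks_numeric_unicode($wide);  # 1
printf "[0-9] accepts it:    %d\n", looks_numeric_ascii($wide);     # 0
printf "\\d with /a accepts:  %d\n", looks_numeric_a_flag($wide);    # 0

my $folded = fold_wide_to_ascii($wide);
printf "folded: %s, sum with 1: %s\n", $folded, $folded + 1;
```

Rejecting the input with [0-9] is the smaller change. Folding first is friendlier to users who typed wide digits by accident, but the tr shown only covers the fullwidth block, not the other scripts whose digits \d also matches.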
Thanks -- fixed (by you :-) in 2.03

Cheers,
Neil