Bug #108521 for Lingua-EN-Numbers: I suggest replacing all instances of \d with [0-9]

Fri Nov 06 19:11:33 2015 bkb [...] cpan.org - Ticket created

Subject: I suggest replacing all instances of \d with [0-9]

Here is a program which illustrates the issue:

    use warnings;
    use strict;
    use utf8;
    use Lingua::EN::Numbers 'num2en';
    my $wide = '１.２３';
    my $out = num2en ($wide);
    print $out;

The above program produces the following errors with the current github version of Lingua::EN::Numbers (commit id b7f9c31b1b8e599c7ef2a122c02410316a984341):

    Argument "\x{ff11}" isn't numeric in addition (+) at lib/Lingua/EN/Numbers.pm line 159.
    Use of uninitialized value in join or string at lib/Lingua/EN/Numbers.pm line 117.
    Use of uninitialized value in join or string at lib/Lingua/EN/Numbers.pm line 117.
    zero point

The module, in several places, uses \d to validate numbers and then tries to perform arithmetic on them. However, \d does not validate numbers for use in arithmetic. This is because \d matches a variety of Unicode characters which Perl cannot convert to numbers. For example the above "wide ascii" numbers match \d but they cannot be converted to numbers.

This may seem like an unlikely case to you, but for example for Japanese users it is extremely easy to accidentally type in "wide ascii" since the keyboard produces it by default when in "input Japanese" mode. On my Japanese-enabled keyboard I just press one button and it looks like this: Ｌｉｎｇｕａ：：ＥＮ：：Ｎｕｍｂｅｒｓ　２．０２.

The simplest solution to this is to remove all instances of \d and replace them with [0-9].

Alternatively you can scan the characters validated against \d and replace them with the numerical digits which Perl recognizes. I don't have a general solution but the method wide2ascii in Lingua::JA::Moji shows how to do this using tr:

https://metacpan.org/source/BKB/Lingua-JA-Moji-0.40/lib/Lingua/JA/Moji.pm#L905

Code is this:

    $input =~ tr/\x{3000}\x{FF01}-\x{FF5E}/ -~/;

This converts the wide space U+3000 and wide ascii U+ff01 to U+ff5e into normal ascii.

Thanks.
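The two suggestions above can be compared in a small standalone sketch. The helper names (is_unicode_number, is_ascii_number, is_number_with_a_flag, fold_wide_ascii) are hypothetical, made up for illustration, and are not part of Lingua::EN::Numbers or Lingua::JA::Moji. The /a regex modifier, which restricts \d to [0-9], is not mentioned in the ticket; it is shown only as a possible third option and assumes Perl 5.14 or later.

```perl
use 5.014;      # the /a regex modifier needs Perl 5.14 or later
use strict;
use warnings;

# Validation written with \d: also accepts non-ASCII Unicode digits.
sub is_unicode_number {
    my ($s) = @_;
    return $s =~ /\A\d+(?:\.\d+)?\z/ ? 1 : 0;
}

# The same validation written with [0-9]: ASCII digits only.
sub is_ascii_number {
    my ($s) = @_;
    return $s =~ /\A[0-9]+(?:\.[0-9]+)?\z/ ? 1 : 0;
}

# The \d pattern again, but with /a so that \d means [0-9].
sub is_number_with_a_flag {
    my ($s) = @_;
    return $s =~ /\A\d+(?:\.\d+)?\z/a ? 1 : 0;
}

# Map the wide space and the wide ASCII range onto plain ASCII,
# using the same tr as quoted in the ticket.
sub fold_wide_ascii {
    my ($s) = @_;
    $s =~ tr/\x{3000}\x{FF01}-\x{FF5E}/ -~/;
    return $s;
}

# Fullwidth 1, an ASCII dot, then fullwidth 2 and 3.
my $wide = "\x{FF11}.\x{FF12}\x{FF13}";

printf "\\d: %d, [0-9]: %d, \\d with /a: %d\n",
    is_unicode_number($wide),
    is_ascii_number($wide),
    is_number_with_a_flag($wide);

my $folded = fold_wide_ascii($wide);
printf "after folding, [0-9] check: %d\n", is_ascii_number($folded);
```

Rejecting the input with [0-9] (or /a) is the smaller change; folding first lets wide digits typed by accident still be accepted, at the cost of handling only the one Unicode block that the tr covers.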

Sat Nov 07 15:27:43 2015 NEILB [...] cpan.org - Correspondence added

Thanks -- fixed (by you :-) in 2.03

Cheers,
Neil

Sat Nov 07 15:27:43 2015 The RT System itself - Status changed from 'new' to 'open'

Sat Nov 07 15:27:51 2015 NEILB [...] cpan.org - Status changed from 'open' to 'resolved'

Sat Nov 07 15:27:57 2015 NEILB [...] cpan.org - Taken

Sat Nov 07 15:27:57 2015 NEILB [...] cpan.org - Fixed in 2.03 added

Sat Nov 07 15:27:57 2015 NEILB [...] cpan.org - Fixed in 2.02 deleted