Subject: Tests fail in t/locale.t: 54: Russian unformat_number handling is not robust
As reported in RT 46660, I'm seeing the following failures with 1.73:
make test
PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e"
"test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/format_bytes.......ok
t/format_negative....ok
t/format_number......ok
t/format_picture.....ok
t/format_price.......ok
t/locale.............NOK 6# Failed test (t/locale.t at line 54)
# got: '12345679'
# expected: '123456.79'
t/locale.............ok 11/0# Looks like you failed 1 tests of 11.
t/locale.............dubious
Test returned status 1 (wstat 256, 0x100)
DIED. FAILED test 6
Failed 1/11 tests, 90.91% okay
t/object.............ok
t/round..............ok
t/unformat_number....ok
Failed Test Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/locale.t     1   256    11    1   9.09%  6
Failed 1/9 test scripts, 88.89% okay. 1/153 subtests failed, 99.35% okay.
After some investigation, I determined that the problem occurs only when
the thousands separator and the monetary decimal point are the same
character (and that character appears in the number). For example, if
both are '.', the string "123.456.79" is not parsed correctly: the
failing test above got 12345679 where 123456.79 was expected. The
unformat_number function fails because there is no else clause in the
following conditional:
    # ru_RU locale has comma for decimal_point, but period for
    # mon_decimal_point! But as long as thousands_sep is different
    # from either, we can allow either decimal point.
    if ($self->{mon_decimal_point} &&
        $self->{decimal_point} ne $self->{mon_decimal_point} &&
        $self->{decimal_point} ne $self->{mon_thousands_sep} &&
        $self->{mon_decimal_point} ne $self->{thousands_sep})
    {
        $pt = qr/(?:\Q$self->{decimal_point}\E|
                    \Q$self->{mon_decimal_point}\E)/x;
    }
That is, when these conditions are not all true, the block is skipped
and the number is not parsed correctly.
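To make the failing case concrete, here is a minimal standalone sketch of the same test. The locale values are assumptions chosen to match the description above (thousands_sep and mon_decimal_point both '.'); they are not read from a real ru_RU locale, and allows_either_decimal_point is a hypothetical name, not a Number::Format function.

```perl
use strict;
use warnings;

# Hypothetical locale values matching the failure described above;
# assumptions for illustration, not read from a real ru_RU locale.
my %locale = (
    decimal_point     => ',',
    thousands_sep     => '.',
    mon_decimal_point => '.',
    mon_thousands_sep => ' ',
);

# Same four-part test as the quoted conditional, as a standalone function.
# Returns 1 when either decimal point may be accepted, 0 otherwise.
sub allows_either_decimal_point {
    my ($l) = @_;
    return ( $l->{mon_decimal_point}
          && $l->{decimal_point}     ne $l->{mon_decimal_point}
          && $l->{decimal_point}     ne $l->{mon_thousands_sep}
          && $l->{mon_decimal_point} ne $l->{thousands_sep} ) ? 1 : 0;
}

# The last comparison ('.' ne '.') is false, so this prints 0.
print allows_either_decimal_point( \%locale ), "\n";
```

With these values only the final comparison fails, which is exactly the "thousands separator equals monetary decimal point" case.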
A short-term solution might be to add an else branch that dies, e.g.:
    ...
    } else {
        die "I don't know what to do with Russian when the thousands ",
            "separator and monetary decimal point are both '",
            $self->{mon_decimal_point}, "'";
    }
Another alternative would be to make the parsing more robust: there
would still be ambiguous cases, of course, but numbers like 1,234,56
could be parsed. A more thorough solution would be to use objects
instead of overloading strings.
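One way the more robust parsing might look is sketched below. This is only an illustration under stated assumptions, not a patch against Number::Format: parse_shared_separator is a hypothetical helper, and the heuristic (a final group that is not exactly three digits must be the fractional part) is my own guess at a reasonable rule. It gives up on the ambiguous cases rather than guessing.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Number::Format. Parses a string in which
# one character ($sep) is both the thousands separator and the decimal point.
# Returns the number as a string with '.' as decimal point, or undef when the
# input is ambiguous or malformed.
sub parse_shared_separator {
    my ($str, $sep) = @_;
    my @groups = split /\Q$sep\E/, $str, -1;
    return unless @groups;                          # empty input
    return if grep { !/\A\d+\z/ } @groups;          # digits only, no empty groups
    return $groups[0] if @groups == 1;              # no separator present

    my $last = pop @groups;
    # A three-digit final group could be thousands or a fraction: ambiguous.
    return if length($last) == 3;

    # Otherwise the final separator is taken as the decimal point; the
    # remaining groups must look like ordinary thousands grouping.
    my ($first, @rest) = @groups;
    return if @rest && length($first) > 3;
    return if grep { length($_) != 3 } @rest;
    return join('', $first, @rest) . '.' . $last;
}

for my $in ('123.456.79', '123.456') {
    my $out = parse_shared_separator($in, '.');
    print "$in => ", (defined $out ? $out : 'undef (ambiguous or malformed)'), "\n";
}
```

Here "123.456.79" yields 123456.79 because the trailing "79" cannot be a thousands group, while "123.456" is rejected as ambiguous.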