Subject: losing string value of semi-numeric string
Date: Mon, 2 Feb 2015 10:33:01 +0000
To: bug-Sereal-Encoder [...] rt.cpan.org
From: Zefram <zefram [...] fysh.org>
$ perl -MSereal::Encoder=encode_sereal -MSereal::Decoder=decode_sereal -lwe 'print $]; print $Sereal::Encoder::VERSION; my $a="0 but true"; print decode_sereal(encode_sereal($a)); my $b = $a+0; print $a; print decode_sereal(encode_sereal($a));'
5.018002
3.005
0 but true
0 but true
0
I believe the first encoding is representing $a as a string but the
second encoding is representing it as a pure integer, based on the IOK
flag. In the case of this string, along with infinitely many others
such as "00", "01", and "1 ", the integer representation is lossy.
It's particularly significant for strings such as "0 but true" and "00"
which qualify as true but come out as false when mangled by the lossy
encoding. But even when the truth value doesn't change, it is not at
all acceptable to lose the string value.
The underlying mistake is that you've treated the IOK flag as implying
that the scalar is fully characterised by its IV. In general that is
not the case. For scalars that are both IOK and POK, to see whether
integer representation suffices you need to perform the IV->PV coercion
yourself, and see whether the PV generated from the IV matches the
scalar's actual PV. Similar remarks apply to NOK and NV. For extra fun,
the exact meaning of the [PIN]OK flags varies between Perl versions.
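The round-trip check described above can be sketched in pure Perl with the core B module. This is only an illustration of the idea, not the encoder's actual code; the helper name iv_suffices is hypothetical. Since the flag semantics differ between Perl versions, which scalars reach the comparison branch will vary.

```perl
use strict;
use warnings;
use B qw(svref_2object SVf_IOK SVf_POK);

# Hypothetical helper: returns 1 if the scalar behind $ref can be
# encoded as a bare integer without losing its string value, else 0.
sub iv_suffices {
    my ($ref) = @_;
    my $sv    = svref_2object($ref);
    my $flags = $sv->FLAGS;

    # No valid integer value at all: integer encoding is not an option.
    return 0 unless $flags & SVf_IOK;

    # Integer only, no string value to lose.
    return 1 unless $flags & SVf_POK;

    # Both IOK and POK: stringify the IV ourselves and compare it
    # with the scalar's actual PV.
    my $iv = $sv->int_value;    # honours the IsUV flag
    my $pv = $sv->PV;
    return "$iv" eq $pv ? 1 : 0;
}
```

For a scalar holding "0 but true" whose IV has also been computed, the IV stringifies as "0", which does not match the PV, so the string encoding would have to be kept. An analogous comparison would be needed for NOK scalars and their NV.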
-zefram