Subject: Feature request: 8-bit clean string sorts
My reading of the POD is that you have to use either "fixed" or
"varying" if your input strings may contain NUL bytes. That means that
Sort::Maker sorts are not as 8-bit clean as native Perl sorts. For example:
my @strings = (
    "f\0oo\0\0",
    "f\0oo",
    "f\0oo\0\0\0\0\0",
    "f\0oo\0",
    "f\0oo\0\0\0\0",
    "f\0oo\0\0\0",
);
Perl's native sort will sort those shortest first, as one would expect,
but Sort::Maker would see them all as equivalent under either "fixed" or
"varying", because of the way it pads keys with NUL bytes.
Suggested fix: add an encode_nulls option that causes NUL bytes in
string keys to be encoded. One way to encode them while preserving
lexical order is to replace each byte 0 in the input with the pair of
bytes (1,1) and each byte 1 in the input with the pair of bytes (1,2).
The encoded keys then contain no NUL bytes, so padding can no longer be
mistaken for key data.
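
To make the proposal concrete, here is a rough sketch of the encoding
in plain Perl. It does not use Sort::Maker at all: encode_nuls,
decode_nuls and pad_keys are hypothetical names of mine, and pad_keys
only assumes that keys are padded with NUL bytes to a common length, as
I understand the POD to describe. It compares the native order, the
padded raw keys, and the padded encoded keys for the strings above.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper (not part of Sort::Maker):
# byte 0 -> (1,1), byte 1 -> (1,2); all other bytes pass through.
sub encode_nuls {
    my ($key) = @_;
    $key =~ s/([\x00\x01])/$1 eq "\x00" ? "\x01\x01" : "\x01\x02"/ge;
    return $key;
}

# Inverse of encode_nuls: (1,1) -> byte 0, (1,2) -> byte 1.
sub decode_nuls {
    my ($enc) = @_;
    $enc =~ s/\x01([\x01\x02])/$1 eq "\x01" ? "\x00" : "\x01"/ge;
    return $enc;
}

# Stand-in for NUL padding (an assumption for illustration, not
# Sort::Maker's actual code): pad every key to the longest key's length.
sub pad_keys {
    my @keys  = @_;
    my $width = max map { length($_) } @keys;
    return map { pack("a$width", $_) } @keys;   # 'a' pads with NUL bytes
}

my @strings = (
    "f\0oo\0\0",
    "f\0oo",
    "f\0oo\0\0\0\0\0",
    "f\0oo\0",
    "f\0oo\0\0\0\0",
    "f\0oo\0\0\0",
);

# Native sort: a string sorts before any longer string it is a prefix of.
my @native = sort @strings;

# Raw keys all collapse to the same value once padded.
my %raw_seen = map { $_ => 1 } pad_keys(@strings);

# Encoded keys stay distinct after padding; sort the originals by them.
my @padded = pad_keys(map { encode_nuls($_) } @strings);
my @order  = sort { $padded[$a] cmp $padded[$b] } 0 .. $#strings;
my @sorted = @strings[@order];

print "distinct raw padded keys: ", scalar(keys %raw_seen), "\n";  # 1
print "native lengths:  @{[ map { length($_) } @native ]}\n";      # 4 5 6 7 8 9
print "encoded lengths: @{[ map { length($_) } @sorted ]}\n";      # 4 5 6 7 8 9
```

Byte 1 acts as an escape byte here, so the cost is that a key can grow
to at most twice its original length. Decoding is only needed if the
original key has to be recovered from the encoded one.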