Subject: Need explicit documentation about characters vs. octets - or, the unicode problem
Bugs like https://rt.cpan.org/Ticket/Display.html?id=92497 show that it's pretty easy to pass a string containing wide characters along to Authen::Passphrase without realizing that, somewhere along the way, it needs to be UTF-8 encoded first.
Since crypt (in perlfunc) is documented to require octets, it would seem that e.g. Crypt::Eksblowfish::Bcrypt::bcrypt_hash should also take octets, but it's unclear at what point before that the UTF-8 encoding should be done. Should $authen_passphrase_obj->match(PASSPHRASE) take octets or characters? That is, should Authen::Passphrase::* do the encoding, or should the caller? Either way, the documentation needs to state clearly which layer does the encoding, and whether callers should pass unencoded characters or encoded octets.
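To make the distinction concrete, here is a minimal sketch of the "caller does the encoding" option, using only core modules. It is not how Authen::Passphrase behaves or should behave; that is exactly the open question. The helper name passphrase_octets is hypothetical, and Digest::SHA is used purely as a stand-in for any octet-only routine, since its functions croak on wide characters.

```perl
use strict;
use warnings;
use Encode qw(encode);
use Digest::SHA qw(sha256_hex);

# Hypothetical helper (not part of any module named in this ticket):
# turn a string of characters into a string of UTF-8 octets.
sub passphrase_octets {
    my ($chars) = @_;
    return encode('UTF-8', $chars);
}

# A passphrase containing U+2603, a character with a code point above 255.
my $passphrase = "pa\x{2603}word";

# Stand-in for an octet-only routine: passing characters straight
# through fails, because a wide character does not fit into a byte.
my $died = !eval { sha256_hex($passphrase); 1 };

# Encoding first yields a byte string that is safe to hand on.
my $octets = passphrase_octets($passphrase);
my $digest = sha256_hex($octets);

printf "characters: %d, octets: %d, unencoded call died: %s\n",
    length($passphrase), length($octets), ($died ? "yes" : "no");
```

If the encoding instead belongs inside Authen::Passphrase::*, the same encode call would move into the module and callers would pass characters. Either way, doing it in both layers would double-encode the passphrase, which is why the docs need to say which layer owns this step.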