From: Bjoern Hoehrmann <derhoermi [...] gmx.net>
To: bug-Encode [...] rt.cpan.org
Subject: Encode::utf8::decode_xs does not check partial chars
Date: Thu, 21 Oct 2004 21:36:14 +0200
Hi,
% perl -MEncode -e "print decode(q(utf-8), qq(Bj\xF6rn))"
does not work as expected: it should print "Bj\x{FFFD}rn". This is
apparently due to Encode::utf8::decode_xs(), where the code
...
    if ((s + skip) > e) {
        /* Partial character - done */
        break;
    }
...
causes the routine to assume that the octets following that "partial"
character are well-formed UTF-8. This should not be assumed, since it
leads to the unexpected behavior above.
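
To illustrate the kind of check I have in mind, here is a minimal,
self-contained sketch. It is not the Encode source and not a proposed
patch: the names decode_utf8_sketch(), lead_length() and
is_continuation() are made up for this example, and the length table is
a simplified assumption that deliberately lets 0xF0..0xF7 claim four
octets so that 0xF6 reaches the partial-character test. Before treating
a short tail as a partial character, it verifies that every octet
actually present after the lead octet is a continuation octet
(0x80..0xBF); otherwise it emits U+FFFD and resumes one octet later.

```c
#include <stddef.h>
#include <stdint.h>

#define REPLACEMENT 0xFFFDu

/* Hypothetical length table: octets claimed by a lead octet, or 0 if the
 * octet cannot start a sequence. 0xF5..0xF7 are kept as four-octet leads
 * on purpose (an assumption for this sketch). */
static size_t lead_length(unsigned char c)
{
    if (c < 0x80) return 1;
    if (c >= 0xC0 && c <= 0xDF) return 2;
    if (c >= 0xE0 && c <= 0xEF) return 3;
    if (c >= 0xF0 && c <= 0xF7) return 4;
    return 0;
}

static int is_continuation(unsigned char c)
{
    return (c & 0xC0) == 0x80;
}

/* Decode len octets at s into at most cap code points in out.
 * Returns the number of code points written and stores the number of
 * input octets processed in *consumed. A truncated sequence at the end
 * whose octets are well-formed so far is left unconsumed. */
size_t decode_utf8_sketch(const unsigned char *s, size_t len,
                          uint32_t *out, size_t cap, size_t *consumed)
{
    static const uint32_t min_cp[5] = { 0, 0, 0x80, 0x800, 0x10000 };
    const unsigned char *p = s, *e = s + len;
    size_t n = 0;

    while (p < e && n < cap) {
        size_t skip  = lead_length(*p);
        size_t avail = (size_t)(e - p);
        size_t check = skip < avail ? skip : avail;
        size_t i;
        int ok = (skip != 0);
        uint32_t cp;

        /* Validate every trailing octet that is actually present,
         * even when fewer than skip octets remain. */
        for (i = 1; ok && i < check; i++)
            ok = is_continuation(p[i]);

        if (!ok) {                 /* malformed: substitute and resync */
            out[n++] = REPLACEMENT;
            p++;
            continue;
        }
        if (skip > avail)
            break;                 /* genuinely partial character - done */

        /* Assemble the code point from the lead and continuation bits. */
        cp = (skip == 1) ? *p : (uint32_t)(*p & (0xFF >> (skip + 1)));
        for (i = 1; i < skip; i++)
            cp = (cp << 6) | (uint32_t)(p[i] & 0x3F);

        /* Reject overlong forms, surrogates and values above U+10FFFF. */
        if (skip > 1 && (cp < min_cp[skip] || cp > 0x10FFFF ||
                         (cp >= 0xD800 && cp <= 0xDFFF)))
            cp = REPLACEMENT;

        out[n++] = cp;
        p += skip;
    }
    *consumed = (size_t)(p - s);
    return n;
}
```

With the five octets 'B', 'j', 0xF6, 'r', 'n' as input, this sketch
yields the code points U+0042, U+006A, U+FFFD, U+0072, U+006E and
consumes all five octets. Input that really does end in the middle of a
character (for example 'a', 0xE2, 0x82) still stops after the 'a' and
leaves the tail for the caller.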