Subject: Text::Brew treats UTF-8 characters as a sequence of bytes
When computing the edit distance between the following two strings:
vuoddu vuođđu
(the second string should contain two consecutive instances of d with stroke, U+0111,
between the vowels o and u)
Text::Brew reports a distance of 4 instead of the expected 2.
The test pair is from Northern Sámi.
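
The figures 4 and 2 can be reproduced outside Text::Brew. Each d with stroke (U+0111) is encoded in UTF-8 as the two bytes 0xC4 0x91, so a byte-wise comparison sees two substitutions plus two insertions where a character-wise comparison sees only two substitutions. The sketch below is not Text::Brew's code: it is a plain Levenshtein distance written in Python purely for illustration, and it assumes unit costs for insertion, deletion and substitution. The function name levenshtein and the variable names are made up for this example.

```python
def levenshtein(a, b):
    """Unit-cost edit distance between two sequences (str or bytes)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


s1 = "vuoddu"
s2 = "vuo\u0111\u0111u"  # two instances of U+0111 between o and u

# Compare as sequences of characters (code points).
char_distance = levenshtein(s1, s2)

# Compare as sequences of UTF-8 bytes, which is what the report suspects.
byte_distance = levenshtein(s1.encode("utf-8"), s2.encode("utf-8"))

print(char_distance, byte_distance)  # prints "2 4"
```

The byte-wise result matches the distance of 4 reported above, which is consistent with the input strings being handled as undecoded UTF-8 bytes rather than as characters.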
perl -v:
This is perl, v5.8.6 built for darwin-thread-multi-2level
uname -a:
Darwin a84-231-7-118.elisa-laajakaista.fi 8.6.0 Darwin Kernel Version 8.6.0: Tue Mar 7
16:58:48 PST 2006; root:xnu-792.6.70.obj~1/RELEASE_PPC Power Macintosh powerpc
(aka Mac OS X 10.4.6)