Subject: Wrong Levenshtein distance reported
Date: Fri, 16 Jan 2009 16:13:59 +1100
To: bug-Text-Levenshtein [...] rt.cpan.org
From: "James King" <jamesk.au [...] gmail.com>
I am calling fastdistance with these parameters:
print fastdistance("Distinction courses", "Distinction Courses");
The value printed is 13, not the expected 1.
The only difference between the strings is the capitalisation of the letter
C in the second word (i.e. one substitution).
The value returned appears to equal the number of identical characters
preceding the first differing character, plus one (here, the 12 characters of
"Distinction " plus one gives 13).
If the capital "C" in the second string is changed to a lowercase "d", the
value printed is still 13.
If the capital "C" in the second string is instead changed to a lowercase
"c" and the "O" is capitalised instead, the value increases to 14.
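For comparison, here is a minimal sketch of a textbook dynamic-programming Levenshtein distance run on the same three pairs of strings. The name levenshtein_ref is hypothetical and the code is not taken from Text::Levenshtein; it only illustrates the value a standard implementation should give for a single substitution.

```perl
use strict;
use warnings;

# Hypothetical reference implementation (NOT the module's code):
# classic two-row dynamic-programming edit distance.
sub levenshtein_ref {
    my ($s, $t) = @_;
    my @s = split //, $s;
    my @t = split //, $t;

    # $prev[$j] = distance from the empty string to the first $j chars of $t
    my @prev = (0 .. scalar @t);

    for my $i (1 .. scalar @s) {
        my @cur = ($i);    # distance from first $i chars of $s to the empty string
        for my $j (1 .. scalar @t) {
            my $cost = $s[$i - 1] eq $t[$j - 1] ? 0 : 1;
            my $best = $prev[$j - 1] + $cost;                      # match or substitution
            $best = $prev[$j] + 1    if $prev[$j] + 1    < $best;  # deletion
            $best = $cur[$j - 1] + 1 if $cur[$j - 1] + 1 < $best;  # insertion
            push @cur, $best;
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# The three cases described in this report; each differs by one substitution.
print levenshtein_ref("Distinction courses", "Distinction Courses"), "\n";  # prints 1
print levenshtein_ref("Distinction courses", "Distinction dourses"), "\n";  # prints 1
print levenshtein_ref("Distinction courses", "Distinction cOurses"), "\n";  # prints 1
```

All three pairs are the same length and differ in exactly one position, so the distance is 1 in each case, whereas fastdistance reports 13, 13 and 14.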
Running Perl v5.10.0 built for MSWin32-x86-multi-thread under Vista Home
Premium SP1.
I have tried v0.05 as well as v0.06_01 of Text::Levenshtein and the result
is the same. I am amazed that no one else has encountered (and reported)
this since 2004.
Kind regards
James King