This queue is for tickets about the String-Approx CPAN distribution.

Report information
The Basics
Id: 18242
Status: resolved
Priority: 0/
Queue: String-Approx

People
Owner: Nobody in particular
Requestors: jpritikin [...] pobox.com
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 3.25
Fixed in: (no value)



Subject: bizarre aslice results
On Wed, Mar 15, 2006 at 03:39:56PM +0530, Joshua N Pritikin wrote:
> adist("abase", "disabuse") == 1
>
> I understand why, because adist("abase", "abuse")==1 but actually I want the
> edit distance to include the "dis" prefix as well. In other words, I want:
>
> adistxxx("abase", "disabuse") == 4
FYI, it seems like this usually does what I want:

    sub edist {
        my ($pat, @in) = @_;
        my @s = aslice($pat, ["minimal_distance"], @in);
        my @r;
        for (my $x=0; $x < @in; $x++) {
            my $s = $s[$x];
            my $total = $s->[2] + length($in[$x]) - $s->[1];
            push @r, $total;
        }
        @r;
    }

However, the following seems wrong:

    edist("algesia", "analgesic"); #index=0, size=9, distance=1

I would expect index=2, size=7, distance=1 or index=0, size=9, distance=3.

Here's another example. It seems like some confusion is triggered by a matching postfix:

    edist('rostrum', 'nostrum'); #index=4, size=3, distance=1

The distance is OK, but I don't understand the index and size.
Subject: Re: [rt.cpan.org #18242] bizarre aslice results
Date: Sun, 19 Mar 2006 16:42:31 +0200
To: bug-String-Approx [...] rt.cpan.org
From: Jarkko Hietaniemi <jhietaniemi [...] gmail.com>
For your intended use I suggest using Levenshtein distance instead.
On Sun Mar 19 09:42:23 2006, jhi@iki.fi wrote:
> For your intended use I suggest using Levenshtein distance instead.
Ah, OK ... I find that Text::LevenshteinXS gives sane results and the API is dead simple. I'm sold. Thanks.
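
For readers wondering why Levenshtein distance fits this use case: it compares the two strings in full, so an unmatched prefix such as "dis" adds to the total, whereas adist() looks for the best approximate match of the pattern anywhere inside the input. The following is only a minimal pure-Perl sketch of the textbook dynamic-programming algorithm. The sub name levenshtein is hypothetical; it is not the String::Approx API and not the Text::LevenshteinXS implementation.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hypothetical helper for illustration only (not from String::Approx or
# Text::LevenshteinXS): Levenshtein distance between two whole strings.
sub levenshtein {
    my ($s, $t) = @_;
    my @s = split //, $s;
    my @t = split //, $t;

    # @prev is the previous row of the DP table: the distance from the
    # prefix of $s handled so far to every prefix of $t.
    my @prev = (0 .. scalar @t);
    for my $i (1 .. @s) {
        my @cur = ($i);    # turning $i characters into "" takes $i deletions
        for my $j (1 .. @t) {
            my $cost = $s[$i - 1] eq $t[$j - 1] ? 0 : 1;
            push @cur, min(
                $prev[$j] + 1,            # deletion
                $cur[$j - 1] + 1,         # insertion
                $prev[$j - 1] + $cost,    # substitution (or match)
            );
        }
        @prev = @cur;
    }
    return $prev[-1];
}

print levenshtein("abase",   "disabuse"),  "\n";    # 4
print levenshtein("algesia", "analgesic"), "\n";    # 3
print levenshtein("rostrum", "nostrum"),   "\n";    # 1
```

These values agree with the expectations stated earlier in the ticket: 4 for "abase" against "disabuse", and 3 for "algesia" against "analgesic" when the whole input is counted.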