Subject: bug in String::Approx 'adist'?
Date: Tue, 8 Apr 2014 05:20:56 +0000
To: "bug-String-Approx [...] rt.cpan.org" <bug-String-Approx [...] rt.cpan.org>
From: Qiongyi Zhao <q.zhao [...] uq.edu.au>
Dear developer(s) of the "String-Approx" module,
I got an unexpected result when computing the edit distance between two strings with adist. My test script is below:
########################## a simple perl script to show the bug ##########################
#!/usr/bin/perl -w
use strict;
use warnings;
use String::Approx 'adist';
my $seq1="ATCTGACACATGTTTACTTTGTAGCTTAGCCCCAACAAACACACACTXAGGAGAGTCTACATTCXCTGCTTGAATCCTAGTTACGACAGCAACAGGTCTG";
my $seq2="ATCTGACACATGTTTACTTTGTAGCTTAGCCCCAACAAACACACACTCAGGAGAGTCTACATTCACTGCTTGAATCCTAGTTACGACAGCAACAGGTCTG";
my $distance = adist($seq1, $seq2); #fuzzy match
print STDERR "$distance\n";
########################## a simple perl script to show the bug ##########################
This script prints "1".
The two strings have the same length and differ at exactly two positions (marked below), so I expected an edit distance of 2. Why is the reported distance only 1? Is this a bug in String::Approx's 'adist'?
my $seq1="ATCTGACACATGTTTACTTTGTAGCTTAGCCCCAACAAACACACACT[X]AGGAGAGTCTACATTC[X]CTGCTTGAATCCTAGTTACGACAGCAACAGGTCTG";
my $seq2="ATCTGACACATGTTTACTTTGTAGCTTAGCCCCAACAAACACACACT[C]AGGAGAGTCTACATTC[A]CTGCTTGAATCCTAGTTACGACAGCAACAGGTCTG";
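As a cross-check that does not depend on String::Approx, here is a minimal sketch of a plain dynamic-programming Levenshtein distance. The helper name levenshtein is my own and is not part of the module. Because the two strings have equal length and differ only at the two marked positions, I would expect it to report 2.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of String::Approx): textbook Levenshtein
# distance, keeping only the previous and current rows of the DP table.
sub levenshtein {
    my ($s, $t) = @_;
    my @s = split //, $s;
    my @t = split //, $t;
    my $n = scalar @s;
    my $m = scalar @t;

    # Row 0: turning the empty prefix of $s into each prefix of $t.
    my @prev = (0 .. $m);

    for my $i (1 .. $n) {
        my @curr = ($i);    # deleting the first $i characters of $s
        for my $j (1 .. $m) {
            my $cost = $s[$i - 1] eq $t[$j - 1] ? 0 : 1;
            my $best = $prev[$j - 1] + $cost;                       # match or substitution
            $best = $prev[$j] + 1     if $prev[$j] + 1 < $best;     # deletion
            $best = $curr[$j - 1] + 1 if $curr[$j - 1] + 1 < $best; # insertion
            push @curr, $best;
        }
        @prev = @curr;
    }
    return $prev[-1];
}

my $seq1 = "ATCTGACACATGTTTACTTTGTAGCTTAGCCCCAACAAACACACACTXAGGAGAGTCTACATTCXCTGCTTGAATCCTAGTTACGACAGCAACAGGTCTG";
my $seq2 = "ATCTGACACATGTTTACTTTGTAGCTTAGCCCCAACAAACACACACTCAGGAGAGTCTACATTCACTGCTTGAATCCTAGTTACGACAGCAACAGGTCTG";

print levenshtein($seq1, $seq2), "\n";
```

This runs in O(n*m) time, which is fine for 100-character sequences, and it compares the whole strings against each other rather than searching for an approximate match.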
I hope you can fix this bug. Thank you very much for your attention and for this nice module.
# Details of my test environment
Distribution name and version: String-Approx-3.27
Perl version: perl 5, version 14, subversion 2 (v5.14.2) built for x86_64-linux-thread-multi
Operating System vendor and version: Linux cluster.qbi.uq.edu.au 2.6.32-220.13.1.el6.x86_64 #1 SMP Tue Apr 17 23:56:34 BST 2012 x86_64 x86_64 x86_64 GNU/Linux
Best regards,
Qiongyi
Qiongyi Zhao | PhD | Bioinformatics
Queensland Brain Institute | The University of Queensland
Tel: +61 7 33466429 | Mobile: 04 01219121 | Email: q.zhao@uq.edu.au