Subject: sdiff() quirk for slightly different inputs
Date: Sat, 1 Feb 2020 10:02:03 -0800
To: bug-Algorithm-Diff [...] rt.cpan.org
From: Al Danial <al.danial [...] gmail.com>
Hi, I found some unusual behavior in Algorithm::Diff's sdiff() function:
a single space character in the input changes the results notably.
Here's a program that demonstrates what I'm talking about:
________________________
#!/usr/bin/env perl
use strict;
use warnings;
use Algorithm::Diff qw( sdiff );
# Demonstrate sdiff() quirk for a pair of nearly identical inputs;
# only the 2nd line of each case differs by one space.
# al.danial@gmail.com 2020-02-01
# Case 1:
#
#     Left       Right
#
#     "{ "     | "{ "
#     " } //"  | " } "
#     " }"     | " }"
#     "}"      | "}"
#     "//"     |
#
# Case 2:
#
#     Left       Right
#
#     "{ "     | "{ "
#     " }//"   | " }"
#     " }"     | " }"
#     "}"      | "}"
#     "//"     |
#
# Case 1 sdiff() result is as expected: 3 unchanged, 1 changed, 1 removed
# Case 2 sdiff() result is unexpected : 3 unchanged, 1 added, 2 removed
my @L_1 = (
    '{ ',
    ' } //',
    ' }',
    '}',
    '//'
);
my @L_2 = (
    '{ ',    # == $L_1[0]
    ' }//',  # same as $L_1[1] but no space between brace and /
    ' }',    # == $L_1[2]
    '}',     # == $L_1[3]
    '//'
);
my @R_1 = (
    '{ ',
    ' } ',
    ' }',
    '}'
);
my @R_2 = (
    '{ ',    # == $R_1[0]
    ' }',    # no trailing space compared to $R_1[1]
    ' }',    # == $R_1[2]
    '}'      # == $R_1[3]
);
my @d_1 = sdiff( \@L_1, \@R_1 );
my @d_2 = sdiff( \@L_2, \@R_2 );
print "Algorithm::Diff::VERSION $Algorithm::Diff::VERSION\n";
print_diff("d_1", \@d_1);
print "----\n";
print_diff("d_2", \@d_2);
sub print_diff {
    my ($title, $raa_D) = @_;
    # Each row of the sdiff() result is [ flag, left item, right item ].
    for my $i (0 .. $#{$raa_D}) {
        printf "%s %2d. %s [%-10s] [%-10s]\n",
            $title, $i + 1, $raa_D->[$i][0],
            $raa_D->[$i][1], $raa_D->[$i][2];
    }
}
________________________
The two cases are the same except for a space after the brace on
the second line. Is there any possibility that sdiff() could be updated
so that the result of case #2 matches that of case #1, namely 3 lines
unchanged, 1 changed, and 1 removed?
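
For anyone who wants to check summary counts like these, here is a minimal sketch that tallies the flags in a list of sdiff()-style rows. The helper name tally_flags is hypothetical and not part of Algorithm::Diff, and the rows are hand-written to mirror the case #1 result described above rather than captured program output.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Count how often each flag occurs in a list of [ flag, left, right ] rows.
# Flags: 'u' unchanged, 'c' changed, '+' only on the right, '-' only on the left.
# tally_flags is a hypothetical helper, not an Algorithm::Diff function.
sub tally_flags {
    my ($rows) = @_;
    my %count = ( 'u' => 0, 'c' => 0, '+' => 0, '-' => 0 );
    $count{ $_->[0] }++ for @{$rows};
    return %count;
}

# Hand-written rows in the shape sdiff() returns, mirroring the case #1
# result described in this report (an assumption, not captured output).
my @rows = (
    [ 'u', '{ ',    '{ '  ],
    [ 'c', ' } //', ' } ' ],
    [ 'u', ' }',    ' }'  ],
    [ 'u', '}',     '}'   ],
    [ '-', '//',    ''    ],
);

my %count = tally_flags( \@rows );
printf "unchanged=%d changed=%d added=%d removed=%d\n",
    @count{ 'u', 'c', '+', '-' };
```

With these rows the script prints "unchanged=3 changed=1 added=0 removed=1". Every 'u' or 'c' row consumes one item from each side, a '-' row consumes only a left item, and a '+' row only a right item, so for a 5-line left input and a 4-line right input the counts must satisfy u + c + '-' = 5 and u + c + '+' = 4. Passing \@d_1 or \@d_2 from the program above to the same helper would give the real tallies.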
-- Al