Subject: possible bug in hso, symmetric versus asymmetric values
The following was reported by Hideki Shima of CMU.
-----------------------------------------------------
(3) HSO: sim(x, y) != sim(y, x)
-----------------------------------------------------
For some pairs of words, HSO is asymmetric, meaning that
argument order affects the score. For instance, the Perl version
of the web interface on maraca returns the following result.
The relatedness of growth#n#6 and vitality#n#2 using hso is 2.
The relatedness of vitality#n#2 and growth#n#6 using hso is 3.
I was able to reproduce the same outcome with the command
line program when the individual words were given as arguments.
However, with the --file option, it gave symmetric results.
<log>
$ cat file.txt
vitality#n#2 growth#n#6
growth#n#6 vitality#n#2
$ perl ./similarity.pl --type WordNet::Similarity::hso --file file.txt
Loading WordNet... done.
Loading Module... done.
vitality#n#2 growth#n#6
vitality#n#2 growth#n#6 3
growth#n#6 vitality#n#2
growth#n#6 vitality#n#2 3
</log>
Below are some more such cases.
average#a#2 excellent#a#1 4
excellent#a#1 average#a#2 0
cotton#v#1 like#v#2 4
like#v#2 cotton#v#1 5
kick_the_bucket#v#1 assassinate#v#1 0
assassinate#v#1 kick_the_bucket#v#1 5
prosecution#n#1 double_jeopardy#n#1 5
double_jeopardy#n#1 prosecution#n#1 4
liberty#n#2 freedom#n#1 4
freedom#n#1 liberty#n#2 5
This phenomenon appears to be very rare: it was not observed in
10k randomly generated noun-noun pairs of synsets.
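
For anyone who wants to scan a larger batch of scores for this
behaviour, here is a minimal sketch in Python. It is not part of
WordNet::Similarity; the function name find_asymmetric is made up
for illustration, and it assumes the scores have already been
collected as lines of the form "word1 word2 score", as in the
cases listed above. Lines that do not match that shape (echoed
input, "Loading ..." messages) are skipped.

```python
def find_asymmetric(lines):
    """Return (x, y, sim_xy, sim_yx) for every pair whose score
    differs between the two argument orders.

    Hypothetical helper: assumes each useful line looks like
    "word1 word2 score".
    """
    scores = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # echoed input pairs, blank lines
        x, y, raw = parts
        try:
            scores[(x, y)] = float(raw)
        except ValueError:
            continue  # e.g. "Loading WordNet... done."

    asymmetric = []
    for (x, y), s in scores.items():
        reverse = scores.get((y, x))
        # (x, y) < (y, x) reports each unordered pair only once
        if reverse is not None and reverse != s and (x, y) < (y, x):
            asymmetric.append((x, y, s, reverse))
    return sorted(asymmetric)


# Two of the pairs reported above, used as sample input.
sample = [
    "average#a#2 excellent#a#1 4",
    "excellent#a#1 average#a#2 0",
    "liberty#n#2 freedom#n#1 4",
    "freedom#n#1 liberty#n#2 5",
]
for x, y, s_xy, s_yx in find_asymmetric(sample):
    print(f"sim({x}, {y}) = {s_xy} but sim({y}, {x}) = {s_yx}")
```

Note that a pair is only compared when both orders appear in the
input, so the score file has to contain each pair twice, once in
each order.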