
This queue is for tickets about the WordNet-Similarity CPAN distribution.

Report information
The Basics
Id: 86441
Status: open
Priority: 0/
Queue: WordNet-Similarity

People
Owner: Nobody in particular
Requestors: TPEDERSE [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: possible bug in depth finding of wup
The following was reported via Hideki Shima of CMU via email. I wanted to get this recorded here as well.

-----------------------------------------------------
(1) WUP: Deepest LCS is not found
-----------------------------------------------------

When coffee#n#1 tea#n#1 is given, the following trace is obtained:

<trace>
HyperTree: *Root*#n#1 entity#n#1 physical_entity#n#1 matter#n#3 substance#n#7 food#n#1 beverage#n#1 coffee#n#1
HyperTree: *Root*#n#1 entity#n#1 physical_entity#n#1 matter#n#3 substance#n#1 fluid#n#1 liquid#n#1 beverage#n#1 coffee#n#1
HyperTree: *Root*#n#1 entity#n#1 abstraction#n#6 relation#n#1 part#n#1 substance#n#1 fluid#n#1 liquid#n#1 beverage#n#1 coffee#n#1
HyperTree: *Root*#n#1 entity#n#1 physical_entity#n#1 matter#n#3 substance#n#7 food#n#1 beverage#n#1 tea#n#1
HyperTree: *Root*#n#1 entity#n#1 physical_entity#n#1 matter#n#3 substance#n#1 fluid#n#1 liquid#n#1 beverage#n#1 tea#n#1
HyperTree: *Root*#n#1 entity#n#1 abstraction#n#6 relation#n#1 part#n#1 substance#n#1 fluid#n#1 liquid#n#1 beverage#n#1 tea#n#1
Lowest Common Subsumers: beverage#n#1 (Depth=7)
Depth(coffee#n#1) = 8
Depth(tea#n#1) = 8
coffee#n#1 tea#n#1 0.875
</trace>

The deepest subsumer seems to be beverage#n#1 (Depth=9) from the 3rd and 6th hyper-tree. If so, the score would be 2 * 9 / (10 + 10) = 0.9

Below are some more of such cases.

wup( goat#n#3 , scapegoat#n#1 ) = 2 * 8 / (9 + 11) = 0.8000?
wup( boy#n#1 , sage#n#1 ) = 2 * 8 / (10 + 11) = 0.7619?
wup( successor#n#1 , precursor#n#2 ) = 2 * 8 / (10 + 9) = 0.842?
wup( h2o#n#1, co2#n#1 ) = 2 * 9 / (11 + 12) = 0.7826?
wup( tobacco#n#1, alcohol#n#1 ) = 2 * 8 / (9 + 9) = 0.8889?

This phenomenon has been observed in about 200 out of 10k (=2%) randomly generated noun-noun pairs of synsets. Currently, wup score is estimated to be lower than the expected score by 0.09 on average over these 2% of pairs.
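
To make the arithmetic in the report easier to check, here is a small Python sketch that recomputes both figures from the six hypernym paths in the trace. It is not the WordNet::Similarity implementation (which is written in Perl); the names `depths` and `wup` are made up for this illustration, and the assumption that the traced score corresponds to taking the shallowest depth of each synset is only one reading that happens to reproduce 0.875.

```python
# Hypernym paths copied from the trace in this ticket (root first).
COFFEE = [
    "*Root*#n#1 entity#n#1 physical_entity#n#1 matter#n#3 substance#n#7 food#n#1 beverage#n#1 coffee#n#1",
    "*Root*#n#1 entity#n#1 physical_entity#n#1 matter#n#3 substance#n#1 fluid#n#1 liquid#n#1 beverage#n#1 coffee#n#1",
    "*Root*#n#1 entity#n#1 abstraction#n#6 relation#n#1 part#n#1 substance#n#1 fluid#n#1 liquid#n#1 beverage#n#1 coffee#n#1",
]
TEA = [
    "*Root*#n#1 entity#n#1 physical_entity#n#1 matter#n#3 substance#n#7 food#n#1 beverage#n#1 tea#n#1",
    "*Root*#n#1 entity#n#1 physical_entity#n#1 matter#n#3 substance#n#1 fluid#n#1 liquid#n#1 beverage#n#1 tea#n#1",
    "*Root*#n#1 entity#n#1 abstraction#n#6 relation#n#1 part#n#1 substance#n#1 fluid#n#1 liquid#n#1 beverage#n#1 tea#n#1",
]


def depths(paths, pick):
    """Map each synset to a single depth (root = 1).

    A synset can sit at different depths on different paths; `pick`
    (min or max) decides which of those depths is kept.
    """
    found = {}
    for path in paths:
        for depth, synset in enumerate(path.split(), start=1):
            found.setdefault(synset, []).append(depth)
    return {synset: pick(ds) for synset, ds in found.items()}


def wup(paths_a, paths_b, pick):
    """Hypothetical helper: 2 * depth(LCS) / (depth(a) + depth(b))."""
    da, db = depths(paths_a, pick), depths(paths_b, pick)
    a = paths_a[0].split()[-1]  # the concept is the last node of its paths
    b = paths_b[0].split()[-1]
    common = set(da) & set(db)  # every synset that subsumes both concepts
    lcs_depth = max(pick(da[s], db[s]) for s in common)
    return 2 * lcs_depth / (da[a] + db[b])


print(wup(COFFEE, TEA, min))  # shallowest depths: 2 * 7 / (8 + 8)
print(wup(COFFEE, TEA, max))  # deepest depths:    2 * 9 / (10 + 10)
```

With `min` the sketch gives the traced score of 0.875, and with `max` it gives the 0.9 that the reporter expects, so the two numbers differ only in which depth of beverage#n#1, coffee#n#1 and tea#n#1 is used.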
This problem has been documented in the TODO list of WordNet-Similarity 2.07. Patches are welcome :)