Date: Fri, 22 Apr 2005 14:16:43 +0200
From: dakkar <dakkar [...] thenautilus.net>
To: tony [...] tmtm.com
Subject: [Plucene] patch for Plucene::QueryParser
It's me again. This time I've found a bug in the Plucene::QueryParser
code, in particular in the handling of 'NOT'.
A query consisting of the word 'notice' would get parsed as 'NOT ice',
which is almost certainly not what was intended.
The attached patch recognizes NOT as an operator only if it is followed
by a character matching [^\w:], that is, anything other than a word
character or a colon. It could be better (using the tokenizer, for
example), but it's still better than the previous behaviour.
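As a rough illustration of the difference, the sketch below runs the old
and the new modifier pattern (both copied from the attached diff) over a
few sample queries. The helper parse_mods and the sample queries are
hypothetical and not part of Plucene; the helper only mimics the single
substitution step touched by the patch, not the whole parser.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Plucene): applies one modifier
# pattern to a copy of the query and returns the detected modifier
# together with whatever text is left after the substitution.
sub parse_mods {
    my ($re, $query) = @_;
    my $rest = $query;
    my $mods = ($rest =~ s/$re//) ? "NOT" : "NONE";
    return ($mods, $rest);
}

my $old = qr/^(-|!|NOT)\s*/i;              # pattern before the patch
my $new = qr/^(-|!|NOT(?=[^\w:]))\s*/i;    # pattern after the patch

for my $query ('notice', 'NOT ice', 'not:ice', '-ice') {
    my ($old_mods, $old_rest) = parse_mods($old, $query);
    my ($new_mods, $new_rest) = parse_mods($new, $query);
    printf "%-8s old: %-4s '%s'   new: %-4s '%s'\n",
        $query, $old_mods, $old_rest, $new_mods, $new_rest;
}
```

With the old pattern, 'notice' loses its first three letters and is
flagged as NOT; with the new one it is left alone. Excluding ':' in the
lookahead also leaves 'not:ice' intact, so the field pattern on the next
line of the parser can presumably still pick up a field named 'not'.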
Hope this is useful.
--
Dakkar - <Mobilis in mobile>
GPG public key fingerprint = A071 E618 DD2C 5901 9574
6FE2 40EA 9883 7519 3F88
key id = 0x75193F88
--- .cpanplus/5.8.2/build/Plucene-1.20/lib/Plucene/QueryParser.pm 2004-02-04 12:38:19.000000000 +0100
+++ /usr/lib/perl5/site_perl/5.8.2/Plucene/QueryParser.pm 2005-04-22 14:10:09.000000000 +0200
@@ -77,7 +77,7 @@
       if $item->{conj} eq "||";
     }
     if (s/^\+//) { $item->{mods} = "REQ"; }
-    elsif (s/^(-|!|NOT)\s*//i) { $item->{mods} = "NOT"; }
+    elsif (s/^(-|!|NOT(?=[^\w:]))\s*//i) { $item->{mods} = "NOT"; }
     else { $item->{mods} = "NONE"; }
     if (s/^([^\s(":]+)://) { $item->{field} = $1 }