
This queue is for tickets about the Regexp-Common CPAN distribution.

Report information
The Basics
Id: 55549
Status: rejected
Priority: 0/
Queue: Regexp-Common

People
Owner: Nobody in particular
Requestors: scop [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 2010010201
Fixed in: (no value)



Subject: Word boundary limits or documentation missing
For example, $RE{URI}{HTTP} finds HTTP URIs in "foohttp://..." and "foo-http://...". Avoiding matching the foo-http case could be somewhat tricky, but placing a simple \b before the scheme names in URI would be an improvement in my opinion.

I think Regexp::Common should either place word boundary limits where appropriate, or document that people should be dealing with them themselves (I couldn't find such a note in the docs). Not sure if the issue exists with other regexps as well besides the URI one I checked.
On Sun Mar 14 05:27:39 2010, SCOP wrote:
> For example, $RE{URI}{HTTP} finds HTTP URIs in "foohttp://..." and
> "foo-http://...". Avoiding matching the foo-http case could be somewhat
> tricky, but placing a simple \b before the scheme names in URI would be
> an improvement in my opinion.
>
> I think Regexp::Common should either place word boundary limits where
> appropriate, or document that people should be dealing with them
> themselves (I couldn't find such a note in the docs). Not sure if the
> issue exists with other regexps as well besides the URI one I checked.
Regexp::Common does not behave differently from any other pattern: searching for /http/ also matches "foohttp". It would be easy to add an anchor to all the patterns, but then they would be unusable for anyone who wants to find unanchored matches. And if anchors were added, which one should it be? \b? ^? \A? \G? It's better to leave this to the user - only the user can know whether an anchor is needed, and which anchor it ought to be.
Won't fix, but a note has been added to the documentation.
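
For readers landing on this ticket, here is a minimal sketch of what "leave the anchor to the user" can look like in practice. It is an illustration, not code from the distribution: the $http pattern is a simplified, hypothetical stand-in so the script runs without Regexp::Common installed, and the hash keys and the matches() helper are names made up for this example. With the module available, $RE{URI}{HTTP} would take the place of $http.

```perl
use strict;
use warnings;

# Hypothetical, simplified stand-in for $RE{URI}{HTTP}, used only so this
# sketch runs without the module. With Regexp::Common installed:
#   use Regexp::Common qw(URI);
#   my $http = $RE{URI}{HTTP};
my $http = qr{http://[^\s"<>]+};

# The caller wraps the unanchored pattern with whatever anchor fits the input.
my %anchored = (
    none          => qr{$http},              # matches anywhere in the string
    word_boundary => qr{\b$http},            # rejects "foohttp", accepts "foo-http"
    strict_before => qr{(?<![\w-])$http},    # rejects both "foohttp" and "foo-http"
    string_start  => qr{\A$http},            # only at the very start of the string
);

# Return 1 if $text matches $re, 0 otherwise.
sub matches {
    my ($re, $text) = @_;
    return $text =~ $re ? 1 : 0;
}

my @inputs = (
    'http://example.com/d',
    'see http://example.com/a',
    'foohttp://example.com/b',
    'foo-http://example.com/c',
);

# Print a small table: one row per anchor choice and input string.
for my $name (sort keys %anchored) {
    for my $text (@inputs) {
        printf "%-14s %-28s %s\n", $name, $text,
            matches($anchored{$name}, $text) ? 'match' : 'no match';
    }
}
```

The \b variant still accepts "foo-http://..." because a word boundary exists between "-" and "h", which is the tricky case the report mentions. The negative lookbehind is one way a caller might exclude it, and whether that is desirable depends on the input being scanned.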