Subject: Minor bug in class regex
Hello,
I found a minor bug in the regular expression for a class:
class => qr{ \\ ([Pp]) ( [A-Za-z] | \{ [a-zA-Z]+ \} ) | \[ ( \^? )
( \]? [^][\\]* (?: (?: \[:\w+:\] | \[ (?!:) | \\. ) [^][\\]* )* ) \] }x,
The problem occurs when this pattern is interpolated into a regular
expression that starts with ^:
if ($self->{CONTENT} =~ s/^$pat{class}//)
Because the second | (the top-level one) is not inside a () pair, the regex
effectively looks like this:
^ \\ ([Pp]) ( a and z stuff ) | more stuff
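So the string
'\p{dvfdvfd}'
will match and
' \p{dvfdvfd}'
will not, as expected. But both
'[01[:alpha:]%]'
and
' [01[:alpha:]%]'
will match, because the ^ only applies to the first alternative. It looks
like a precedence issue: | binds more loosely than the ^ and the rest of
the sequence it is attached to.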
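The precedence effect can be reproduced with a much smaller pattern. The sketch below is not the module's code: $class is a hypothetical, simplified stand-in for the class pattern, and matches_unguarded is a made-up helper name. The pattern is deliberately held as a plain string so that it is spliced into the match exactly as written; that is an assumption of the sketch, since a compiled qr// object carries its own (?:...) wrapper when it is interpolated.

```perl
use strict;
use warnings;

# Hypothetical, simplified stand-in for the class pattern (NOT the real one):
# alternative 1 is a \p{...} property, alternative 2 is a bracketed class.
# Held as a plain string so it is spliced into the match verbatim.
my $class = '\\\\[Pp]\{[a-zA-Z]+\}|\[.+\]';

# Same shape as the call in question: a ^ followed by the interpolated pattern.
# This is parsed as (^ alternative-1) | (alternative-2).
sub matches_unguarded {
    my ($text) = @_;
    return $text =~ /^$class/ ? 1 : 0;
}

for my $text ('\p{dvfdvfd}', ' \p{dvfdvfd}', '[01[:alpha:]%]', ' [01[:alpha:]%]') {
    printf "'%s' => %d\n", $text, matches_unguarded($text);
}
# Prints 1, 0, 1, 1: the bracketed class with a leading space still matches.
```

Only the first alternative is tied to the start of the string, which is why the leading space stops the \p{...} case but not the bracketed one.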
So the whole regex needs to be wrapped in (?: ... ) or, probably simpler,
a ^ can be added right after the second |.
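Both suggestions can be sketched on a cut-down pattern. This is an illustration under assumptions, not a patch: $class and $class_anchored are hypothetical, simplified stand-ins for the real class pattern, kept as plain strings so they are interpolated verbatim, and the two helper subs are made-up names.

```perl
use strict;
use warnings;

# Hypothetical, simplified stand-in for the class pattern, as a plain string.
my $class = '\\\\[Pp]\{[a-zA-Z]+\}|\[.+\]';

# Option 1: group at the point of use, so ^ applies to both alternatives.
sub matches_grouped {
    my ($text) = @_;
    return $text =~ /^(?:$class)/ ? 1 : 0;
}

# Option 2: repeat the anchor inside the pattern, after the top-level |.
my $class_anchored = '\\\\[Pp]\{[a-zA-Z]+\}|^\[.+\]';
sub matches_anchored {
    my ($text) = @_;
    return $text =~ /^$class_anchored/ ? 1 : 0;
}

for my $text ('\p{dvfdvfd}', '[01[:alpha:]%]', ' [01[:alpha:]%]') {
    printf "'%s' => grouped %d, anchored %d\n",
        $text, matches_grouped($text), matches_anchored($text);
}
# Prints 1/1, 1/1 and 0/0: the leading-space case is now rejected either way.
```

Grouping leaves the pattern itself unanchored, so it stays usable in other positions; adding the ^ after the | bakes the anchor into the pattern.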
Best,
Blair