Subject: [PATCH] minor fix at oct/hex escapes
Date: Wed, 20 Feb 2008 16:32:24 -0300
To: bug-yape-regex <bug-YAPE-Regex [...] rt.cpan.org>, "Jeff Pinyan" <pinyan [...] cpan.org>
From: "Adriano Ferreira" <a.r.ferreira [...] gmail.com>
Dear Jeff,
In YAPE-Regex 3.03, the regexes currently used to parse oct escapes
(like \0 or \03) and hex escapes (like \xF and \xAB) only accept the
full-length forms: three octal digits (so \03 has to be written \003)
and two hex digits (so \xF has to be written \x0F).
This patch fixes both. The changes are:
- hex => qr{ \\ x ( [a-fA-F0-9]{2} ) }x,    # current: requires two hex digits
+ hex => qr{ \\ x ( [a-fA-F0-9]{1,2} ) }x,  # accepts one or two hex digits
- oct => qr{ \\ ( [0-3] [0-7] [0-7] ) }x,   # current: requires a digit 0-3 plus two octal digits
+ oct => qr{ \\ ( 0 [0-7]{0,2} ) }x,        # accepts 0 followed by zero to two octal digits
The change in the octal escape touches two other issues:
* \0 now parses as 'oct' (it was parsing as 'slash')
* an octal escape can no longer begin with 1, 2 or 3, which are
interpreted as backrefs inside regexes
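As a rough check of the behaviour described above, the sketch below
applies the old and new 'oct' and 'hex' patterns (copied from the patch)
to a few sample escapes. The %old/%new hashes and the matches_fully
helper are names made up for this illustration and are not part of
YAPE::Regex; the script only exercises the patterns in isolation, not
the module's parser.

```perl
use strict;
use warnings;

# Patterns copied from the diff. %old and %new are illustrative names only.
my %old = (
    hex => qr{ \\ x ( [a-fA-F0-9]{2} ) }x,
    oct => qr{ \\ ( [0-3] [0-7] [0-7] ) }x,
);
my %new = (
    hex => qr{ \\ x ( [a-fA-F0-9]{1,2} ) }x,
    oct => qr{ \\ ( 0 [0-7]{0,2} ) }x,
);

# Hypothetical helper: returns 1 if $pat matches the whole of $str, else 0.
sub matches_fully {
    my ($pat, $str) = @_;
    return $str =~ /\A$pat\z/ ? 1 : 0;
}

# Single-quoted strings keep the backslash literal, so '\xF' is the
# three characters backslash, x, F.
my @cases = (
    [ hex => '\xF'  ],   # one hex digit: only the new pattern accepts it
    [ hex => '\xAB' ],   # two hex digits: accepted by both
    [ oct => '\0'   ],   # bare \0: only the new pattern accepts it
    [ oct => '\03'  ],   # two octal digits: only the new pattern accepts it
    [ oct => '\003' ],   # three octal digits: accepted by both
    [ oct => '\123' ],   # leading 1: accepted by the old pattern only
);

for my $case (@cases) {
    my ($kind, $str) = @$case;
    printf "%-5s %-4s old=%d new=%d\n", $str, $kind,
        matches_fully($old{$kind}, $str),
        matches_fully($new{$kind}, $str);
}
```

The last case shows the side effect mentioned in the second bullet:
with the new pattern, \123 is no longer claimed by 'oct'.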
Corresponding changes were also made to the parsing of oct/hex escapes
inside char classes. The whole patch is at the end of this message and
also attached.
An additional test was provided as well. It uses Test::More. (I can
convert it to Test if you prefer. Or I can convert test.pl into
Test::More if you allow for this change.)
Kind regards,
Adriano Ferreira
diff -ru YAPE-Regex-3.03/Regex.pm YAPE-Regex/Regex.pm
--- YAPE-Regex-3.03/Regex.pm 2007-01-11 11:56:32.000000000 -0200
+++ YAPE-Regex/Regex.pm 2008-02-20 15:32:34.000000000 -0300
@@ -17,8 +17,8 @@
my $ok_cc_REx = qr{
- \\([0-3][0-7]{2}) | # octal escapes
- \\x([a-fA-F0-9]{2}|\{[a-fA-F0-9]+\}) | # hex escapes
+ \\([0-3][0-7]{0,2}) | # octal escapes XXX \400-\777 is valid too, but utf8
+ \\x([a-fA-F0-9]{1,2}|\{[a-fA-F0-9]+\}) | # hex escapes
\\c(.) | # control characters
\\([nrftbae]) | # known \X sequences
\\N\{([^\}]+)\} | # named characters
@@ -49,8 +49,8 @@
anchor => qr{ ( \\ [ABbGZz] | [\^\$] ) }x,
macro => qr{ \\ ( [dDwWsS] ) }x,
- oct => qr{ \\ ( [0-3] [0-7] [0-7] ) }x,
- hex => qr{ \\ x ( [a-fA-F0-9]{2} ) }x,
+ oct => qr{ \\ ( 0 [0-7]{0,2} ) }x,
+ hex => qr{ \\ x ( [a-fA-F0-9]{1,2} ) }x,
utf8hex => qr{ \\ x \{ ( [a-fA-F0-9]+ ) \} }x,
backref => qr{ \\ ( [1-9] \d* ) }x,
ctrl => qr{ \\ c ( . ) }x,