Subject: | Prohibit numbered captures in Perl 5.10 |
Perl 5.10 offers a new way of using strings captured in regular
expressions: instead of using $1, $2, etc., you can give the captures
names and refer to them through %+ or its English alias %LAST_PAREN_MATCH
(thanks to Elliot Shank for clarifying that last bit in bug #35969).
This improves code readability, much like use English does, so when
running under a Perl version that supports named captures (5.010 or
higher), Perl::Critic (P::C) should suggest using them instead of $1.
An example is attached.
Suggested new warnings:
* 'Anonymous capture used in regular expression' - when using '(expr)'
instead of '(?<capture_name>expr)' in the regex.
* 'Numbered capture used instead of named captures' - when using any of
the $<digit> variables.
* 'Numbered backreference used in regular expression' - when using \1
instead of \k<capture_name>
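To make the first and third warnings concrete, here is a rough sketch of the two checks that look only at the regex text. It is not a Perl::Critic policy and does not use its API: check_pattern is a hypothetical helper that takes the regex body as a plain string, and it deliberately ignores character classes, escaped backslashes and /x comments, which a real policy would have to parse properly. The $<digit> check is left out because it concerns code outside the regex.

```perl
#!/usr/bin/perl
use 5.010;
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration, not P::C code):
# returns the suggested warnings that apply to a regex body given as a string.
sub check_pattern {
    my ($pattern) = @_;
    my @warnings;

    # An unescaped '(' that is not followed by '?' or '*' opens an
    # anonymous (numbered) capture group.
    if ($pattern =~ / (?<! \\ ) \( (?! [?*] ) /xms) {
        push @warnings, 'Anonymous capture used in regular expression';
    }

    # \1 .. \9 is a numbered backreference; \k<name> is the named form.
    if ($pattern =~ / (?<! \\ ) \\ [1-9] /xms) {
        push @warnings, 'Numbered backreference used in regular expression';
    }

    return @warnings;
}

# Usage: run the checks over a few regex bodies.
for my $pattern ('\b(\w)\b', '(.)\1', '\b(?<one_char_word>\w)\b') {
    my @found = check_pattern($pattern);
    say "$pattern: ", (@found ? join('; ', @found) : 'no warnings');
}
```

A real implementation would presumably work on the parsed document rather than on raw strings, so this only shows which constructs each warning is meant to catch.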
Of course, feel free to suggest further improvements and better wording
for the warnings. (English is not my native language.) I use P::C a lot,
but I am far from being an expert in its intricacies. This suggestion
probably affects other policies, but I'm not sure which ones.
Subject: | named_captures.pl |
#!/usr/bin/perl
use 5.010;
use strict;
use warnings;
use English qw(-no_match_vars);
our $VERSION = 0.1;
my $string = 'Perl is a language for getting your job done.';
# The next line should trigger the warning:
# 'Anonymous capture used in regular expression'
if ($string =~ /\b(\w)\b/xms) {
    # The next line should trigger the warning:
    # 'Numbered capture used instead of named captures'
    say "the string has the one-char word $1";
}
# This block shouldn't trigger any warnings
if ($string =~ /\b(?<one_char_word>\w)\b/xms) {
    say "the string has the one-char word $LAST_PAREN_MATCH{one_char_word}";
}
# The next line should trigger two warnings:
# 'Anonymous capture used in regular expression'
# 'Numbered backreference used in regular expression'
if ($string =~ /(.)\1/xms) {
    # The next line should trigger the warning:
    # 'Numbered capture used instead of named captures'
    say "the string has the repeating char $1";
}
# This block shouldn't trigger any warnings.
# (?<repeating_char>.) and (?'repeating_char'.) are supposed to be equivalent.
if ($string =~ /(?'repeating_char'.)\k{repeating_char}/xms) {
    say "the string has the repeating char $LAST_PAREN_MATCH{repeating_char}";
}
exit;
__END__