Subject: Whitespace handling for \n with /m
Date: Thu, 4 Jan 2018 15:44:37 +0000
To: "bug-Regexp-Grammars [...] rt.cpan.org" <bug-Regexp-Grammars [...] rt.cpan.org>
From: Stefan Eichenberger <se_misc [...] hotmail.com>
Hi Damian,
The code below shows what I believe is inconsistent handling of \n as whitespace under the /m modifier.
I initially raised the issue on StackOverflow (https://stackoverflow.com/questions/48042738/regexpgrammars-handling-n/48084744?noredirect=1#comment83153394_48084744), but I believe the problem lies in the module rather than in my usage.
Admittedly, I'm new to Regexp::Grammars, so I can't rule out user error ...
Thanks for your help,
Stefan
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# this code version reported to bug-Regexp-Grammars, 2018-01-04
use Regexp::Grammars;
my($text, $parser);
$text = "line_1_1,line_1_2\nline_2_1,line_2_2";
my $i = 1;  # running example number, used in the messages below
print "Example $i: 2nd line match contains \\n despite '.' not matching \\n with modifier /m\n";
$parser = qr {
<data>
<rule: data> <[line]>+
<rule: line> .+
}xm;
if ($text =~ $parser) { print "Matched $i"; } else { print "Not matched $i"; }
print "\npause $i...\n\n"; $i++;
print "Example $i: 2nd line match contains \\n despite explicit exclusion\n";
$parser = qr {
<data>
<rule: data> <[line]>+
<rule: line> [^\n]+
}xm;
if ($text =~ $parser) { print "Matched $i"; } else { print "Not matched $i"; }
print "\npause $i...\n\n"; $i++;
print "Example $i: separator \$ seems to consume \\n (using separator \\n also works)\n";
$parser = qr {
<data>
<rule: data> <[line]>+ % $ # Note: \n also works here
<rule: line> .+
}xm;
if ($text =~ $parser) { print "Matched $i"; } else { print "Not matched $i"; }
print "\npause $i...\n\n"; $i++;
print "Example $i: contexts of 'line' matches still contain \\n, but fields no longer; so here explicit exclusion of \\n in rule seems to work\n";
$parser = qr {
<data>
<rule: data> <[line]>+
<rule: line> <[field]>+ % ,
<rule: field> [^,\n]+
}xm;
if ($text =~ $parser) { print "Matched $i"; } else { print "Not matched $i"; }
print "\npause $i...\n\n"; $i++;
print "Example $i: returns 3 fields, where 2nd field contains \\n - probably due to greedy match of 'field'\n";
$parser = qr {
<data>
<rule: data> <[line]>+ % $
<rule: line> <[field]>+ % ,
<rule: field> [^,]+
}xm;
if ($text =~ $parser) { print "Matched $i"; } else { print "Not matched $i"; }
print "\npause $i...\n\n"; $i++;
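
For comparison, here is a minimal sketch of the same matches in plain Perl, without Regexp::Grammars. It is not taken from the module and says nothing about its internals; the names @lines, @fields and $with_newline are only illustrative. It shows the baseline the examples above assume: under /m, '.' stops before \n and $ is zero-width, whereas a greedy [^,]+ runs across the \n as in Example 5.

```perl
use strict;
use warnings;

my $text = "line_1_1,line_1_2\nline_2_1,line_2_2";

# Without /s, '.' never matches "\n"; /m only lets ^ and $ anchor at line breaks.
my @lines = $text =~ /^(.+)$/mg;

# Number of captured lines that contain a newline.
my $with_newline = grep { /\n/ } @lines;

# A greedy negated class that does not exclude "\n" spans the line break,
# as in Example 5.
my @fields = $text =~ /([^,]+)/g;

printf "%d lines, %d of them containing \\n\n", scalar @lines, $with_newline;
printf "%d fields\n", scalar @fields;
```

In plain Perl the two lines come back without any \n, so the \n seen in Examples 1 and 2 would have to come from how the grammar handles whitespace around the subrule, not from '.' or [^\n] themselves.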