Subject: segmentation fault when assigning to $MATCH
The attached file recreates the problem. It's baffling me. The grammar is a pared-down version
of the grammar used in TPath. If you comment out the string modification line in the cleaning
subroutine, or force it to return a constant before hitting this line, it parses fine. If you leave this
line in, there is a segmentation fault. This is on OS X 10.8.2 running Perl v5.12.4 built for
darwin-thread-multi-2level. This does not occur on my Ubuntu box running v5.14.
My guess is that this is caused by invoking the regex engine within the regex engine and is out
of your hands, but here's the report at any rate.
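If that guess is right, one way to check it might be a version of the cleaning subroutine that never enters the regex engine. What follows is only a minimal sketch under that assumption, not something tried against the crashing build; the name clean_escapes_noregex is hypothetical and is not part of TPath or Regexp::Grammars. It is meant to give the same result as the s/\\(.)/$1/g substitution using only index and substr.

```perl
use v5.10;
use strict;
use warnings;

# Hypothetical regex-free stand-in for clean_escapes: removes each
# backslash and keeps the character after it, using only index/substr.
sub clean_escapes_noregex {
    my $m   = shift // '';
    my $out = '';
    my $pos = 0;
    my $len = length $m;
    while ( $pos < $len ) {
        my $i = index( $m, '\\', $pos );
        last if $i < 0;    # no more backslashes
        my $next = $i + 1 < $len ? substr( $m, $i + 1, 1 ) : undef;
        if ( defined $next && $next ne "\n" ) {
            # copy the text before the backslash, then the escaped character
            $out .= substr( $m, $pos, $i - $pos ) . $next;
            $pos = $i + 2;
        }
        else {
            # backslash at end of string or before a newline: keep it,
            # as the original substitution would ("." does not match "\n")
            $out .= substr( $m, $pos, $i - $pos + 1 );
            $pos = $i + 1;
        }
    }
    $out .= substr( $m, $pos ) if $pos < $len;    # remaining tail
    return $out;
}
```

Calling this instead of clean_escapes in the code block of <token: specific> should show whether the nested substitution is what triggers the crash.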
Subject: grammar.pl
use v5.10;
use strict;
use warnings;
use Data::Dumper;
our %AXES = map { $_ => 1 } qw(
    ancestor
    ancestor-or-self
    child
    descendant
    descendant-or-self
    following
    following-sibling
    leaf
    parent
    preceding
    preceding-sibling
    self
    sibling
    sibling-or-self
);
our $path_grammar = do {
    use Regexp::Grammars;
    qr{
        <nocontext:>
        <timeout: 100>
        ^ <path> $
        <token: path> <[segment=first_step]> <[segment=subsequent_step]>*
        <token: first_step> <separator>? <step>
        <token: subsequent_step> <separator> <step>
        <token: separator> \/[\/>]?+
        <token: step> <full> <[predicate]>*
        <token: full> <axis>? <forward>
        <token: axis> (?<!//) (?<!/>) (<%AXES>) :: (?{ $MATCH = $^N })
        <token: forward> <wildcard> | <specific>
        <token: wildcard> \* | <error:>
        <token: specific>
            ( <.name> )
            (?{ $MATCH = clean_escapes($^N) })
        <token: name>
            (?>\\.|[\p{L}\$_])(?>[\p{L}\$\p{N}_]|[-.:](?=[\p{L}_\$\p{N}])|\\.)*+
        <rule: predicate> \[ \d+ \]
    }x;
};
my $path = '//a';
if ( $path =~ $path_grammar ) {
    print Dumper \%/;
}
else {
    say 'Huh. It should have parsed.';
}
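sub clean_escapes {
    my $m = shift // '';
    # The string modification line: commenting this out, or returning a
    # constant before reaching it, avoids the segmentation fault.
    $m =~ s/\\(.)/$1/g;
    return $m;
}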