
This queue is for tickets about the Regexp-Grammars CPAN distribution.

Report information
The Basics
Id: 83902
Status: open
Priority: 0/
Queue: Regexp-Grammars

People
Owner: Nobody in particular
Requestors: DFH [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 1.001004
Fixed in: (no value)



Subject: segmentation fault when assigning to $MATCH
The attached file recreates the problem. It's baffling me. The grammar is a pared-down version of the grammar used in TPath. If you comment out the string modification line in the cleaning subroutine (clean_escapes), or force it to return a constant before hitting this line, it parses fine. If you leave this line in, there is a segmentation fault. This is on OS X 10.8.2 running Perl v5.12.4 built for darwin-thread-multi-2level. This does not occur on my Ubuntu box running v5.14. My guess is that this is caused by invoking the regex engine within the regex engine and is out of your hands, but here's the report at any rate.
Subject: grammar.pl
use v5.10;
use strict;
use warnings;
use Data::Dumper;

our %AXES = map { $_ => 1 } qw(
    ancestor ancestor-or-self child descendant descendant-or-self
    following following-sibling leaf parent preceding preceding-sibling
    self sibling sibling-or-self
);

our $path_grammar = do {
    use Regexp::Grammars;
    qr{
        <nocontext:>
        <timeout: 100>

        ^ <path> $

        <token: path>
            <[segment=first_step]> <[segment=subsequent_step]>*

        <token: first_step>
            <separator>? <step>

        <token: subsequent_step>
            <separator> <step>

        <token: separator>
            \/[\/>]?+

        <token: step>
            <full> <[predicate]>*

        <token: full>
            <axis>? <forward>

        <token: axis>
            (?<!//) (?<!/>) (<%AXES>) ::
            (?{ $MATCH = $^N })

        <token: forward>
            <wildcard> | <specific>

        <token: wildcard>
            \* | <error:>

        <token: specific>
            ( <.name> )
            (?{ $MATCH = clean_escapes($^N) })

        <token: name>
            (?>\\.|[\p{L}\$_])(?>[\p{L}\$\p{N}_]|[-.:](?=[\p{L}_\$\p{N}])|\\.)*+

        <rule: predicate>
            \[ \d+ \]
    }x;
};

my $path = '//a';
if ( $path =~ $path_grammar ) {
    print Dumper \%/;
}
else {
    say 'Huh. It should have parsed.';
}

# Called from inside the grammar's (?{ ... }) block in <token: specific>.
sub clean_escapes {
    my $m = shift // '';
    $m =~ s/\\(.)/$1/g;    # the string modification line that triggers the segfault
    return $m;
}
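One possible way to sidestep the nested match described in the report is to keep the substitution out of the (?{ ... }) block entirely: let the outer match finish, then clean the captured text. The following is only a sketch under stated assumptions. It uses a plain regex instead of Regexp::Grammars, `$NAME` is a copy of the `name` token from the attachment, and `parse_name` is a hypothetical helper introduced here for illustration. It has not been tested against the Perl 5.12.4 build where the segfault was seen.

```perl
use v5.10;
use strict;
use warnings;

# Copy of the 'name' token from the attached grammar, as a standalone regex.
my $NAME = qr{
    (?> \\. | [\p{L}\$_] )
    (?> [\p{L}\$\p{N}_] | [-.:] (?= [\p{L}_\$\p{N}] ) | \\. )*+
}x;

# Same body as in the attachment.
sub clean_escapes {
    my $m = shift // '';
    $m =~ s/\\(.)/$1/g;
    return $m;
}

# Hypothetical helper: match first, clean afterwards, so the substitution
# never runs while the outer match is still in progress.
sub parse_name {
    my ($text) = @_;
    return unless $text =~ m{ \A ( $NAME ) \z }x;
    my $raw = $1;    # copy the capture before running another regex
    return clean_escapes($raw);
}

say parse_name('a\\:b');    # prints a:b
```

In the full grammar, the analogous change would presumably be to store the raw capture in `$MATCH` and apply `clean_escapes` to the result hash after the match returns; that variant is an assumption, not something confirmed in this ticket.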
Subject: Re: [rt.cpan.org #83902] segmentation fault when assigning to $MATCH
Date: Tue, 12 Mar 2013 13:56:54 +0000
To: bug-Regexp-Grammars [...] rt.cpan.org
From: Damian Conway <damian [...] conway.org>
Thanks, David.

I can confirm that there's no problem under 5.14 on MacOS. Like you, I'm guessing it's the nested regex match...and hence beyond my help.

Out of interest, if you replace clean_escapes() with a non-regexy version:

    sub clean_escapes {
        my $m = shift // '';
        while (1) {
            my $offset = index($m, '\\');
            last if $offset < 0 || $offset == length($m)-1;
            substr($m, $offset, 1) = q{};
        }
        return $m;
    }

...does that eliminate your segfault?

Damian
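The loop suggested above always searches again from the start of the string, so an escaped backslash (two backslashes in a row) loses both characters, whereas `s/\\(.)/$1/g` keeps one. If that case matters, a regex-free version can scan left to right and skip over each escaped character. This is a minimal sketch, not code from the ticket; it keeps the name `clean_escapes` only so that it could be dropped into the attachment.

```perl
use v5.10;
use strict;
use warnings;

# Regex-free unescaping: remove each backslash and keep the character after it.
# A trailing backslash with nothing after it is left in place.
# (Unlike the regex, this also unescapes a backslash followed by a newline.)
sub clean_escapes {
    my $m   = shift // '';
    my $out = '';
    my $pos = 0;
    while ( ( my $offset = index( $m, '\\', $pos ) ) >= 0 ) {
        last if $offset == length($m) - 1;              # trailing backslash
        $out .= substr( $m, $pos, $offset - $pos );     # text before the backslash
        $out .= substr( $m, $offset + 1, 1 );           # the escaped character
        $pos = $offset + 2;                             # resume after the pair
    }
    return $out . substr( $m, $pos );                   # remaining text
}

say clean_escapes('a\\\\b');    # input a\\b, prints a\b
```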
CC: DFH [...] cpan.org
Subject: Re: [rt.cpan.org #83902] segmentation fault when assigning to $MATCH
Date: Tue, 12 Mar 2013 10:50:58 -0400
To: bug-Regexp-Grammars [...] rt.cpan.org
From: David Houghton <dfhoughton [...] gmail.com>
Yes, I actually tried that immediately after filing the bug report and there were no more segfaults. So it's the regex engine.

On Tue, Mar 12, 2013 at 9:57 AM, damian@conway.org via RT <bug-Regexp-Grammars@rt.cpan.org> wrote:
> <URL: https://rt.cpan.org/Ticket/Display.html?id=83902 >
>
> Thanks, David.
>
> I can confirm that there's no problem under 5.14 on MacOS.
> Like you, I'm guessing it's the nested regex match...and hence
> beyond my help.
>
> Out of interest, if you replace clean_escapes() with a non-regexy version:
>
>     sub clean_escapes {
>         my $m = shift // '';
>         while (1) {
>             my $offset = index($m, '\\');
>             last if $offset < 0 || $offset == length($m)-1;
>             substr($m, $offset, 1) = q{};
>         }
>         return $m;
>     }
>
> ...does that eliminate your segfault?
>
> Damian