
This queue is for tickets about the Regexp-Grammars CPAN distribution.

Report information
The Basics
Id: 75824
Status: open
Priority: 0/
Queue: Regexp-Grammars

People
Owner: Nobody in particular
Requestors: ingosch [...] gmx.at
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 1.016
Fixed in: 1.016



Hi, sorry for reporting yet another issue.

Whenever I try to access the result hash, or when the program exits, I get a segmentation fault. Here is the simple version of the grammar; the input string is in the attached file.

The grammar parses OK. When I use test data, it is OK. When I use just the first ...<XProtocol{ ... } block, it is OK. When I change <[brb]> to be amnesiac (<.brb>), I get further, even to a dump, yet still get a segfault. Very annoying.

Is this some stack size issue?

    my $parse_simple = do {
        use Regexp::Grammars;
        Regexp::Grammars::set_context_width(50);
        qr{
            #<nocontext:>
            <[simpler]>+ % <[char]>

            <rule: brb>
                <br> (<[sc=char]>? <[bla=brb]>?)* </br>

            <rule: simpler>
                (<char>\<XProtocol>)
                #<debug: on>
                <[brb]>

            <token: char>
                [^{}]*

            <token: br>
                \{
        }xms;
    };
Subject: in.txt

Message body is not shown because it is too large.

Subject: out.txt

Message body is not shown because it is too large.

Subject: Re: [rt.cpan.org #75824]
Date: Sat, 17 Mar 2012 08:21:49 +1100
To: bug-Regexp-Grammars [...] rt.cpan.org
From: Damian Conway <damian [...] conway.org>
> Whenever I try to access the result hash or the program exits, I get a
> segmentation fault.
>
> When I change <[brb]> to be amnesiac (<.brb>), I get further, even to a
> dump, yet still a segfault. Very annoying.
> Is this some stack size issue?
Yes. It's a problem with the underlying regex implementation, specifically with the way the internal data structures grow nonlinearly with increasing input and capture size.

It varies with the version of Perl you're using, but also with your available memory, your operating system (and its virtual memory model), the options with which your version of Perl was compiled, and the number and complexity of rules and captures in your grammar (in particular, how much backtracking it requires).

For example, on my MacBook Pro, with 8GB RAM, running Perl 5.14.2, your grammar can parse an input approximately 16 times larger than the one you attached before I run out of memory and segfault.

There is no work-around for this at present, except to try to create a more efficient grammar, or to switch to a different parser generator (e.g. Marpa).

Sorry,

Damian
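
For readers who only need the nesting structure of the braces, one way to sidestep the regex engine's memory growth might be a small hand-written scanner. The sketch below is not from this ticket and is not part of Regexp::Grammars. It assumes the input consists only of runs of non-brace text and balanced { ... } blocks, and the names parse_braces, text, and blocks are hypothetical. It keeps an explicit stack, so nesting depth costs one array entry per level and does not depend on regex backtracking.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not the ticket's grammar): build a
# nested structure from text containing balanced {...} blocks, using an
# explicit stack and simple \G-anchored matches.
sub parse_braces {
    my ($text) = @_;
    my $root  = { text => [], blocks => [] };
    my @stack = ($root);

    pos($text) = 0;
    while (pos($text) < length $text) {
        if ($text =~ /\G([^{}]+)/gc) {
            # Run of non-brace characters belongs to the current block.
            push @{ $stack[-1]{text} }, $1;
        }
        elsif ($text =~ /\G\{/gc) {
            # Opening brace: attach a child block and descend into it.
            my $block = { text => [], blocks => [] };
            push @{ $stack[-1]{blocks} }, $block;
            push @stack, $block;
        }
        elsif ($text =~ /\G\}/gc) {
            # Closing brace: return to the enclosing block.
            die "Unbalanced '}' at offset " . (pos($text) - 1) . "\n"
                if @stack == 1;
            pop @stack;
        }
    }
    die "Missing '}' at end of input\n" if @stack > 1;
    return $root;
}

# Illustrative input only; the real in.txt attachment is not shown here.
my $tree = parse_braces('XProtocol{ a { b } c }');
printf "top-level blocks: %d\n", scalar @{ $tree->{blocks} };
```

This trades the declarative grammar for manual bookkeeping, so it is only a reasonable swap when the result hash is needed mainly for its nesting and not for the named captures the grammar produces.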