Subject: SLIF event handling returns impossible events
The documentation states that any SLIF recognizer mutator, such as `read` or `lexeme_read`, can create new events, which should be queried as soon as possible. When I read an externally scanned lexeme via `$recce->lexeme_read` and then query for events, the events that caused the parse to pause in the first place are still present in the return value of `$recce->events`. However, those events are impossible at the G1 location reached after the `lexeme_read`.
User code could probably work around this by keeping its own record of previously seen events and weeding them out, but doing so is prohibitively complicated.
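For completeness, here is a minimal sketch of such a workaround: a filter that suppresses any event name already reported by the immediately preceding query. The function name and the state-passing convention are my own invention, not part of the Marpa::R2 API, and this naive version would also swallow events that legitimately recur, which is part of why the workaround is unsatisfying:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Marpa::R2: $seen is a hashref of
# event names reported by the previous query, $events is an arrayref
# of events as returned by $recce->events (each an arrayref whose
# first element is the event name). Returns only the events whose
# names were not reported last time, then updates the state.
sub fresh_events {
    my ( $seen, $events ) = @_;
    my @fresh = grep { !$seen->{ $_->[0] } } @{$events};
    %{$seen} = map { $_->[0] => 1 } @{$events};
    return @fresh;
}
```

A caller would keep one `%seen` hash per recognizer and call `fresh_events( \%seen, $r->events )` wherever it now calls `$r->events` directly.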
The attached file is a short script that exercises the buggy behaviour. Note that the problem does not occur at input position zero; it first needs some prefix (this script uses the string "start").
This bug is present in Marpa v2.09000 at least.
Attachment: marpa_buggy_events.pl
use strict;
use warnings;
use feature 'say';
use Marpa::R2;
my $g = Marpa::R2::Scanless::G->new({
    source => \q{
        Top ::= 'start' TOKEN OTHER_TOKEN
        # [^\s\S] is the empty character class, so these lexemes can
        # never match internally and must be supplied externally via
        # lexeme_read().
        TOKEN ~ [^\s\S]
        OTHER_TOKEN ~ [^\s\S]
        event ev_token = predicted TOKEN
        event ev_other_token = predicted OTHER_TOKEN
    },
});
my $r = Marpa::R2::Scanless::R->new({ grammar => $g });
# Start the parse. This pauses after the "start".
$r->read(\"start_");
{
my @events = map { $_->[0] } @{ $r->events };
say "[@events]"; # [ev_token]
say $r->show_progress;
}
# Reading the expected token triggers another prediction event.
# This reads the "_" part of the input, starting at the current position.
# The bug is present regardless of the token's length, which can even
# be zero.
$r->lexeme_read( 'TOKEN', $r->pos(), 1, '_' );
{
    # As per the documentation, lexeme_read() is a mutator, so we must
    # query for events afterwards.
my @events = map { $_->[0] } @{ $r->events };
say "[@events]"; # [ev_token ev_other_token]
say $r->show_progress;
# Why is ev_token still here? The progress report clearly shows that
# TOKEN is not predicted at this position.
}