Subject: SLIF event handling returns impossible events
The documentation states that any SLIF recognizer mutator, such as `read` or `lexeme_read`, can create new events, which should be queried as soon as possible. When I read an externally scanned lexeme via `$recce->lexeme_read` and then query for events, the events that caused the parse to pause in the first place are still present in the return value of `$recce->events`. However, those events are impossible at the G1 location reached after the `lexeme_read`.
User code could probably work around this by keeping its own record of previously seen events and weeding them out, but doing so is prohibitively complicated.
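For completeness, here is a minimal sketch of such a workaround: a filter that suppresses any event name already reported by the immediately preceding query. The function name and the state-passing convention are my own invention, not part of the Marpa::R2 API, and this naive version would also swallow events that legitimately recur, which is part of why the workaround is unsatisfying:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Marpa::R2: $seen is a hashref of
# event names reported by the previous query, $events is an arrayref
# of events as returned by $recce->events (each an arrayref whose
# first element is the event name). Returns only the events whose
# names were not reported last time, then updates the state.
sub fresh_events {
    my ( $seen, $events ) = @_;
    my @fresh = grep { !$seen->{ $_->[0] } } @{$events};
    %{$seen} = map { $_->[0] => 1 } @{$events};
    return @fresh;
}
```

A caller would keep one `%seen` hash per recognizer and call `fresh_events( \%seen, $r->events )` wherever it now calls `$r->events` directly.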
The attached file is a short script that exercises the buggy behaviour. Note that the problem does not occur at input position zero; it first needs some prefix (this script uses the string "start").
This bug is present in Marpa v2.09000 at least.
Attachment: marpa_buggy_events.pl
use strict;
use warnings;
use feature 'say';
use Marpa::R2;
my $g = Marpa::R2::Scanless::G->new({
    source => \q{
        Top ::= 'start' TOKEN OTHER_TOKEN
        # [^\s\S] is the empty character class, so these lexemes can
        # never match internally and must be supplied externally via
        # lexeme_read().
        TOKEN ~ [^\s\S]
        OTHER_TOKEN ~ [^\s\S]
        event ev_token = predicted TOKEN
        event ev_other_token = predicted OTHER_TOKEN
    },
});
my $r = Marpa::R2::Scanless::R->new({ grammar => $g });
# Start the parse. This pauses after the "start".
$r->read(\"start_");
{
my @events = map { $_->[0] } @{ $r->events };
say "[@events]"; # [ev_token]
say $r->show_progress;
}
# Reading the expected token triggers another prediction event.
# This reads the "_" part of the input, starting at the current position.
# The bug is present regardless of the token's length, which can even
# be zero.
$r->lexeme_read( 'TOKEN', $r->pos(), 1, '_' );
{
    # As per the documentation, lexeme_read() is a mutator, so we must
    # query for events afterwards.
my @events = map { $_->[0] } @{ $r->events };
say "[@events]"; # [ev_token ev_other_token]
say $r->show_progress;
# Why is ev_token still here? The progress report clearly shows that
# TOKEN is not predicted at this position.
}