
This queue is for tickets about the Regexp-Grammars CPAN distribution.

Report information
The Basics
Id: 124825
Status: rejected
Priority: 0/
Queue: Regexp-Grammars

People
Owner: Nobody in particular
Requestors: alexchandel [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Subrule returns empty string instead of value with MATCH
When a subrule with a 0-or-more list matches 0 elements and assigns that match to MATCH, an empty string is returned to the calling rule, rather than the value of the expression (undef in this case, although an empty array would make more sense).

    <nocontext:>
    <token: decls>
        <[MATCH=decl]>*
    <token: body>
        <decls>

Note that this does NOT happen with an explicit, separate MATCH tag:

    <nocontext:>
    <token: decls>
        <[decl]>* <MATCH=(?{ $MATCH{decl} })>
    <token: body>
        <decls>

This may indicate an issue with the implementation of MATCH in a list-like subrule call.
Subject: Re: [rt.cpan.org #124825] Subrule returns empty string instead of value with MATCH
Date: Tue, 20 Mar 2018 10:15:43 +1100
To: bug-Regexp-Grammars [...] rt.cpan.org
From: Damian Conway <damian [...] conway.org>
Hi Alex,

Thanks for the report, but (in my view) this is not a bug; it's the defined behaviour...albeit, perhaps a surprising one.

You're assuming that:

    <[MATCH=decl]>*

is a shorthand for:

    <[decl]>* <MATCH=(?{ $MATCH{decl} })>

Although that is entirely plausible, it's not the case. It's actually a shorthand for:

    (?:
        (<decl>)
        (?{ $MATCH = [] unless ref($MATCH) eq 'ARRAY'; push @$MATCH, $CAPTURE })
    )*

In other words, each time the <[decl]> subtoken matches, what it matches is pushed onto $MATCH, converting it to an arrayref on the first occasion.

But when the zero-or-more quantifier matches zero times, the trailing code is never executed, so the default value of $MATCH (an empty string) is never modified.

I don't propose to change that behaviour, but I will add a caveat to the documentation pointing it out.

Meanwhile, the workaround is to set up the default return value you want yourself, either using the variant you included in your report, or else with:

    (?{ $MATCH = [] })
    <[decl]>*

I'm sorry for the trouble this caused you.

Damian
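The mechanism described above can be reproduced in plain Perl, without Regexp::Grammars. The following is only a minimal sketch of the idea, not the module's actual generated code: the package variable $MATCH_MODEL is a hypothetical stand-in for $MATCH, the pattern \w+; is an assumed stand-in for the <decl> token, and $^N plays the role of $CAPTURE. The $preset flag models the workaround of initialising the result to [] before the repetition.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Regexp::Grammars' $MATCH (an assumption for
# illustration; the real module keeps its results elsewhere).
our $MATCH_MODEL;

# Models <[MATCH=decl]>* where a "decl" is assumed to be a word plus ';'.
# With $preset true, the result is initialised to [] first (the workaround).
sub collect_decls {
    my ($text, $preset) = @_;
    local $MATCH_MODEL = $preset ? [] : '';   # '' models the default $MATCH

    my $ok = $text =~ m{
        \A
        (?:
            ( \w+ ; )
            (?{
                # Runs once per successful iteration, never for zero iterations
                $MATCH_MODEL = [] unless ref($MATCH_MODEL) eq 'ARRAY';
                push @{$MATCH_MODEL}, $^N;    # $^N: text of the group just closed
            })
        )*
        \z
    }x;
    return unless $ok;

    my $result = $MATCH_MODEL;                # copy before local() unwinds
    return $result;
}

for my $case ( [ 'a;b;', 0 ], [ '', 0 ], [ '', 1 ] ) {
    my ($text, $preset) = @{$case};
    my $got = collect_decls($text, $preset);
    printf "text=%-6s preset=%d => %s\n", "'$text'", $preset,
        ref($got) ? '[' . join(', ', @{$got}) . ']' : "'$got'";
}
```

In this model, two declarations yield an arrayref, zero declarations leave the empty-string default untouched, and presetting the value yields an empty arrayref instead.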
From: alexchandel [...] gmail.com
Hi Damian,

Thank you for the default value tip, I didn't realize it was possible to do something like that!

With regards to $MATCH returning an empty string when the zero-or-more quantifier matches zero times, why does something like this:

    <nocontext:>
    <token: decls>
        <[decl]>*
    <token: body>
        <decls>

result in <[decl]> being undef when it matches zero times, while the default $MATCH for <[decl]> is an empty string when it matches zero times?
Subject: Re: [rt.cpan.org #124825] Subrule returns empty string instead of value with MATCH
Date: Wed, 21 Mar 2018 13:57:08 +1100
To: bug-Regexp-Grammars [...] rt.cpan.org
From: Damian Conway <damian [...] conway.org>
> With regards to $MATCH returning an empty string when the zero-or-more
> quantifier matches zero times, why does something like this:
>
>     <nocontext:>
>     <token: decls>
>         <[decl]>*
>     <token: body>
>         <decls>
>
> result in <[decl]> being undef when it matches zero times,
> while the default $MATCH for <[decl]> is an empty string
> when it matches zero times?
For more or less the same reason. The approximate translation of:

    <token: decls>
        <[decl]>*

is:

    (?<decls>
        (
            (?:
                ( (?&decl) )
                (?{ push @{$MATCH{decl}}, $CAPTURE })
            )*
        )
        (?{ $MATCH //= $CAPTURE })
    )

So, $MATCH{decl} is only autovivified if one or more of the subcalls succeeds, but $MATCH is set each time, even if the repetition matches zero times.

Damian
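The difference between the two values can be modelled in plain Perl as well. This is a rough sketch under stated assumptions rather than what Regexp::Grammars actually emits: %SUBMATCH and $RESULT are hypothetical stand-ins for %MATCH and $MATCH, \w+; is an assumed <decl> pattern, and $^N stands in for $CAPTURE.

```perl
use strict;
use warnings;
use 5.010;    # for the //= operator

# Hypothetical stand-ins for %MATCH and $MATCH (assumptions for illustration).
our %SUBMATCH;
our $RESULT;

# Models the approximate translation of: <token: decls> <[decl]>*
sub match_decls {
    my ($text) = @_;
    local %SUBMATCH = ();
    local $RESULT;

    my $ok = $text =~ m{
        \A
        (
            (?:
                ( \w+ ; )
                # Only reached when a decl matched, so the list is
                # autovivified only after at least one success
                (?{ push @{ $SUBMATCH{decl} }, $^N })
            )*
        )
        # Always reached; $^N is the outer group, possibly the empty string
        (?{ $RESULT //= $^N })
        \z
    }x;
    return unless $ok;

    return { result => $RESULT, decl => $SUBMATCH{decl} };
}

for my $text ( 'a;b;', '' ) {
    my $got = match_decls($text);
    printf "text='%s' result=%s decl=%s\n", $text,
        defined $got->{result} ? "'$got->{result}'" : 'undef',
        defined $got->{decl}   ? '[' . join(', ', @{ $got->{decl} }) . ']' : 'undef';
}
```

For empty input the model leaves the decl entry undefined while the overall result is a defined empty string, mirroring the asymmetry discussed in this reply.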