Skip Menu |

This queue is for tickets about the Regexp-Grammars CPAN distribution.

Report information
The Basics
Id: 72566
Status: open
Priority: 0/
Queue: Regexp-Grammars

People
Owner: Nobody in particular
Requestors: DAMI [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 1.014
Fixed in: (no value)



Subject: objrule forcefully injects context string within object internals
<objrule: Some::Class> calls the ->new() method on that class, as documented ... but after that, it forcefully injects the context string under hash key "" into that blessed object! This is bad because a) the object's internals contain data that the object author never intended to find there b) the object has to be a hashref; if you want to bless something else (an arrayref or a scalarref), then you get an error. Below is a small program to illustrate that case. Curiously, playing with the <nocontext:> directive, it seems to be possible to remove some context strings, but not all of them. =========================================== use strict; use warnings; use Regexp::Grammars; use YAML qw/Dump/; my $grammar = qr[ <nocontext:> # only works for top-level <Foo> <objrule: Tst::Foo> <nocontext:> # only worlds for this level (top) FOO \( <a=Bar> , <b=Bar> AND <c=Bar> \) <objrule: Tst::Bar> <nocontext:> # doesn't work at all BAR <num=(\d+)> ]xms; my $string = "FOO (BAR 1, BAR 2 AND BAR 3)"; $string =~ $grammar and print Dump $/{Foo}; package Tst::Foo; sub nonew { my ($class, $parsed) = @_; my $self = {}; $self->{"attr_$_"} = $parsed->{$_} for qw/a b c/; bless $self, $class; } package Tst::Bar; sub nonew { my ($class, $parsed) = @_; # can't have arrayrefs as objects: line below creates an error # my $self = [$parsed->{num}]; # so let's just use a regular object my $self = {val => $parsed->{num}}; bless $self, $class; }
Subject: Re: [rt.cpan.org #72566] objrule forcefully injects context string within object internals
Date: Wed, 23 Nov 2011 09:55:45 +1100
To: bug-Regexp-Grammars [...] rt.cpan.org
From: Damian Conway <damian [...] conway.org>
Hi Laurent, Thanks for the bug report. The behaviour of <nocontext:> is definitely a bug and I will be working to fix it ASAP. The issue of <objrules> is more complex. I agree that it is awkward that the module does not accept non-hash-based classes and it is quite rude of it to inject context info into hash-based objects. However, this behaviour is deeply embedded in the design (and for good reasons) and will be much harder to change, mainly because the module has to do a huge amount of object cloning in order to ensure that backtracking works correctly. I'm not even sure what the correct fix would be. At the moment I plan to explicitly restrict objects to hash-based (and raise a fatal error if objects of other types are used). As for the context injection, I can't decide if R::G should simply not inject the context string to any object, or whether it should attempt to call a method such as: $obj->set_context($CONTEXT) and then directly inject only if that call fails. The problem with the first approach (no injection at all) is that objects don't then get a context string. The problem with the second approach (try a method call first) is that it forces classes to provide a particular method name, which may be inconvenient. Do you you have any opinion? Damian
Subject: Re: [rt.cpan.org #72566] objrule forcefully injects context string within object internals
Date: Wed, 23 Nov 2011 04:58:22 +0100
To: bug-Regexp-Grammars [...] rt.cpan.org
From: laurent dami <laurent.dami [...] free.fr>
Le 22.11.2011 23:56, damian@conway.org via RT a écrit : Show quoted text
> [...] > Do you you have any opinion? >
Ideally, the $class->new() method should receive the context string within its \%result_hash argument, and then that method should decide for itself what it wants to keep from that %result_hash. The problem right now is that the context string is absent from %result_hash, so the constructor has no chance to make use of it. Then R::G injects the context string, but the constructor is no longer in control ! If, for implementation reasons, you cannot pass the context string at the time the constructor gets called, then my preference would be to drop it altogether. Maybe a fallback solution would be to make that context string available through a global variable, or through a R::G class method, so that the client code has a chance to get at it when necessary.
Subject: Re: [rt.cpan.org #72566] objrule forcefully injects context string within object internals
Date: Thu, 24 Nov 2011 15:21:12 +1100
To: bug-Regexp-Grammars [...] rt.cpan.org
From: Damian Conway <damian [...] conway.org>
Show quoted text
> Ideally, the $class->new() method should receive the context string > within its \%result_hash argument, and then that method should decide > for itself  what it wants to keep from that %result_hash.
Yes, I was afraid you would suggest that. I agree that it is the correct solution, but is also the one that requires the most work in refactoring R::G. ;-) However, in thinking about that, I realize that this work is needed anyway to fix the broken <nocontext:> behaviour, so that is what I shall do. Once again, many thanks for your feedback and suggestions, Laurent. Damian