Skip Menu |

This queue is for tickets about the Parser-MGC CPAN distribution.

Report information
The Basics
Id: 66722
Status: open
Priority: 0/
Queue: Parser-MGC

People
Owner: Nobody in particular
Requestors: PSCUST [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Wishlist
Broken in: 0.06
Fixed in: (no value)



Subject: Suggestion: token_map
It is common to have a set of keywords that map to some internal code that is used in the parsed code, e.g. my %REG16SP = ( bc => 0, de => 1, hl => 2, sp => 3 ); my $reg = $parser->token_map(\%REG16SP); This sequence checks for the keywords "bc", "de", "hl" or "sp", and returns the corresponding code 0, 1, 2 or 3 if found. The corresponding function could be something like: sub token_map { my($self, $map) = @_; my $error = "Expected any of ". join(", ", sort keys %$map); $self->fail($error) if $self->at_eos; my $pos = pos($self->{str}); my $ident = lc($self->token_ident); if (! exists $map->{$ident} ) { pos($self->{str}) = $pos; $self->fail($error); } return $map->{$ident}; } You may want to include this new method in MGC. If not, I can always keep it in the parser sub-class. Best regards, Paulo Custodio
On Fri Mar 18 16:58:07 2011, PSCUST wrote: Show quoted text
> It is common to have a set of keywords that map to some internal code > that is used in the parsed code, e.g. > > my %REG16SP = ( bc => 0, de => 1, hl => 2, sp => 3 ); > my $reg = $parser->token_map(\%REG16SP); > > This sequence checks for the keywords "bc", "de", "hl" or "sp", and > returns the corresponding code 0, 1, 2 or 3 if found.
A far shorter, neater implementation may be sub token_map { my ($self, $map) = @_; return $map->{ $self->token_kw( keys %$map ) }; } The ->token_kw method already does most of this logic, except for the final mapping at the end. Perhaps you just want to use that instead? That said, by the time you've done that, you could instead have some methods in your own class, for just returning a register number immediately: my %REG16SP = ( bc => 0, de => 1, hl => 2, sp => 3 ); sub token_reg16sp { my $self = shift; return $REG16SP{ $self->token_kw( keys %REG16SP ) }; } -- Paul Evans
Subject: Re: [rt.cpan.org #66722] Suggestion: token_map
Date: Sat, 19 Mar 2011 13:21:16 +0000
To: bug-Parser-MGC [...] rt.cpan.org
From: Paulo Custodio <pauloscustodio [...] gmail.com>
Thanks for your answer, and for the MFC module! I went for the more complex implementation just because it seamed silly to grep on a list of keys of a hash (inside token_kw) when calling exists on the hash key is much quicker. I will try your solution and profile to see if the impact justifies the additional code. In the application I was toying with I had to override token_int, because I need to accept some more integer formats like 0FFh and $FF. And again I needed to do the steps: - skip_ws() - grap pos() - check at_eos() - do my stuff - if error go back to saved pos() and fail() It seams convenient to factor out all this code in a maybe-like method, to be called by application specific token_xxx methods, e.g. doit($code). I know the name is not good, just poped out of the head. Any comments? Also, any reason to call skip_ws() inside at_eos() after grabbing pos() instead of before? The effect is that the same white space is skipped twice. Thanks again. Best regards, Paulo Custodio On Sat, Mar 19, 2011 at 12:21 PM, Paul Evans via RT <bug-Parser-MGC@rt.cpan.org> wrote: Show quoted text
> <URL: https://rt.cpan.org/Ticket/Display.html?id=66722 > > > On Fri Mar 18 16:58:07 2011, PSCUST wrote:
>> It is common to have a set of keywords that map to some internal code >> that is used in the parsed code, e.g. >> >> my %REG16SP = ( bc => 0, de => 1, hl => 2, sp => 3 ); >> my $reg = $parser->token_map(\%REG16SP); >> >> This sequence checks for the keywords "bc", "de", "hl" or "sp", and >> returns the corresponding code 0, 1, 2 or 3 if found.
> > A far shorter, neater implementation may be > > sub token_map > { >   my ($self, $map) = @_; >   return $map->{ $self->token_kw( keys %$map ) }; > } > > The ->token_kw method already does most of this logic, except for the > final mapping at the end. Perhaps you just want to use that instead? > > That said, by the time you've done that, you could instead have some > methods in your own class, for just returning a register number immediately: > > my %REG16SP = ( bc => 0, de => 1, hl => 2, sp => 3 ); > sub token_reg16sp > { >   my $self = shift; >   return $REG16SP{ $self->token_kw( keys %REG16SP ) }; > } > > -- > > Paul Evans >
On Sat Mar 19 09:21:24 2011, pauloscustodio@gmail.com wrote: Show quoted text
> Thanks for your answer, and for the MFC module! > > I went for the more complex implementation just because it seamed > silly to grep on a list of keys of a hash (inside token_kw) when > calling exists on the hash key is much quicker. I will try your > solution and profile to see if the impact justifies the additional > code.
Ah yes, that is a point... An exists test would be nicer than a grep. Show quoted text
> In the application I was toying with I had to override token_int, > because I need to accept some more integer formats like 0FFh and $FF. > And again I needed to do the steps: > - skip_ws() > - grap pos() > - check at_eos() > - do my stuff > - if error go back to saved pos() and fail() > > It seams convenient to factor out all this code in a maybe-like > method, to be called by application specific token_xxx methods, e.g. > doit($code). I know the name is not good, just poped out of the head. > Any comments?
It's been a TODO comment of mine for a while now: # Easy ability for subclasses to define more token types I've observed that most of the token_foo methods are basically achievable by: return $convert->( $self->expect( $pattern ) ); My idea was to create a "token_generic" method, which takes such a matching pattern and conversion function. You could then implement your hex parser by: sub token_hex { my $self = shift; return $self->token_generic( pattern => qr/([0-9A-F]+)h/i, convert => sub { hex $_[1] }, ); } In fact, this could then be further neatened by a class method that creates the appropriate method sub and installs it in the package, wrapping it thus: __PACKAGE__->has_token( name => "hex", pattern => qr/([0-9A-F]+)h/i, convert => sub { hex $_[1] }, ); and it would create the method for you. Show quoted text
> Also, any reason to call skip_ws() inside at_eos() after grabbing > pos() instead of before? The effect is that the same white space is > skipped twice.
Ah yes. The reason that at_eos() pulls the pos() so early on, is that then it ensures that merely calling at_eos() doesn't skip the whitespace. This is essential for substring_before() to work correctly, or else it might miss whitespace contained within the substring. It wasn't intentional for the effect of skip_ws to be done twice though; I'll have a shuffle around of the code to see if that can be fixed. -- Paul Evans