
This queue is for tickets about the Regexp-Optimizer CPAN distribution.

Report information
The Basics
Id: 17831
Status: resolved
Priority: 0/
Queue: Regexp-Optimizer

People
Owner: Nobody in particular
Requestors: vincent [...] gandi.net
Cc:
AdminCc:

Bug Information
Severity: Critical
Broken in: 0.15
Fixed in: (no value)



Subject: optimized regexp problem
Hi,

I'm using Regexp-Optimizer 0.15 with perl 5.8.7 under FreeBSD 6.0-RELEASE. I didn't dig into the code to find the actual problem, but it seems easy to reproduce:

Case 1: $o->optimize(qr/(?:1|2|3|4|5|6|7|8|9|10|11)/) returns '(?-xism:(?:1[01]?|[23456789]))', which is correct: it matches every one of the alternatives.

Case 2: $o->optimize(qr/(?:1|2|3|4|5|6|7|8|9|10)/) returns '(?-xism:(?:1(?:)?|[23456789]))', which is wrong: it won't match 10. The regexp should be '(?-xism:(?:1(?:0)?|[23456789]))'.

Again, I didn't really look at the code, but I could only reproduce the bug when the last token of the alternation ends with '0'.

Vincent
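The two reported outputs can be checked without installing the module. The following is a minimal sketch, not part of the original report: it copies the patterns quoted above, adds ^ and $ anchors (an assumption, so that a partial match on the leading "1" of "10" does not count), and lists the inputs each pattern fails to match. The helper name unmatched is hypothetical.

```perl
use strict;
use warnings;

# Patterns copied from the report. The ^...$ anchors are an addition so
# that only whole-string matches count.
my $case1    = qr/^(?:1[01]?|[23456789])$/;    # optimizer output, case 1
my $case2    = qr/^(?:1(?:)?|[23456789])$/;    # optimizer output, case 2 (buggy)
my $expected = qr/^(?:1(?:0)?|[23456789])$/;   # what case 2 should have been

# Hypothetical helper: return the inputs that the pattern does not match.
sub unmatched {
    my ($re, @inputs) = @_;
    return grep { $_ !~ $re } @inputs;
}

my @case1_missed    = unmatched($case1,    1 .. 11);
my @case2_missed    = unmatched($case2,    1 .. 10);
my @expected_missed = unmatched($expected, 1 .. 10);

print "case 1 misses:   [@case1_missed]\n";
print "case 2 misses:   [@case2_missed]\n";
print "expected misses: [@expected_missed]\n";
```

With the anchors in place, only the case 2 pattern should report a missed input, namely 10.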
From: Daphne.Pfister [...] genband.com
On Thu Feb 23 10:53:34 2006, guest wrote:
> Hi,
>
> I'm using Regexp-Optimizer 0.15 with perl 5.8.7 under FreeBSD 6.0-RELEASE.
> I didn't dig much into the code to find what's the actual problem,
> however it's seems easy to reproduce:
>
> case 1, $o->optimize(qr/(?:1|2|3|4|5|6|7|8|9|10|11)/) returns
> '(?-xism:(?:1[01]?|[23456789]))', which is correct, it matches any of
> the digit.
>
> case 2, $o->optimize(qr/(?:1|2|3|4|5|6|7|8|9|10)/); returns
> '(?-xism:(?:1(?:)?|[23456789]))', which is not valid, it won't match 10,
> the regexp should be '(?-xism:(?:1(?:0)?|[23456789]))'
>
> Again, i didn't really look at the code, but i could only reproduce the
> bug when the last token of the alternation end with '0'.
>
> Vincent
This looks like a bug in the handling of character classes. See line 400 of Regexp/List.pm: if the character class that gets created is the single character '0', it is dropped, because grep {$_} treats the string "0" as false. I've attached a patch with a quick fix for this.
Subject: RegexpList.patch
--- Regexp.orig/List.pm 2004-12-05 11:07:38.000000000 -0500
+++ Regexp/List.pm 2010-10-09 23:07:16.000000000 -0400
@@ -397,7 +397,7 @@
 if (@char){
     my $char = $self->_optim_cc(@char);
     splice @result, $charpos, 0, $char;
-    @result = grep {$_} @result;
+    @result = grep {$_ ne ""} @result;
     if (@result == 1){
         $result = "$result[0]$q" and last RESULT;
     }
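The difference between the two grep filters in the patch can be seen in isolation. This is a small sketch under assumptions: the contents of @result below are a hypothetical stand-in, not the actual data Regexp::List builds at that point. It relies only on the fact that Perl treats both the empty string and the string "0" as false.

```perl
use strict;
use warnings;

# Hypothetical stand-in for @result: one empty fragment plus the
# single-character class "0" that should survive the filter.
my @result = ('', '0');

# Original filter: keeps only true values, so "0" is dropped with "".
my @truthy = grep { $_ } @result;

# Patched filter: drops only empty strings, so "0" is kept.
my @nonempty = grep { $_ ne "" } @result;

print "grep {\$_} keeps:       [@truthy]\n";
print "grep {\$_ ne \"\"} keeps: [@nonempty]\n";
```

The original filter returns an empty list here, which would explain the empty group in '1(?:)?', while the patched filter keeps the "0".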
As of version 0.20, Regexp::Optimizer uses Regexp::Assemble instead of Regexp::List to optimize alternations. Dan