The problem you are encountering is that R::A uses a simple lexer to
chop up each pattern, with a (documented :) limitation: it fails to pull
apart patterns containing nested parentheses correctly. The patterns
you are feeding it do contain nested parens: the trailing zero-width
lookahead assertion (?=...) is nested within a (?i:...).
If I change the script a bit we have:
#! perl -w
use Regexp::Assemble;
my $re = Regexp::Assemble->new->debug(1);
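while (<DATA>) {
    chomp;
    /\S/ or next;                # skip blank lines
    tr(/\\)(/)s;                 # collapse runs of / and \ into a single /
    $_ = quotemeta;              # escape regexp metacharacters
    s((?<!\\/)$)((?=\\/));       # no trailing separator: append a lookahead for one
    s(\\/)([\\\\/])g;            # let each separator match either \ or /
    $_ = "^(?i:$_)";             # anchor and wrap in a case-insensitive group
    $re->add( $_ );
}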
print $re->as_string, "\n";
__DATA__
C:\W
F:\M
This produces:
_insert_path [^ (?i:C\:[\\/]W(?=[\\/]) )] into []
at path ()
added remaining [^ (?i:C\:[\\/]W(?=[\\/]) )]
_insert_path [^ (?i:F\:[\\/]M(?=[\\/]) )] into [^ (?i:C\:[\\/]W(?=[\\/]) )]
at path (off=<^> (?i:C\:[\\/]W(?=[\\/]) ))
at path (^ off=<(?i:C\:[\\/]W(?=[\\/])> ))
token (?i:F\:[\\/]M(?=[\\/]) not present
result=^(?:(?i:C\:[\\/]W(?=[\\/])|(?i:F\:[\\/]M(?=[\\/])))
--
The main thing to note is that the pattern was tokenised as
^
(?i:C\:[\\/]W(?=[\\/])
)
Both patterns will reduce and share the ^ and trailing ), leaving the
two inner fragments with unbalanced parens. Hence the error.
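To see how a lexer with no notion of nesting depth ends up with that
split, here is a toy tokeniser. It is a hypothetical illustration only,
not R::A's actual lexer: toy_tokens is a made-up name, and the rule it
uses (a group runs from an opening paren to the first closing paren) is
an assumption chosen to reproduce the tokens shown above.

```perl
use strict;
use warnings;

# Toy tokeniser (hypothetical, NOT Regexp::Assemble's real lexer): a group
# token runs from '(' to the first unescaped ')', with no depth tracking.
sub toy_tokens {
    my ($pattern) = @_;
    my @tokens;
    while ($pattern =~ m{\G(
          \( (?: \\. | [^)\\] )* \)     # group: up to the first close paren
        | \[ (?: \\. | [^\]\\] )* \]    # character class
        | \\.                           # escaped character
        | .                             # any other single character
    )}gcxs) {
        push @tokens, $1;
    }
    return @tokens;
}

# Single-quoted, so \\\\ in the source is \\ in the pattern.
print "$_\n" for toy_tokens('^(?i:C\:[\\\\/]W(?=[\\\\/]))');
```

Fed the nested pattern, this yields the same three tokens as the debug
output: the ^, a group that is missing its closing paren, and a stray ).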
The main problem comes with the wrapping of the patterns in an
all-encompassing (?i:...).
If you could munge the strings so as to arrive at, e.g.:
^(?i:C\:[\\/]W)(?=[\\/])
it would be tokenised correctly, since the parens are no longer nested:
^
(?i:C\:[\\/]W)
(?=[\\/])
Now the trailing (?=[\\/]) would be shared among all the source
patterns, which will make for a smaller regexp.
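One way to do that munging, as a minimal sketch: build the path part
and the lookahead separately, and close the (?i:...) group before the
lookahead is appended. The helper name path_to_pattern is made up for
this example; the substitutions are the ones from the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one path into an anchored pattern whose
# lookahead sits outside the (?i:...) group.
sub path_to_pattern {
    my ($path) = @_;
    (my $p = $path) =~ tr(/\\)(/)s;     # collapse runs of / and \ into one /
    $p = quotemeta $p;                  # escape metacharacters, e.g. C\:\/W
    my $open_ended = $p !~ m{\\/\z};    # true unless the path ends in a separator
    $p =~ s(\\/)([\\\\/])g;             # each separator may be \ or /
    my $pattern = "^(?i:$p)";           # group is closed here
    $pattern .= '(?=[\\\\/])' if $open_ended;
    return $pattern;
}

print path_to_pattern($_), "\n" for 'C:\W', 'F:\M';
```

For C:\W this prints ^(?i:C\:[\\/]W)(?=[\\/]), the unnested form shown
above. Each string can then be handed to add() as before.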
You should be able to use the flags('i') method to set the /i flag
globally for the whole pattern (although it is true that the flags method
was ignored for tracked patterns in all versions prior to 0.16).
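A sketch of what that could look like, assuming an untracked pattern and
a version of R::A that honours flags(). The sample paths and test
strings are made up, and the hard-coded lookahead relies on none of the
sample paths ending in a separator.

```perl
use strict;
use warnings;
use Regexp::Assemble;

# Sketch: no (?i:...) wrapper per pattern; /i is requested once via flags().
my $ra = Regexp::Assemble->new;
$ra->flags('i');

for my $path ('C:\W', 'F:\M') {
    (my $p = $path) =~ tr(/\\)(/)s;    # collapse runs of / and \ into one /
    $p = quotemeta $p;                 # escape regexp metacharacters
    $p =~ s(\\/)([\\\\/])g;            # each separator may be \ or /
    $ra->add("^${p}(?=[\\\\/])");      # sample paths never end in a separator
}

my $re = $ra->re;                      # compiled pattern with the flags applied
print "$re\n";
for my $candidate ('c:/w/file.txt', 'F:\M\x', 'C:/Z/file.txt') {
    printf "%-16s %s\n", $candidate, $candidate =~ $re ? 'match' : 'no match';
}
```

With no group wrapped around each pattern, the lexer only ever sees
single-level parens, so both the leading ^ and the trailing lookahead
are available for sharing.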
Getting rid of the (?i:...) wrapper has another benefit. With the wrapper
in place, if you have two paths C:/X and C:/Y (keeping path separators out
of the picture to simplify the issue), the resulting pattern will not be
C:/[XY], but rather (?i:C:/X)|(?i:C:/Y). When testing a string such as
C:/Z, the engine has to inspect both alternatives before concluding that
a match is impossible, instead of walking down the one pattern and
failing at the character class.