Subject: | inappropriate character recognized in variable names |
In Config::General 2.63, Config::General::Interpolated contains the
following as part of a regex for finding variable names:
([a-zA-Z0-9_\-\.:\+,]+) # $3: capturing variable name (fix of #33447)
Trouble is, unless I missed it, the set of valid characters in a
variable name doesn't seem to be defined anywhere in the Config::General
documentation. Most of the examples show only simple alphabetic names
being used.
I believe the fix for #33447 went too far. Before that, the regex
only contained:
([a-zA-Z_]\w*) # $3: capturing variable name
In particular, the regex now recognizes very unexpected punctuation as
valid characters in a variable name. #33447 only asked for recognition
of the "-" character. The actual changes went much further. Not only
does the new regex expand the set of valid characters, it also no longer
restricts the initial character in the variable name to being one of a
limited set. Meanwhile, the Changelog says only:
2.38
- fixed rt.cpan.org#33447, regex to catch variable
names were too strict, now - . + or : are allowed too.
My particular irritation is with including a comma as a valid character
in a variable name. It's not mentioned in the ChangeLog entry, I cannot
see that anyone would reasonably expect to use it as such, and it breaks
interpolation in simple uses like this:
log4perl.category.My.Application = $foo, bar
not to mention if I had used
log4perl.category.My.Application = $foo,bar
instead.
I now know that this can be worked around with the ${foo} syntax, but for
an appended comma, that should not be necessary. And it's complicated
to debug, to even find out what's going on. One gets an error message
similar to this:
Use of uninitialized variable ($foo,) while loading config entry
The clue is there, but since there is no common expectation that a comma
might be part of a variable name, it's far too easy to overlook that
clue or to misinterpret the message as just mirroring the surrounding
text in the config entry instead of representing the full variable name
it's looking for.
To summarize, I believe that the comma character should be removed from
the regex. That's my main concern. Perhaps also, the initial-character
issue should be examined as well. And in any case, the Config::General
doc needs a paragraph or two to describe the full naming convention.