Skip Menu |

This queue is for tickets about the version CPAN distribution.

Report information
The Basics
Id: 50841
Status: resolved
Priority: 0/
Queue: version

People
Owner: jpeacock [...] cpan.org
Requestors: dagolden [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: (no value)
Fixed in: (no value)



Subject: Problematic interpretation of multiple underscores
ActiveState Perl 5.11.1 has altered the $VERSION of some modules (notably ExtUtils::Install) to a form like this: $VERSION = '1.54_00_01'; Clearly, once run through eval(), this becomes 1.540001. However, version->new($VERSION) will die of an illegal alpha format due to multiple underscores. This prevents installation of Module::Build on ActiveState 5.11.1 as Module::Build::ModuleInfo calls Module::Build::Version, which dies. Arguably, while "v1.54_00_01" is clearly an illegal alpha, a more natural interpretation of "1.54_00_01" (without the leading-v) is that it is a decimal number, not an alpha v-string. -- David
Subject: Re: [rt.cpan.org #50841] Problematic interpretation of multiple underscores
Date: Wed, 28 Oct 2009 14:07:46 -0400
To: bug-version [...] rt.cpan.org
From: John Peacock <john.peacock [...] adatshalom.net>
Show quoted text
> ActiveState Perl 5.11.1 has altered the $VERSION of some modules
Do you mean 5.10.1? ActiveState is a) changing modules that they do not "own", and b) doing so without apparently testing anything. For example, this will probably fail with the same error: activeperl -e "use ExtUtils::Install 1.50" I'm not happy on both accounts, and wondering why we don't let ActiveState eat their own dogfood and fix it themselves. Show quoted text
> (notably ExtUtils::Install) to a form like this: > > $VERSION = '1.54_00_01'; > > Clearly, once run through eval(), this becomes 1.540001.
To be precise, bare numbers in Perl can have an arbitrary number of underscores (every other character in fact) and the tokenizer eats them. None of the Perl internals even "see" the underscores. This isn't a bare number however. The presence of a single underscore signifies an alpha/development release (which is what PAUSE has done for ages). The convention for alpha releases has been to have the first $VERSION in the file be quoted with an underscore (since only the first $VERSION assignment is tested via toolchain programs like M::B, EU::MM, and PAUSE's parser. By convention, the next line would be: $VERSION = eval($VERSION); which would have the effect of "washing the alpha away" so that what Perl will report when running that code would be the number sans underscores. Show quoted text
> However, version->new($VERSION) will die of an illegal alpha format due > to multiple underscores.
By design. In order to support alpha versions as different from non-alpha (or release) versions, I had to decide how to differentiate the two version types. The documented method was to allow a single underscore followed by digits only (i.e. no more decimals or underscores). This neatly allowed the same internal representation to be used for both - an array of integers - with the only difference between them whether the is_alpha flag was set. If the version is_alpha == 1, then the last array element is preceded by an underscore and every other array element by a period. Otherwise, the array elements are separated exclusively by decimals. I should always be possible to recreate the original version string from the array elements and flags (though in practice, the default stringification is the exact string that was used to create the version object[1]). Show quoted text
> Arguably, while "v1.54_00_01" is clearly an illegal alpha, a more > natural interpretation of "1.54_00_01" (without the leading-v) is that > it is a decimal number, not an alpha v-string.
To be pedantic, "1.54_00_01" is not a v-string (it only has a single decimal point). It is also not a decimal number _if it is quoted_. Apart from the usual difficulty that Perl 5.10.0 shipped with version.pm code that corresponds to the parsing I described above, I would have to completely revamp both the parser and object representation to support any other notation. As I see it, there are a couple of ways to proceed then: 1) tell ActiveState to stop messing with package $VERSION's for modules that aren't theirs (sadly not going to happen) or at the very least do so in a non-destructive way (which in this case would be to simply leave off the quotes. 2) augment the Module::Build::Version class to "try harder" iff it is running under ActivePerl and it catches that error. Actually, version.pm was designed to be easily subclassed, so it shouldn't be too hard to preparse the initializer string and clean it up before the parent class throws the error. I'm afraid I have to consider this a WONTFIX for version.pm, but I will do what I can to make it work for M::B anyways... John 1) the sole time that we won't have an original string is when a bare v-string is passed to qv(), ala $VERSION = qv(1.2.3); since there is no way to know /post facto/ whether there was a leading 'v' character or not. In this case, the stringification will always include a leading 'v'
On Wed Oct 28 14:08:08 2009, john.peacock@adatshalom.net wrote: Show quoted text
> Do you mean 5.10.1? ActiveState is a) changing modules that they do not > "own", and b) doing so without apparently testing anything.
Yes. 5.10.1. However, the issue isn't fixing AS but the broader question of how to interpret that kind of thing. Show quoted text
> The presence of a single underscore signifies an alpha/development > release (which is what PAUSE has done for ages). The convention for > alpha releases has been to have the first $VERSION in the file be quoted > with an underscore (since only the first $VERSION assignment is tested > via toolchain programs like M::B, EU::MM, and PAUSE's parser. By > convention, the next line would be: > > $VERSION = eval($VERSION);
And in this AS case, it is. Show quoted text
> > However, version->new($VERSION) will die of an illegal alpha format due > > to multiple underscores.
> > By design. In order to support alpha versions as different from > non-alpha (or release) versions, I had to decide how to differentiate > the two version types. The documented method was to allow a single > underscore followed by digits only (i.e. no more decimals or underscores).
And I'm saying that design decision was -- in hindsight -- wrong. Moreover, for reasons I've explained elsewhere, I think the whole idea of sorting alphas the way you ultimately chose to is wrong. Alpha is a concept that is orthogonal to $VERSION. That PAUSE and parts of the toolchain exploit the $VERSION = ... ; $VERSION = eval $VERSION trick should never have been allowed into core version semantics. If it had not, a whole raft of edge cases would have gone away. Show quoted text
> > Arguably, while "v1.54_00_01" is clearly an illegal alpha, a more > > natural interpretation of "1.54_00_01" (without the leading-v) is that > > it is a decimal number, not an alpha v-string.
> > To be pedantic, "1.54_00_01" is not a v-string (it only has a single > decimal point). It is also not a decimal number _if it is quoted_.
I'm not talking pedantry, I'm talking heuristics. There was a design decision taken to interpret that in a certain way. I'm suggesting that a more natural interpretation without a leading v and without multiple decimal points is as decimal. If you want to talk pedantry, I point out that PAUSE describes a developer release as matching /\d\.\d+_\d/, not the version component of a filename matching /\d\.\d+_\d+$/. So 'Foo-1.23_45_67.tar.gz' is a valid developer filename as far as PAUSE is concerned. Show quoted text
> Apart from the usual difficulty that Perl 5.10.0 shipped with version.pm > code that corresponds to the parsing I described above, I would have to > completely revamp both the parser and object representation to support > any other notation.
We don't have to change it for 5.10.X. We should change it for 5.11.X. I'm not agreed that the representation must change or just the parsing. But I'm not opposed. I think it would be a step forward to remove "alpha" as a concept from version objects entirely. Show quoted text
> 2) augment the Module::Build::Version class to "try harder" iff it is > running under ActivePerl and it catches that error. Actually, > version.pm was designed to be easily subclassed, so it shouldn't be too > hard to preparse the initializer string and clean it up before the > parent class throws the error.
I've already modified Module::Build::ModuleInfo to detect that form and just strip out the underscores before passing to Module::Build::Version. Show quoted text
> I'm afraid I have to consider this a WONTFIX for version.pm, but I will > do what I can to make it work for M::B anyways...
I disappointed to hear that you consider this 'WONTFIX'. You can leave M::B alone as I've already fixed this problem as described. -- David
Subject: Re: [rt.cpan.org #50841] Problematic interpretation of multiple underscores
Date: Wed, 28 Oct 2009 14:55:25 -0400
To: bug-version [...] rt.cpan.org
From: John Peacock <john.peacock [...] havurah-software.org>
On 10/28/2009 02:42 PM, David Golden via RT wrote: Show quoted text
> I disappointed to hear that you consider this 'WONTFIX'. You can leave > M::B alone as I've already fixed this problem as described.
It's not so much a WONTFIX but CANTFIX. I don't have any idea what you expect me to do. Am I to strip out the alpha parsing altogether and just skip underscores when parsing? Are you so sure that there isn't any code anywhere that is relying on the alpha parsing behavior? Hmmm, I just had a thought: would it be acceptable to still require that after an underscore, you could not have another decimal? I just thought of a way to extend the existing internal representation to support unlimited _'s and still keep the alpha code (more or less intact)... John
CC: module-build [...] perl.org
Subject: Re: [rt.cpan.org #50841] Problematic interpretation of multiple underscores
Date: Tue, 17 Nov 2009 22:34:47 -0500
To: bug-version [...] rt.cpan.org
From: John Peacock <john.peacock [...] havurah-software.org>
On 10/28/2009 02:55 PM, John Peacock via RT wrote: Show quoted text
> Hmmm, I just had a thought: would it be acceptable to still require that > after an underscore, you could not have another decimal? I just thought > of a way to extend the existing internal representation to support > unlimited _'s and still keep the alpha code (more or less intact)...
I have the pure-Perl implementation done for this support. You can now have unlimited underscores and it will DTRT. I need to do the XS implementation, as well as make it more resilient. This should really go into the built-in Module::Build::Version class too... John
It turns out that ActiveState didn't mean to include multiple underscores, according to Jan Dubois (who should know). I'll still release a package to support multiple underscores, but not for 5.10.2...
As it turns out, WONTFIX. The enhancements that went into version.pm starting in 0.80, i.e. the regular expressions for strict and lax versions, eliminate the possibility of multiple underscores (and the confusion they could cause).