
This queue is for tickets about the Module-ScanDeps CPAN distribution.

Report information
The Basics
Id: 106142
Status: resolved
Priority: 0/
Queue: Module-ScanDeps

People
Owner: Nobody in particular
Requestors: SLAFFAN [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: [Patch] Preload dependencies for PDL and PDL::NiceSlice
Attached is a patch to add preloads for PDL and PDL::NiceSlice. Without the preloads, the code below fails after being packed using pp (unless the -x option is specified).

Regards,
Shawn.

    use PDL;
    use PDL::NiceSlice;
    my $x = pdl [[2,3,4],[1,2,3]];
    print $x(1,);
    print $x;
Subject: ScanDeps.pm.patch
Index: lib/Module/ScanDeps.pm
===================================================================
--- lib/Module/ScanDeps.pm (revision 1595)
+++ lib/Module/ScanDeps.pm (working copy)
@@ -430,6 +430,12 @@
         _glob_in_inc('PDF/API2/Basic/TTF', 1);
     },
     'PDF/Writer.pm' => 'sub',
+    'PDL.pm' => sub { @{ _get_preload('utf8.pm') } },
+    'PDL/NiceSlice.pm' => [
+        'PDL/NiceSlice/FilterUtilCall.pm',
+        'PDL/NiceSlice/FilterSimple.pm',
+        'PDL/NiceSlice/ModuleCompile.pm',
+    ],
     'Perl/Critic.pm' => 'sub', #not only Perl/Critic/Policy
     'PerlIO.pm' => [ 'PerlIO/scalar.pm' ],
     'Pod/Usage.pm' => sub { # from Pod::Usage (as of 1.61)
On 2015-07-29 07:50:35, SLAFFAN wrote:
> use PDL;
> use PDL::NiceSlice;
> my $x = pdl [[2,3,4],[1,2,3]];
> print $x(1,);
> print $x;
I agree on the %Preload rule for PDL/NiceSlice.pm; it can be made more robust as

    'PDL/NiceSlice.pm' => 'sub',

but the rule for PDL.pm needs further investigation. Somehow utf8_heavy.pl is needed... Maybe this was caused by rev 1501 in https://www.openfoundry.org/svn/par when I removed the scan rule that says

    Foo::Bar::quux(...)

or

    Foo::Bar->quux(...)

implies we should add a dependency on Foo::Bar.

Cheers, Roderich
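The three rule shapes that appear in the patch and in the reply (an explicit list, the string 'sub', and a code reference) can be illustrated with a toy resolver. This is a minimal sketch, not the Module::ScanDeps implementation: `resolve_preload`, `files_under`, the `Hypothetical/...` entries, and the temporary directory standing in for an @INC entry are all assumptions made for the example. It reads 'sub' as "every .pm file under the directory named after the module", which is why that form keeps working when a new PDL/NiceSlice/*.pm file appears.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);
use File::Find ();

# Hypothetical helper: list every .pm file below $subdir in the given roots,
# as paths relative to the root (the form used for %INC keys).
sub files_under {
    my ($subdir, @roots) = @_;
    my %seen;
    for my $root (@roots) {
        my $dir = "$root/$subdir";
        next unless -d $dir;
        File::Find::find({
            no_chdir => 1,
            wanted   => sub {
                return unless -f $_ && /\.pm$/;
                (my $rel = $File::Find::name) =~ s{^\Q$root\E/}{};
                $seen{$rel} = 1;
            },
        }, $dir);
    }
    return sort keys %seen;
}

# Hypothetical resolver for a %Preload-style table.
sub resolve_preload {
    my ($table, $pm, @roots) = @_;
    my $rule = $table->{$pm};
    return () unless defined $rule;
    return $rule->($pm) if ref $rule eq 'CODE';    # computed list
    return @$rule       if ref $rule eq 'ARRAY';   # explicit list
    if ($rule eq 'sub') {                          # whole subdirectory
        (my $subdir = $pm) =~ s/\.pm$//;
        return files_under($subdir, @roots);
    }
    die "unknown preload rule for $pm\n";
}

# A temporary directory stands in for an @INC entry holding PDL::NiceSlice.
my $root = tempdir(CLEANUP => 1);
make_path("$root/PDL/NiceSlice");
for my $name (qw(FilterSimple FilterUtilCall)) {
    open my $fh, '>', "$root/PDL/NiceSlice/$name.pm"
        or die "cannot create $name.pm: $!";
    close $fh;
}

my %preload = (
    'PerlIO.pm'              => [ 'PerlIO/scalar.pm' ],
    'PDL/NiceSlice.pm'       => 'sub',
    'Hypothetical/Module.pm' => sub { ('Hypothetical/Module/Extra.pm') },
);

my @listed   = resolve_preload(\%preload, 'PerlIO.pm',              $root);
my @globbed  = resolve_preload(\%preload, 'PDL/NiceSlice.pm',       $root);
my @computed = resolve_preload(\%preload, 'Hypothetical/Module.pm', $root);
print "$_\n" for @listed, @globbed, @computed;
```

With the explicit-list form, a filter module added in a later PDL release would have to be added to the table by hand; the 'sub' form picks it up from disk at scan time.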
Thanks. The PDL::NiceSlice approach is much simpler.

If I can get a chance I'll look into the PDL code to see what sort of calls are used.

Regards,
Shawn.

On Wed Jul 29 11:23:33 2015, RSCHUPP wrote:
> I agree on the %Preload rule for PDL/NiceSlice.pm, it can be made more
> robust as
>
> 'PDL/NiceSlice.pm' => 'sub',
>
> but the rule for PDL.pm needs further investigation.
On Wed Jul 29 17:50:53 2015, SLAFFAN wrote:
> If I can get a chance I'll look into the PDL code to see what sort of
> calls are used.
It looks like the change in rev 1501 is not the cause. I modified the regexp to use a negative lookbehind to avoid false positives on $foo->bar->baz, and added it into scan_chunk, but utf8_heavy.pl was not packed. I did not check for Foo::Bar::baz(), though.

    return $1 if (/(?<!\W) \b (\w+(?:::\w+)*) \s* (?:->)/x
        and $1 ne 'Tk' and $1 ne 'shift' and $1 ne '__PACKAGE__');

A bit of further searching through the PDL code indicated File::Map was a pinch-point, so I added some more Preload rules as a check (see below). These seem to work, but need further investigation since they are not generalised.

File::Map is loaded in PDL::Core, PDL::IO::FlexRaw and PDL::IO::FastRaw using a string eval. From PDL::Core:

    eval 'use File::Map 0.47 qw(:all)';

As a quick experiment, I added the following preload rules to Module::ScanDeps. The explicit listing of unicore/Heavy.pl is needed because it is not found through the utf8.pm preload sub. I am sure there is a better way, but maybe this helps locate the source of the problem?

    'File/Map.pm' => ['utf8.pm', 'unicore/Heavy.pl'],
    'PDL' => ['PDL/Core.pm'],
    'PDL/Core.pm' => ['File/Map.pm'],

When I then run pp on the script below, explicitly loading File::Map, the packed executable works.

    use PDL;
    use File::Map;
    my $x = pdl [[2,3,4],[1, 2, 3]];
    print $x;

However, if I comment out the 'use File::Map' line it does not pack utf8_heavy.pl, so the preload rules above are clearly wrong.

Instrumenting _glob_in_inc to print to stdout whenever unico[rd]e is passed as the subdir argument has no effect, so I assume the utf8.pm preload sub is not being run for the above preload rules.

Hopefully the above is helpful.

Regards,
Shawn.
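For readers who want to experiment with a "class name followed by ->" scan rule of the kind discussed here, the following is a minimal sketch. It is not the rule removed in rev 1501 and not the expression quoted above: the function name `class_before_arrow` is hypothetical, and the lookbehind is a different one, a character class of sigils, `>`, `:` and word characters, chosen on the assumption that this is what rejecting `$foo->bar->baz` requires.

```perl
use strict;
use warnings;

# Hypothetical scan rule: return the bareword class name that precedes "->"
# in a chunk of source, or nothing if there is none. The lookbehind rejects
# names that follow a sigil (variables), ">" (chained method calls), ":"
# (the tail of a longer package name) or a word character.
sub class_before_arrow {
    my ($chunk) = @_;
    return $1
        if $chunk =~ /(?<![\$\@%&>:\w]) ([A-Za-z_]\w*(?:::\w+)*) \s* ->/x
        and $1 ne 'Tk' and $1 ne 'shift' and $1 ne '__PACKAGE__';
    return;
}

my $class_call   = class_before_arrow('my $obj = Foo::Bar->new;');
my $chained_call = class_before_arrow('$foo->bar->baz');
my $plain_call   = class_before_arrow('Foo::Bar::quux(1)');
my $package_call = class_before_arrow('__PACKAGE__->run');

print defined $_ ? "$_\n" : "(no match)\n"
    for $class_call, $chained_call, $plain_call, $package_call;
```

Only the first call yields a class name. A rule like this still cannot see modules pulled in by a string eval, such as the File::Map line quoted from PDL::Core, so it does not address the utf8_heavy.pl problem on its own.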
On 2015-08-02 00:48:42, SLAFFAN wrote:
> Instrumenting _glob_in_inc to print to stdout whenever unico[rd]e is
> passed as the subdir argument has no effect, so I assume the utf8.pm
> preload sub is not being run for the above preload rules.
Thanks for investigating. I tried to figure out at what point utf8_heavy.pl comes into play. For that I prepended this to your sample script:

    BEGIN
    {
        # insert spy CODE into require's module lookup
        unshift @INC, sub
        {
            my ($self, $pm) = @_;
            print STDERR "# require $pm\n";
            my ($package, $filename, $line) = caller;
            print STDERR "# from $package ($filename:$line)\n";
            return; # i.e. take a pass
        };
    }

This intercepts any (explicit or implicit) "require", prints out what is required and from where, and then resumes "normal" processing. Here's the output:

    # require PDL.pm
    # from main (/home/roderich/todo/PAR/Module-ScanDeps/shawn.pl:15)
    # require PDL/Core.pm
    # from main ((eval 1):6)
    # require PDL/Types.pm
    # from PDL::Core (/usr/lib/x86_64-linux-gnu/perl5/5.22/PDL/Core.pm:223)
    # require Carp.pm
    # from PDL::Types (/usr/lib/x86_64-linux-gnu/perl5/5.22/PDL/Types.pm:6)
    # require strict.pm
    # from Carp (/usr/share/perl/5.22/Carp.pm:4)
    # require warnings.pm
    # from Carp (/usr/share/perl/5.22/Carp.pm:5)
    # require Exporter.pm
    # from Carp (/usr/share/perl/5.22/Carp.pm:99)
    # require overload.pm
    # from PDL::Type (/usr/lib/x86_64-linux-gnu/perl5/5.22/PDL/Types.pm:428)
    # require overloading.pm
    # from overload (/usr/share/perl/5.22/overload.pm:83)
    # require warnings/register.pm
    # from overload (/usr/share/perl/5.22/overload.pm:144)
    # require Exporter/Heavy.pm
    # from Exporter (/usr/share/perl/5.22/Exporter.pm:16)
    # require PDL/Exporter.pm
    # from PDL::Core (Basic/Core/Core.pm.PL (i.e. PDL::Core.pm):314)
    # require DynaLoader.pm
    # from PDL::Core (Basic/Core/Core.pm.PL (i.e. PDL::Core.pm):315)
    # require Config.pm
    # from DynaLoader (/usr/lib/x86_64-linux-gnu/perl/5.22/DynaLoader.pm:21)
    # require vars.pm
    # from Config (/usr/lib/x86_64-linux-gnu/perl/5.22/Config.pm:11)
    # require Scalar/Util.pm
    # from PDL::Core (Basic/Core/Core.pm.PL (i.e. PDL::Core.pm):1000)
    # require List/Util.pm
    # from Scalar::Util (/usr/lib/x86_64-linux-gnu/perl/5.22/Scalar/Util.pm:11)
    # require XSLoader.pm
    # from List::Util (/usr/lib/x86_64-linux-gnu/perl/5.22/List/Util.pm:21)
    # require utf8.pm
    # from PDL::Core (Basic/Core/Core.pm.PL (i.e. PDL::Core.pm):1028)
    # require utf8_heavy.pl
    # from utf8 (/usr/share/perl/5.22/utf8.pm:16)
    # require re.pm
    # from utf8 (/usr/share/perl/5.22/utf8_heavy.pl:4)
    # require unicore/Heavy.pl
    # from utf8 (/usr/share/perl/5.22/utf8_heavy.pl:185)
    # require unicore/lib/Alpha/Y.pl
    # require PDL/Options.pm
    # from PDL::Core (Basic/Core/Core.pm.PL (i.e. PDL::Core.pm):3288)
    # require Fcntl.pm
    # from PDL::Core (Basic/Core/Core.pm.PL (i.e. PDL::Core.pm):4167)
    ...

utf8.pm and utf8_heavy.pl are actually loaded from PDL::Core.pm. The funny "Basic/Core/Core.pm.PL (i.e. PDL::Core.pm)" is caused by the fact that PDL/Core.pm is a generated file with some

    # line 123 "Basic/Core/Core.pm.PL (i.e. PDL::Core.pm)"

lines in it. And the offending line is

    if $value =~ /e\p{IsAlpha}/ or $value =~ /\p{IsAlpha}e/;

There's no explicit mention of utf8.pm here - the code uses a Unicode property in a regular expression. utf8.pm (at least in Perl 5.22) doesn't do anything except set up an AUTOLOAD sub that will require utf8_heavy.pl when run. (If you check $utf8::AUTOLOAD when our @INC spy is called, its value is "utf8::SWASHNEW".)

So the whole utf8_heavy.pl + unico[dr]e shebang is triggered on demand whenever some Unicode feature of Perl is requested, e.g. a Unicode property in a regex, and probably lots of others.

I don't think it's feasible to try to detect this by static analysis. Should we just add this stuff (at least 4 MB spread over more than 400 files) to _every_ packed executable?

Cheers, Roderich
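The @INC spy above can also be written so that it runs under `use strict` and collects its findings in an array instead of printing them, which makes it easier to filter the list afterwards. This is a minimal sketch: the `@seen` array and the choice of Text::ParseWords as the module to load are assumptions for the example. Whether a Unicode property in a regex shows up as a require at all depends on the Perl version; the trace above is from Perl 5.22.

```perl
use strict;
use warnings;

my @seen;   # one "file <- package (file:line)" record per intercepted require

BEGIN {
    # A code reference in @INC is called for each require that %INC does not
    # already satisfy. Returning an empty list passes the lookup on to the
    # remaining @INC entries, so normal loading is unaffected.
    unshift @INC, sub {
        my ($self, $file) = @_;
        my ($package, $filename, $line) = caller;
        push @seen, "$file <- $package ($filename:$line)";
        return;
    };
}

# Any module that is not loaded yet will do; Text::ParseWords is a core module.
require Text::ParseWords;

print "$_\n" for @seen;
```

Modules that were loaded before the hook was installed (here strict and warnings) do not appear in the list, which is why the original version places its BEGIN block at the very top of the script.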
On Sun Aug 02 18:02:33 2015, RSCHUPP wrote:
> I don't think it's feasible to try to detect this by static
> analysis.
> Should we just add this stuff (at least 4 MB spread over more than
> 400 files)
> to _every_ packed executable?
Thanks Roderich,

The size issue rears its head once more... It would also be a Herculean task to get static scanning to detect all such cases (although maybe PPI could be leveraged if someone ever has the tuits - https://metacpan.org/pod/PPI::Token::Regexp ).

Perhaps another flag could be added to pp for the cases where the code does not explicitly call for unicode, but it is needed for a packed executable to work. pp --unicode?

I also now think that this is the root cause of an issue I've been working around for a while using the code below. I use the pp -x flag when building, and set an environment variable in my script before calling pp.

    if ($ENV{BDV_PP_BUILDING}) {
        use 5.016;
        use feature 'unicode_strings';
        my $string = "sp_self_only() and \N{WHITE SMILING FACE}";
        $string =~ /\bsp_self_only\b/;
    }

Given that, it should be possible to statically scan for the various permutations of /use feature 'unicode_/ to detect unicode_strings and unicode_eval. If someone is using those features in their code then they need the extra libraries.

https://metacpan.org/pod/feature#The-unicode_strings-feature

Such scanning would not detect multiline chunks, as per the documentation caveats. A "pp --unicode" style flag would still be needed in such cases.

https://metacpan.org/pod/Module::ScanDeps#CAVEATS

WRT the pp flag, maybe a more general approach would be something that parallels the feature pragma, e.g.

    pp --feature=unicode_strings,unicode_eval
    pp --feature=":5.12"

Regards,
Shawn.
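The static scan for `use feature 'unicode_...'` suggested above could look roughly like the following. This is a minimal sketch under stated assumptions: `unicode_features_in` is a hypothetical helper, not part of Module::ScanDeps, and it works line by line, so it shares the multiline limitation mentioned in the message. It also ignores feature bundles: `use v5.12` or later enables unicode_strings implicitly, and the sketch does not try to detect that.

```perl
use strict;
use warnings;

# Hypothetical helper: return the sorted list of unicode_* features that a
# chunk of source enables through single-line "use feature" statements.
sub unicode_features_in {
    my ($source) = @_;
    my %found;
    for my $line (split /\n/, $source) {
        next if $line =~ /^\s*#/;                        # whole-line comment
        next unless $line =~ /^\s*use\s+feature\b(.*)/;  # keep the import list
        my $args = $1;
        $found{$1} = 1 while $args =~ /\b(unicode_(?:strings|eval))\b/g;
    }
    return sort keys %found;
}

my @features = unicode_features_in(<<'END');
use 5.016;
use feature 'unicode_strings';
# use feature 'unicode_eval';
use feature qw(say unicode_eval);
my $s = "no feature here";
END

print join(',', @features), "\n";
```

A scanner built along these lines could map any non-empty result to the same set of files that a "--unicode" style flag would add, leaving the flag for the cases the scan cannot see.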
Better late than never...

The %Preload rule for PDL::NiceSlice was added in Module::ScanDeps 1.20. The --unicode option for pp was added in PAR::Packer 1.29.

Cheers, Roderich