Bug #18472 for PAR-Packer: "parl foo.par script.pl ..." uses a cache area that doesn't depend on foo.par

Fri Mar 31 07:34:29 2006 RSCHUPP [...] cpan.org - Ticket created

Subject:

"parl foo.par script.pl ..." uses a cache area that doesn't depend on foo.par

Originally posted to the PAR mailing list (http://www.nntp.perl.org/group/perl.par/2293): From: Linn White I have a .par file stored on a network drive. All the computers running the par file are using the parl.exe executable. The files are cached on their pc and everything is working as it should. However, I have numerous additional files packed into the par. When I make changes to these files, recompile the par and place it on the network drive, other user cannot see the updated files, because the old files are being pulled from their local cache. Changes made to my scripts though are reflected. How do I make sure the additional files replace the existing chache files? Further post show that the application reads its data as actual files from the caching area (i.e. below $ENV{PAR_TEMP}) (as opposed to using Archive::Zip methods to read directly from the par archive). Now you would expect that the name of the cache area would be based on some checksum of the par archive and hence would get stale when the par is updated. But that's not in the case. In fact, every par "executed" via parl gets the same cache area, because its name is based on the checksum of the parl executable itself. It's not clear that this is a bug, since the caching behaviour of "parl foo.par ..." isn't specified anywhere. If you view "parl foo.par ..." as an abbreviation for "perl -MPAR=foo.par ..." you wouldn't expect that any cache whatsoever would be involved. In this light the use of parl as described above could be considered abuse. On the hand, you could view "parl foo.par ..." as being functionally equivalent to "pp -o foo.exe foo.par", followed by "foo.exe ...". Here a cache area is expected and it is definitely dependent on some checksum of foo.exe. I don't see an easy fix, here's what's happening: "parl foo.par script.pl" First par_mktmpdir (C function in myldr/mktmpdir.c) is called. It determines progname=.../parl, then uses the checksum of that file to construct the name $ENV{PAR_TEMP} and creates the directory if necessary. Then the Perl code in script/par.pl is run. It sets $progname=".../parl", detects that it is a PAR packed executable and extracts some files from the executable into $ENV{PAR_TEMP}. Next the code from package main in script/par.pl runs. By this time $progname has changed to "foo.par", but $ENV{PAR_TEMP} is already defined, so the checksum of foo.par doesn't matter. PAR->import is called and extracts the PAR archive into $ENV{PAR_TEMP}. Finally (with foo.par in @INC) script.pl is called via 'do "script.pl"'. "foo.exe ..." foo.exe is essentially the parl executable with foo.par piggy-backed. It goes thru the same motions as above, but $progname is always "foo.exe", hence the checksum (and $ENV{PAR_TEMP}) is based on foo.exe. Cheers, Roderich

Sat Apr 15 10:26:39 2006 smueller [...] cpan.org - Correspondence added

RT-Send-CC:

par [...] perl.org

Hi Roderich, thank you for the RT ticket. I have discussed the issue with Audrey and she thinks the cleanest solution would be to have parl sha1 any .par file it runs to find the correct cache. Another albeit non-preferred solution is to have parl not use caches at all. Still to be implemented. Takes welcome :) Steffen

Sat Apr 15 10:26:41 2006 The RT System itself - Status changed from 'new' to 'open'

Sat Apr 15 10:26:44 2006 smueller [...] cpan.org - Given to SMUELLER

Tue Apr 18 03:52:43 2006 RSCHUPP [...] cpan.org - Correspondence added

Show quoted text

> Still to be implemented. Takes welcome :)

Thanks for the clarification. The tricky thing for "parl foo.par ..." is that you have to (1) unpack parl.exe itself into the cache area first (otherwise parl would loose its feature to act as a standalone loader, i.e. you don't have to have PAR, Archive::Zip etc installed, even perl isn't needed) (2) unpack foo.par into the cache area but using a cache area name based on (2)... Cheers, Rodrich

Tue Apr 18 04:52:54 2006 smueller [...] cpan.org - Correspondence added

From:

smueller [...] cpan.org

On Di. 18. Apr. 2006, 03:52:43, RSCHUPP wrote: Show quoted text

> > Still to be implemented. Takes welcome :)

[sic], Takers. :) Show quoted text

> Thanks for the clarification. The tricky thing for > "parl foo.par ..." is that you have to > > (1) unpack parl.exe itself into the cache area first > (otherwise parl would loose its feature to act as a standalone > loader, i.e. you don't have to have PAR, Archive::Zip etc > installed, even perl isn't needed) > (2) unpack foo.par into the cache area > > but using a cache area name based on (2)...

Well, I had a look at the horrifying* C code. If you look at myldr/mktmpdir.c around lines 133 - 137. That code determines the parl (or pp-ed) executable file. That file is then sha1'd to get the name for the cache area. Now, when using parl instead of a pp-ed executable, there will be the name of the par archive to use with parl somewhere in argv[]. If we could modify lines 133-137 to determine the name (or rather: path) of the par archive, we could directly sha1 that. The problem I see is that one would need to a) reliably determine that parl is being used and not a pp-ed archive. b) reliably determine which argument represents the path to the par archive. As for a) Of course, it would be a moderately bad idea to assume that only the parl loader (and no pp-ed archive whatsoever) is named /^parl(?:\.exe|\.bin)$/i. But I cannot think of a better way to do it. As for b) I know there's various getopt libraries for C. I also found out some time that they vary from *nix to *nix and from *nix to win32, too. Ideas? * That's a property I attribute to all C code. Steffen

Tue Apr 18 10:40:07 2006 RSCHUPP [...] cpan.org - Correspondence added

Show quoted text

> Well, I had a look at the horrifying* C code. If you look at

Well, some of the Perl code is also horrifying :) In fact, large portions of the C code are duplicated in Perl in scripts/par.pl. The latter is somehow mogrified into parl.exe. I really can't tell whether the Perl or the C version of the code is called when you invoke parl.exe (maybe both?). You can't even explore par.pl by running it under the debugger because everything takes place inside a BEGIN block. Cheers, Roderich

Sun Jul 15 09:31:06 2007 smueller [...] cpan.org - Queue changed from PAR to PAR-Packer

Sun Jul 15 09:31:07 2007 smueller [...] cpan.org - Broken in 0.92 deleted

Sun Jul 15 09:31:35 2007 smueller [...] cpan.org - Broken in 0.975 added

Thu Sep 24 03:07:54 2009 TJC [...] cpan.org - Broken in 0.982 added

Thu Sep 24 03:07:54 2009 TJC [...] cpan.org - Broken in 0.991 added

Thu Sep 24 03:28:03 2009 TJC [...] cpan.org - Correspondence added

Ho ho ho. I just spent ages debugging /usr/bin/parl to arrive at this exact conclusion and came to CPAN to report it.. I attach a dirty patch that resolves it, but potentially breaks god- knows-what-else. Works for me..

diff --git a/myldr/mktmpdir.c b/myldr/mktmpdir.c index e15cc8b..6796a98 100644 --- a/myldr/mktmpdir.c +++ b/myldr/mktmpdir.c @@ -160,6 +160,14 @@ char *par_mktmpdir ( char **argv ) { if (progname == NULL) progname = argv[0]; + /* If running like /usr/bin/parl foo.par myscript.pl then argv[0] equals + * parl, and we don't want to base our checksum on that! Skip ahead one + * argument instead.. + */ + if (strstr(progname, "parl") && strstr(argv[1], ".par")) { + progname = argv[1]; + } + if ( !par_env_clean() && (f = open( progname, O_RDONLY | OPEN_O_BINARY ))) { lseek(f, -18, 2); read(f, buf, 6);

Thu Sep 24 09:34:23 2009 RSCHUPP [...] cpan.org - Correspondence added

On Thu Sep 24 03:28:03 2009, TJC wrote: Show quoted text

> I attach a dirty patch that resolves it,

Thanks for the patch, but... Show quoted text

> but potentially breaks god-knows-what-else.

...for starters, it breaks the test suite (and I think that can't be fixed). Sorry, if a solution would be that easy, we would have put it in a long time ago. I think that any solution should not meddle with the C code, but rather be implemented in par.pl. But par.pl also suffers from doing dual duty: being called from parl proper as well as from each pp-archive (second stage of bootstrapping). Cheers, Roderich

Thu Sep 24 22:00:39 2009 TJC [...] cpan.org - Correspondence added

On Thu Sep 24 09:34:23 2009, RSCHUPP wrote: Show quoted text

> On Thu Sep 24 03:28:03 2009, TJC wrote: > ...for starters, it breaks the test suite (and I think that can't > be fixed). Sorry, if a solution would be that easy, we would have > put it in a long time ago. I think that any solution should > not meddle with the C code, but rather be implemented in par.pl. > But par.pl also suffers from doing dual duty: being called > from parl proper as well as from each pp-archive (second stage > of bootstrapping).

Which bit of the test suite fails for you? I have it passing here with my patch.. Although it was notable that I had to do a "make realclean" and also remove the old system libraries first: /usr/share/perl5/PAR/StrippedPARL/Dynamic.pm /usr/share/perl5/PAR/StrippedPARL/Static.pm I agree the code isn't great, but it is a major bug that's been around for three and a half years, and it breaks parl badly. Any fix is better than no fix. If you let me know what is failing for you, I can try and resolve that.

Wed Sep 30 04:25:33 2009 RSCHUPP [...] cpan.org - Correspondence added

On Thu Sep 24 22:00:39 2009, TJC wrote: Show quoted text

> Which bit of the test suite fails for you? > I have it passing here with my patch.. Although it was notable that I > had to do a "make realclean" and also remove the old system libraries

first: Me bad, I should have known better. Now the test suite also passes for me and the behaviour seems as expected for the case of "parl foo.par bar.pl". (The only irritation was that cache directory used in this case was indeed different from the cache name "burned" into parl. But I expected it to be the SHA1 checksum of foo.par and it wasn't. Different bug.) I tightened your check for "am I invoked as 'parl foo.par ...'?" a bit (just paranoid), the only thing really necessary is to check that argv[1] is not NULL before passing it into a string function. Can you check whether the attached patch works for you?

diff -ubr PAR-Packer/myldr/Makefile.PL PAR-Packer-0.992_04/myldr/Makefile.PL --- a/myldr/Makefile.PL 2009-09-23 12:52:47.618181500 +0200 +++ b/myldr/Makefile.PL 2009-09-28 13:24:45.000000000 +0200 @@ -212,7 +212,7 @@ PERL=$perl LD=$ld CC=$cc -CFLAGS=$cflags +CFLAGS=$cflags -DPARL_EXE=\\"parl$exe\\" LDFLAGS=$Config{ldflags} PERL_LDFLAGS=$ldflags STATIC_LDFLAGS=$static_ldflags diff -ubr PAR-Packer/myldr/mktmpdir.c PAR-Packer-0.992_04/myldr/mktmpdir.c --- a/myldr/mktmpdir.c 2009-09-28 12:29:14.134471200 +0200 +++ b/myldr/mktmpdir.c 2009-09-30 09:57:07.188241200 +0200 @@ -160,6 +160,29 @@ if (progname == NULL) progname = argv[0]; + /* If invoked as "/usr/bin/parl foo.par myscript.pl" then progname should + * be ".../parl", and we don't want to base our checksum on that, but + * rather on "foo.par". + */ + { +#ifdef WIN32 +#define STREQ(a,b) (strcasecmp(a,b) == 0) +#else +#define STREQ(a,b) (strcmp(a,b) == 0) +#endif + int prog_len = strlen(progname); + int parl_len = strlen(PARL_EXE); + + if (prog_len >= parl_len + && STREQ(progname + prog_len - parl_len, PARL_EXE) + && (prog_len == parl_len || progname[prog_len - parl_len - 1] == dir_sep[0]) + && argv[1] + && strlen(argv[1]) >= 4 + && STREQ(argv[1] + strlen(argv[1]) - 4, ".par")) + progname = argv[1]; +#undef STREQ + } + if ( !par_env_clean() && (f = open( progname, O_RDONLY | OPEN_O_BINARY ))) { lseek(f, -18, 2); read(f, buf, 6);

Thu Oct 01 03:25:11 2009 TJC [...] cpan.org - Correspondence added

On Wed Sep 30 04:25:33 2009, RSCHUPP wrote: Show quoted text

> I tightened your check for "am I invoked as 'parl foo.par ...'?" > a bit (just paranoid), the only thing really necessary is > to check that argv[1] is not NULL before passing it into a string > function. > > Can you check whether the attached patch works for you?

The attached patch does still work for me. Thanks for improving it. The changes all look sensible.

Fri Nov 13 00:29:06 2009 TJC [...] cpan.org - Correspondence added

Any chance of this fix going through yet?

Fri Nov 13 03:23:07 2009 RSCHUPP [...] cpan.org - Correspondence added

Above patch checked into SVN.

Fri Nov 13 03:23:08 2009 RSCHUPP [...] cpan.org - Status changed from 'open' to 'resolved'