
This queue is for tickets about the PAR-Packer CPAN distribution.

Report information
The Basics
Id: 39233
Status: stalled
Priority: 0/
Queue: PAR-Packer

People
Owner: Nobody in particular
Requestors: dave_clarke [...] merck.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Suspected buffer overflow while running executable made by Par::Packer
Date: Fri, 12 Sep 2008 16:26:41 -0400
To: <bug-PAR [...] rt.cpan.org>
From: "Clarke, Dave S" <dave_clarke [...] merck.com>
I have encountered an error while running an executable made by Par::Packer. It occurs as part of a regexp pattern match. The error has been narrowed down to this snippet of code (problem bolded):

    $_ = q[TEXT="some_very_large_text_to_be_extracted_between_double_quotes"];
    my $rQStr = qr/"((?:""|[^"])*)"/;    # String between double quotes (")
    my $key = 'TEXT';
    if (m/${key}=${rQStr}/) {
        $_ = $1;

$_ at the start of the pattern match contains a large string (> 7000 characters). The perl script executes flawlessly, and has been in production for over a year. I recently distributed this script and associated modules to other users using Par::Packer. The .exe generated by Par::Packer works for strings up to 7261 characters, but fails silently at 7262 characters between the quotes. In other words, the .exe just exits without issuing any kind of error message.

Perl Version:
    This is perl, v5.8.8 built for MSWin32-x86-multi-thread
    Binary build 820 [274739] provided by ActiveState http://www.ActiveState.com
    Built Jan 23 2007 15:57:46

Module Versions (installed from bribes):
    PAR 0.982
    PAR-Dist 0.31
    PAR-Packer 0.982

OS Version:
    Microsoft Windows XP Professional
    Version 5.1.2600 Service Pack 2 Build 2600

If you need any additional information or explanation of the problem, please e-mail or call me using the info below.

Dave Clarke
Aker Solutions Representative at Merck & Co., Inc. - Business Confidential
Phone: (215) 993-3015
Email: dave_clarke@merck.com

Notice: This e-mail message, together with any attachments, contains information of Merck & Co., Inc. (One Merck Drive, Whitehouse Station, New Jersey, USA 08889), and/or its affiliates (which may be known outside the United States as Merck Frosst, Merck Sharp & Dohme or MSD and in Japan, as Banyu - direct contact information for affiliates is available at http://www.merck.com/contact/contacts.html) that may be confidential, proprietary copyrighted and/or legally privileged.
It is intended solely for the use of the individual or entity named on this message. If you are not the intended recipient, and have received this message in error, please notify us immediately by reply e-mail and then delete it from your system.
Unfortunately, I'm unable to reproduce the problem. I could only check on Linux, though.

Would it be possible to provide an entire test script which fails if you package and run it? I'll try to dig up a working Virtual Machine for testing on win32.

Best regards,
Steffen
Subject: Re: [rt.cpan.org #39233] Suspected buffer overflow while running executable made by Par::Packer
Date: Sat, 13 Sep 2008 11:38:24 +0200
To: bug-PAR [...] rt.cpan.org
From: Steffen Mueller <wyp3rlx02 [...] sneakemail.com>
Hi again,

Clarke, Dave S via RT wrote:
> Fri Sep 12 16:31:35 2008: Request 39233 was acted upon.
> Transaction: Ticket created by dave_clarke@merck.com
> Queue: PAR
> Subject: Suspected buffer overflow while running executable made by Par::Packer
[...]
> $_ =
> q[TEXT="some_very_large_text_to_be_extracted_between_double_quotes"];
> my $rQStr = qr/"((?:""|[^"])*)"/;
> # String between double quotes (")
> my $key = 'TEXT';
> if (m/${key}=${rQStr}/) {
> $_ = $1;
It's likely that this isn't a PAR::Packer issue after all. That regex was triggering some alarms when I first saw it, but I wasn't entirely sure at the time. Since then, I talked to somebody who knows Perl regexes inside out and he simply called it a "diabolical pattern". It'll use one stack frame per input character. The details aren't clear to me, but it's pretty obvious that by replacing the regex with a less evil one, you'll fix the code.

The quick suggestion was:

    my $rQStr = qr/"((?>(""|[^"]+)*))"/;

He wasn't entirely sure it'd be right because it was way past end of work time for both of us.

Best regards,
Steffen
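Since the suggested pattern was offered without a guarantee, one way to gain some confidence in it is to compare its capture with that of the original pattern on short inputs, where the original is known to work. The following is only a minimal sketch: the helper name `extract_quoted` and the sample strings are illustrative assumptions, not something taken from the reporter's script.

```perl
use strict;
use warnings;

my $original  = qr/"((?:""|[^"])*)"/;       # pattern from the report
my $suggested = qr/"((?>(""|[^"]+)*))"/;    # pattern suggested above

# Hypothetical helper: return the text captured between the double
# quotes after KEY=, or undef if the pattern does not match.
sub extract_quoted {
    my ($re, $key, $text) = @_;
    return $text =~ m/\Q$key\E=$re/ ? $1 : undef;
}

# Illustrative samples (assumptions): plain, empty, embedded "" pairs,
# and a value that ends with an embedded "" pair.
my @samples = (
    q[TEXT="plain"],
    q[TEXT=""],
    q[TEXT="with an ""embedded quotation"" in it"],
    q[TEXT="trailing ""quote"""],
);

for my $sample (@samples) {
    my $old = extract_quoted($original,  'TEXT', $sample);
    my $new = extract_quoted($suggested, 'TEXT', $sample);
    my $same = defined $old && defined $new && $old eq $new;
    print(($same ? 'same' : 'DIFFERENT'), ": $sample\n");
}
```

On these four samples both patterns should report `same`. This says nothing about very long inputs, which is where the reported failure occurs.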
Subject: RE: [rt.cpan.org #39233] Suspected buffer overflow while running executable made by Par::Packer
Date: Mon, 15 Sep 2008 13:03:20 -0400
To: <bug-PAR [...] rt.cpan.org>
From: "Clarke, Dave S" <dave_clarke [...] merck.com>
Hi Steffen,

Thanks for your quick response to this. This was work related, so I put it aside over the weekend.

I did take a second look at the regexp. If you knew what I was trying to parse, you may not think it was quite so diabolical. I was parsing some text that included some strings in double quotes. The odd part is that, instead of escaping an embedded double quote with a backslash, they escape it with a second double quote. Therefore, I am dealing with a string like the following example:

    TEXT="this is a string with an ""embedded quotation"" in it"

I decided it was much easier to parse if I replaced the consecutive double quotes with a control character, matched a much simpler quoted string, then restored the embedded double quote.

The example you sent looked like it was using an experimental feature [?>] -- or maybe I'm looking at old documentation.

Anyways, I have a good solution to the problem for now. However, there was a difference between the way the interpreted perl code ran, and the .exe created by Par::Packer. If I can create a simple script, and .exe that I can forward to you, I will.

Again, thanks for your help.
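The workaround described in this message (hide the doubled quotes, match a simple quoted string, restore the quotes) could look roughly like the sketch below. The actual code was not posted to the ticket, so the helper name `extract_by_placeholder`, the choice of `\x01` as the control character, and the decision to restore each pair as a single `"` are all assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: extract the value of KEY="..." where an embedded
# double quote is written as two double quotes ("").
sub extract_by_placeholder {
    my ($key, $text) = @_;
    my $placeholder = "\x01";    # assumed never to occur in the input

    # 1. Hide every doubled quote behind the placeholder.
    (my $work = $text) =~ s/""/$placeholder/g;

    # 2. Match a simple quoted string; no alternation inside a star.
    return undef unless $work =~ m/\Q$key\E="([^"]*)"/;
    my $value = $1;

    # 3. Restore the embedded quotes (here as a single " character).
    $value =~ s/$placeholder/"/g;
    return $value;
}

print extract_by_placeholder('TEXT',
    q[TEXT="this is a string with an ""embedded quotation"" in it"]), "\n";
```

As written, this sketch mishandles an empty value (`TEXT=""`) and a value that begins with an embedded quote, because the substitution in step 1 cannot tell those cases apart from a doubled quote. Whether the real data contains such cases is not stated in the ticket.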
Subject: Re: [rt.cpan.org #39233] Suspected buffer overflow while running executable made by Par::Packer
Date: Mon, 15 Sep 2008 20:20:45 +0200
To: bug-PAR [...] rt.cpan.org
From: Steffen Mueller <wyp3rlx02 [...] sneakemail.com>
Hi Dave,

Clarke, Dave S via RT wrote:
[...]
> I did take a second look at the regexp. If you knew what I was trying
> to parse, you may not think it was quite so diabolical.
Well, I wasn't considering your goal diabolical nor the means, but the specific regular expression which, according to the expert, uses a (C) stack frame per character, which is a major issue.

From what I know, it's quite possible that the condition is being triggered *both* with and without PAR::Packer. In the PAR::Packer case, more code has been run, and the whole script is essentially running inside an 'eval""'. Maybe the problem just manifests earlier in that case? I'm just speculating, though.

Are you seeing the problem when you're running the generated executable on the exact same system, or is it a different computer or OS installation?
> The example you sent looked like it was using an experimental feature
> [?>] -- or maybe I'm looking at old documentation.

You're right. In 5.8.8, this is marked as "highly experimental". I don't have a 5.10.0 handy, but I suspect it's not experimental any more. In 5.11.0, that construct is absolutely not flagged as experimental any more! Given that the advice came from the person who wrote almost all of the improvements in the regexp engine for 5.10.0, he'd naturally use advanced features.
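For readers who would rather avoid `(?>...)` on 5.8.8, a commonly used alternative is to "unroll" the alternation so that the starred group runs once per embedded `""` pair instead of once per character. This pattern does not come from the ticket. It is only a sketch, and the names `$rQStrUnrolled` and `extract_unrolled` are made up for illustration.

```perl
use strict;
use warnings;

# Opening quote, a run of non-quotes, then zero or more groups of
# (doubled quote followed by a run of non-quotes), then the closing quote.
my $rQStrUnrolled = qr/"([^"]*(?:""[^"]*)*)"/;

# Hypothetical helper: return the text between the quotes after KEY=,
# with embedded "" pairs left as they are, or undef if there is no match.
sub extract_unrolled {
    my ($key, $text) = @_;
    return $text =~ m/\Q$key\E=$rQStrUnrolled/ ? $1 : undef;
}

my $value = extract_unrolled('TEXT',
    q[TEXT="A long string with an ""embedded quotation"" in it"]);
print "matched: $value\n" if defined $value;
```

It uses only grouping, character classes and `*`, none of which are marked experimental. Whether it behaves any differently inside a packed executable was not tested in this ticket.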
> Anyways, I have a good solution to the problem for now. However, there
> was a difference between the way the interpreted perl code ran, and the
> .exe created by Par::Packer. If I can create a simple script, and .exe
> that I can forward to you, I will.

Reporting the issue was entirely valid, no doubt. If you can produce a simple script, that would be much appreciated. I'll mark the issue as resolved, but a simple reply will reopen the ticket.

Best regards,
Steffen
Subject: RE: [rt.cpan.org #39233] Suspected buffer overflow while running executable made by Par::Packer
Date: Mon, 15 Sep 2008 17:09:32 -0400
To: <bug-PAR [...] rt.cpan.org>
From: "Clarke, Dave S" <dave_clarke [...] merck.com>
Steffen,

I reduced the script to the following, which still demonstrates the problem. I am running both the script and the .exe on the same machine.

By the way, I tried using the regexp that you supplied, and it runs fine in both the interpreted and compiled mode. It makes a lot more sense, when I take a few minutes to study it.

    use strict;
    use warnings;

    my $pre    = q[TEXT="];
    my $filler = q[A long string with an ""embedded quotation"" in it] x 10;
    my $post   = q["];
    my $key    = 'TEXT';
    my $rQStr1 = qr/"((?>(""|[^"]+)*))"/;    # RE that Steffen recommends
    my $rQStr2 = qr/"((?:""|[^"])*)"/;       # RE that crashes

    for my $loop (1..100) {
        $_ = $pre . ( $filler x $loop) . $post;
        my $size = length($_);
        if ( m/${key}=${rQStr1}/) {
            warn "Iteration $loop: RE1: String size $size, matched size " . length($1) . "\n";
        }
        if ( m/${key}=${rQStr2}/) {
            warn "Iteration $loop: RE2: String size $size, matched size " . length($1) . "\n";
        }
    }
    __END__
    perl h.pl
    pp -o h.exe h.pl
    h.exe

The results are interesting: The script starts generating warnings for $rQStr2 on loop 69-100. The .exe stops after loop 8.

    ...
    Iteration 8: RE1: String size 4007, matched size 4000
    ...
    Iteration 68: RE1: String size 34007, matched size 34000
    Iteration 68: RE2: String size 34007, matched size 34000
    Iteration 69: RE1: String size 34507, matched size 34500
    Complex regular subexpression recursion limit (32766) exceeded at h.pl line 19.
    Iteration 69: RE2: String size 34507, matched size 34122
    ...
    Iteration 100: RE1: String size 50007, matched size 50000
    Complex regular subexpression recursion limit (32766) exceeded at h.pl line 20.
    Iteration 100: RE2: String size 50007, matched size 34122

Environment:

Perl Version:
    This is perl, v5.8.8 built for MSWin32-x86-multi-thread
    Binary build 820 [274739] provided by ActiveState http://www.ActiveState.com
    Built Jan 23 2007 15:57:46

Module Versions (installed from bribes):
    PAR 0.982
    PAR-Dist 0.31
    PAR-Packer 0.982

OS Version:
    Microsoft Windows XP Professional
    Version 5.1.2600 Service Pack 2 Build 2600
I'm marking this issue as stalled as it's a) been replaced with a better regexp, b) not clear how to "fix" it, but most importantly: c) still interesting and not understood.

Best regards,
Steffen