
This queue is for tickets about the Apache-LogRegex CPAN distribution.

Report information
The Basics
Id: 107511
Status: resolved
Priority: 0/
Queue: Apache-LogRegex

People
Owner: SPACEBAT [...] cpan.org
Requestors: Peter [...] PSDT.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: chomp() should go back to parse()
Date: Fri, 2 Oct 2015 11:17:03 -0700
To: bug-Apache-LogRegex [...] rt.cpan.org
From: Peter Scott <Peter [...] PSDT.com>
Recent versions removed

chomp($line)

from parse(). This appears to be a mistake. Consider the end of a regex generated for a format that ends in "%H %V %{SSL_PROTOCOL}x %D":

/(.+?)\s+(\S*)\s+(\S*)\s+(\S*)\s*/

Now match that against a chomped vs nonchomped string:

$_ = 'a  b c
';
### TWO spaces after 'a'
print Dumper [ /(.+?)\s+(\S*)\s+(\S*)\s+(\S*)\s*/ ];
chomp;
print Dumper [ /(.+?)\s+(\S*)\s+(\S*)\s+(\S*)\s*/ ];

$VAR1 = [
          'a',
          'b',
          'c',
          ''
        ];
$VAR1 = [
          'a',
          '',
          'b',
          'c'
        ];

The last \s+ matches newline.
Hi Peter,

I see your point, but I don't think the chomp is the problem. It's the change to cope with multiple spaces (\s+), which, as you noted, matches what it shouldn't and is ambiguous in the presence of unquoted elements that can be empty (\S*).

I'm planning on changing the multiple-space handling so that it applies only where it's not ambiguous (e.g. in between quoted elements).

Thanks for the report. I should have a commit for you to test soon.

On Fri Oct 02 14:17:39 2015, Peter@PSDT.com wrote:
> Recent versions removed
>
> chomp($line)
>
> from parse(). This appears to be a mistake. Consider the end of a
> regex generated for a format that ends in "%H %V %{SSL_PROTOCOL}x %D":
>
> /(.+?)\s+(\S*)\s+(\S*)\s+(\S*)\s*/
>
> Now match that against a chomped vs nonchomped string:
>
> $_ = 'a  b c
> ';
> ### TWO spaces after 'a'
> print Dumper [ /(.+?)\s+(\S*)\s+(\S*)\s+(\S*)\s*/ ];
> chomp;
> print Dumper [ /(.+?)\s+(\S*)\s+(\S*)\s+(\S*)\s*/ ];
>
> $VAR1 = [
>           'a',
>           'b',
>           'c',
>           ''
>         ];
> $VAR1 = [
>           'a',
>           '',
>           'b',
>           'c'
>         ];
>
> The last \s+ matches newline.
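To make the ambiguity concrete, here is a minimal sketch that runs the regex from the report against the same line, chomped and unchomped, and compares it with a stricter variant. The stricter pattern (exactly one space between unquoted fields, anchored at the end of the line) is only an assumption about what "not ambiguous" could look like, not the change that was actually committed. The captures() helper and the variable names are hypothetical. The sketch also assumes, as the report's chomped output suggests, that two spaces in a row mean an empty field.

```perl
use strict;
use warnings;

# Hypothetical helper: return the captures of $re against $line as an array ref.
sub captures {
    my ($re, $line) = @_;
    my @fields = $line =~ $re;
    return \@fields;
}

# Separator style from the report: \s+ between unquoted, possibly empty fields.
my $loose = qr/(.+?)\s+(\S*)\s+(\S*)\s+(\S*)\s*/;

# Assumed alternative (not the committed fix): one literal space between
# unquoted fields, with trailing whitespace absorbed only at the very end.
my $strict = qr/(.+?) (\S*) (\S*) (\S*)\s*\z/;

my $raw     = "a  b c\n";    # TWO spaces after 'a', trailing newline
my $chomped = "a  b c";      # the same line after chomp

for my $case ( [ 'loose,  raw    ', $loose,  $raw ],
               [ 'loose,  chomped', $loose,  $chomped ],
               [ 'strict, raw    ', $strict, $raw ],
               [ 'strict, chomped', $strict, $chomped ] ) {
    my ($label, $re, $line) = @$case;
    print "$label: ", join( '|', @{ captures( $re, $line ) } ), "\n";
}
# loose,  raw    : a|b|c|
# loose,  chomped: a||b|c
# strict, raw    : a||b|c
# strict, chomped: a||b|c
```

With the loose pattern, the final \s+ can consume the newline, so the first \s+ is free to swallow both spaces and the empty field moves to the end. With single-space separators the result no longer depends on whether the line was chomped.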
A commit that fixes this issue and tests for it is in the repo now. A developer release 1.70_1 has been uploaded to the CPAN for now in case anyone wants to look at it. Will release as 1.71 in a few days if it works out OK.
Version 1.71 has been released, which should fix the issue.