
This queue is for tickets about the HTML-Clean CPAN distribution.

Report information
The Basics
Id: 1874
Status: new
Priority: 0/
Queue: HTML-Clean

People
Owner: Nobody in particular
Requestors: rkinyon [...] columbus.rr.com
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: (no value)
Fixed in: (no value)



Subject: Bug when removing comments
HTML-Clean-0.8
Perl 5.005_03
Solaris 5.8

The bug is in _jscomments(), in the first regex. The diff would look like:

213c213
< $js =~ s,\n\s*//.*?\n,\n,sig;
---
> $js =~ s,\n(\s*//.*?\n)+,\n,sig;

A good test case would be:

    asdf
    // sdfg
    // dfhj
    asdf

The version in 0.8 doesn't remove the second line of comments. The patch does.

The bug is a classic "ababa" regex problem. If you want to convert all 'aba' to 'a', the obvious regex is s/aba/a/g. However, the second 'a' in the first 'aba' is also the first 'a' in the second 'aba'. Because that 'a' has already been consumed by the first match, the regex pointer will be at the second 'b' in 'ababa'. The remaining 'ba' doesn't match /aba/, so you're left with 'ababa' => 'aba' instead of 'ababa' => 'a'. The fix is to do s/a(ba)+/a/g. In _jscomments(), the newline that ends one comment line plays the role of the shared 'a': it is also the newline that the match for the next comment line would need to start on.

I hope this explains the bug.

The reason I marked it as important is the following JavaScript example:

    function foo
    {
        var FooDoes = 0;
        // if (FooDoes)
        // {
        //     alert('Nothing');
        // }
    }

If you use HTML::Clean with both comments and javascript turned on, this will result in a snippet that looks like:

    function foo {var FooDoes = 0;// }}

As you can see, the closing brace of the function foo() is commented out.

If you have any comments/questions, please email me at rkinyon@columbus.rr.com

Thank you.
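
For anyone who wants to reproduce this without installing the module, here is a minimal standalone sketch. It only copies the two substitutions quoted in the diff into helper subs; the names strip_old and strip_new are made up for illustration and are not part of HTML::Clean, and the rest of _jscomments() is not reproduced.

```perl
use strict;
use warnings;

# Hypothetical helper: the substitution as quoted from HTML-Clean 0.8, line 213.
sub strip_old {
    my ($js) = @_;
    $js =~ s,\n\s*//.*?\n,\n,sig;
    return $js;
}

# Hypothetical helper: the patched substitution proposed in this report.
sub strip_new {
    my ($js) = @_;
    $js =~ s,\n(\s*//.*?\n)+,\n,sig;
    return $js;
}

# The "ababa" problem in isolation.
(my $naive = 'ababa') =~ s/aba/a/g;
(my $fixed = 'ababa') =~ s/a(ba)+/a/g;
print "naive: $naive\n";    # prints "naive: aba"
print "fixed: $fixed\n";    # prints "fixed: a"

# The test case from this report: two consecutive comment lines.
my $input = "asdf\n// sdfg\n// dfhj\nasdf\n";

print "--- old regex ---\n", strip_old($input);    # "// dfhj" survives
print "--- new regex ---\n", strip_new($input);    # both comment lines removed
```

With the old regex, the newline that ends "// sdfg" is consumed by the first match, so the second comment line no longer has a leading newline to match against. The patched regex consumes the whole run of comment lines in a single match.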