
This queue is for tickets about the HTML-StripScripts-Parser CPAN distribution.

Report information
The Basics
Id: 6123
Status: resolved
Priority: 0/
Queue: HTML-StripScripts-Parser

People
Owner: Nobody in particular
Requestors: shlomif [...] iglu.org.il
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: (no value)
Fixed in: (no value)



Subject: H::SS::P Strips local-URLed href="" Attributes Even When AllowHref and AllowSrc are Specified
When the following script:
<<<
#!/usr/bin/perl -w

use strict;

use HTML::StripScripts::Parser;

sub strip_file
{
    my $filename = shift;

    my $hss = HTML::StripScripts::Parser->new(
        {
            Context => 'Document',
            AllowSrc => 1,
            AllowHref => 1,
        },
    );

    $hss->parse_file($filename);

    local (*O);
    open O, ">$filename.js-less";
    print O $hss->filtered_document();
    close(O);
}

my $filename = shift;

strip_file($filename);
>>>
is run against the following file:
<<<
<html>
<body>
<p>
<a href="../ab.html">Shlomi Fish' Homepage</a>
</p>
</body>
</html>
>>>
where ../ab.html is any local URL, the href attribute of the link is stripped even though it does not contain any malicious code. It is a simple local link.

I'm using HTML-StripScripts-Parser-0.06 with HTML-StripScripts-0.02. I'm using perl 5.8.3-5mdk on:
<<<
Linux localhost.localdomain 2.6.3-8mdksmp #1 SMP Sat Apr 3 06:38:20 MST 2004 i686 unknown unknown GNU/Linux
>>>
(Mandrake 10.0 on a Pentium 4)

I am also able to reproduce this problem with perl 5.8.0 (vanilla) running on an old RedHat 6.2 box, so I don't think this problem is specific to one system.
Workaround: override the method that checks URL validity. Example attached:

#!/usr/bin/perl -w

use strict;

{
    package My::StripScripts;
    use base qw(HTML::StripScripts::Parser);

    # Accept whatever the parent class accepts, plus simple local URLs
    # made up only of dots, slashes and word characters.
    sub validate_href_attribute
    {
        my ($self, $text) = @_;

        $self->SUPER::validate_href_attribute($text)
            or $text =~ m#^[\.\/\w]{1,100}$#;
    }
}

sub strip_file
{
    my $filename = shift;

    my $hss = My::StripScripts->new(
        {
            Context => 'Document',
            AllowSrc => 1,
            AllowHref => 1,
        },
    );

    $hss->parse_file($filename);

    local (*O);
    open O, ">$filename.js-less";
    print O $hss->filtered_document();
    close(O);
}
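
To see which href values the pattern in the workaround lets through, it can be tried on its own, without the HTML::StripScripts modules. The following is only a sketch: the helper name accepts_local_url and the sample URLs are made up for illustration, and it exercises just the regular expression, not the parent class's check.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same pattern as in the workaround: 1 to 100 characters, each of which
# must be a dot, a slash or a word character.
my $local_url = qr#^[\.\/\w]{1,100}$#;

# Hypothetical helper: returns 1 if the pattern accepts the value, 0 otherwise.
sub accepts_local_url {
    my ($text) = @_;
    return $text =~ $local_url ? 1 : 0;
}

# Sample values chosen only for illustration.
for my $url ('../ab.html', 'docs/index.html', 'my-page.html',
             'ab.html#top', 'javascript:alert(1)') {
    printf "%-22s %s\n", $url,
        accepts_local_url($url) ? 'accepted' : 'rejected';
}
```

With these samples the first two values are accepted and the last three are rejected. The colon in javascript: URLs is outside the character class, which is what keeps script URLs out, but so are hyphens and fragment markers such as #top, so the class may need widening for real documents.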