Subject: HTML::Sanitizer security: XSS vulnerability, any configuration
HTML::Sanitizer-0.01, tested under perl-5.008 on FreeBSD:
Any web application using HTML::Sanitizer to clean up untrusted HTML is likely to be vulnerable to cross-site scripting attacks, because HTML::Sanitizer fails to escape < and > characters in non-tag text. HTML::Parser (and hence HTML::TreeBuilder) considers incomplete tags to be text, even if they contain < and/or > characters.
For example, if the input to HTML::Sanitizer is:
<img src="javascript:alert(1)"
then the output is just the same, even if HTML::Sanitizer is configured to reject all tags and attributes.
When a web application puts that text into an output page, you might get something like:
<hr>
<img src="javascript:alert(1)"
<hr>
and many browsers will treat the > of the second <hr> as terminating the <img> tag, and run the JavaScript.
Demonstration:
#!/usr/local/perl-5.008/bin/perl -w
use strict;
use HTML::Sanitizer;
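# Permit only <strong> and <em>; every other tag should be rejected.
my $safe = HTML::Sanitizer->new;
$safe->permit_only(
qw/ strong em /,
);
# An incomplete tag: note the missing closing >.
my $evil_html = '<img src="javascript:alert(1)"';
# Prints the input unchanged, with the < left unescaped.
print $safe->filter_as_html_fragment($evil_html), "\n";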
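Possible workaround until the module escapes non-tag text itself: post-process the sanitizer's output and escape every < and > that is not part of a complete permitted tag. The following is only a minimal sketch. The helper name escape_stray_angle_brackets is hypothetical and not part of HTML::Sanitizer, and it assumes the permitted tags carry no attributes, as in the strong/em configuration above.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of HTML::Sanitizer.
# Keeps complete, attribute-free tags from the allow-list and escapes
# every other < and > so that a browser cannot read them as markup.
sub escape_stray_angle_brackets {
    my ($html, @allowed) = @_;

    # Pattern for a complete permitted tag; with an empty allow-list,
    # use a pattern that never matches so everything gets escaped.
    my $tag_re = qr{(?!)};
    if (@allowed) {
        my $names = join '|', map { quotemeta($_) } @allowed;
        $tag_re = qr{</?(?:$names)>}i;
    }

    my $out = '';
    # split with a capturing group returns the permitted tags as well
    # as the text between them.
    for my $piece (split m{($tag_re)}, $html) {
        if ($piece !~ m{\A$tag_re\z}) {
            $piece =~ s/</&lt;/g;
            $piece =~ s/>/&gt;/g;
        }
        $out .= $piece;
    }
    return $out;
}

my $evil_html = '<img src="javascript:alert(1)"';
print escape_stray_angle_brackets($evil_html, qw/ strong em /), "\n";
# prints: &lt;img src="javascript:alert(1)"
```

With the incomplete tag escaped this way, the > of a following <hr> no longer completes an <img> tag. This does not replace a fix in HTML::Sanitizer itself; it only covers the case described in this report.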