Subject: | HTML::CGIChecker - Nonsensically ampersand in HTML unicode is escaped |
Date: | Tue, 29 Sep 2009 22:20:01 +0200 |
To: | <bug-HTML-CGIChecker [...] rt.cpan.org> |
From: | "Detlef Pilzecker" <DetlefPilzecker [...] web.de> |
Hi,
I am referring to:
Tomas Styblo > HTML-CGIChecker-0.90 > HTML::CGIChecker
In the last sub:
----------------------------------------------------------
# Escapes some dangerous characters.
# Ampersand "&" is escaped only if it is not part of a HTML entity.
# Therefore, users can post HTML entities. Ampersands that are part
# of an ordinary text are still properly escaped.
# Thanks to godless@hermes.slipstream.com for this idea.
sub _html_escape {
    my $self = shift;
    my ($in) = @_;
    for ($in) {
        s/&(?!\w+;)/&amp;/g;
        s/>/&gt;/g;
        s/</&lt;/g;
        s/"/&quot;/g;
    }
    return $in;
}
----------------------------------------------------------
I found a bug:
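The ampersand "&" is also escaped if it is part of an HTML numeric (Unicode) character reference ( &#xxxx; ), because \w does not match "#" and so the negative lookahead does not recognise the reference as an entity.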
To fix this, replace
    s/&(?!\w+;)/&amp;/g;
with
    s/&(?![#\w]+;)/&amp;/g;
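The difference can be reproduced outside the module with a minimal sketch. It assumes the replacement text of the substitution is the "&amp;" entity, and escape_old and escape_new are hypothetical stand-alone helpers used only for this comparison, not part of HTML::CGIChecker:

```perl
use strict;
use warnings;

# Hypothetical stand-alone copies of the ampersand substitution only,
# so that the two patterns can be compared side by side.
sub escape_old {
    my ($in) = @_;
    $in =~ s/&(?!\w+;)/&amp;/g;      # pattern currently in _html_escape
    return $in;
}

sub escape_new {
    my ($in) = @_;
    $in =~ s/&(?![#\w]+;)/&amp;/g;   # proposed pattern, also accepts "#"
    return $in;
}

# A numeric reference, a named entity, and a plain-text ampersand.
for my $text ('&#8364; 5', '&euro; 5', 'AT&T') {
    print "input: $text\n";
    print "  old: ", escape_old($text), "\n";
    print "  new: ", escape_new($text), "\n";
}
```

With the old pattern "&#8364;" comes out as "&amp;#8364;", while the new pattern leaves it untouched. The named entity "&euro;" is preserved and the plain ampersand in "AT&T" is escaped by both.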
Regards
Detlef Pilzecker
Weitlahnerstaße 8
83209 Prien am Chiemsee