Subject: HTML::Scrubber and UTF-8
Date: Tue, 12 Sep 2006 01:58:38 +0200
To: bug-HTML-Scrubber [...] rt.cpan.org
From: Frédéric Buclin <lpsolit [...] gmail.com>
Hello,
While using HTML::Scrubber with pages encoded as UTF-8, I'm filling my
web server error log with the following message:
    Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/lib/perl5/site_perl/5.8.7/HTML/Scrubber.pm line 322.
To fix this problem, I wrote:
    # Avoid filling the web server error log.
    # In HTML::Scrubber 0.08, the HTML::Parser object is stored in
    # the "_p" key, but this may change in future versions.
    if (exists $scrubber->{_p} && $scrubber->{_p}->can('utf8_mode')) {
        $scrubber->{_p}->utf8_mode(1);
    }
But this looks like a big hack, especially because I'm playing with an
internal key.
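One possible alternative that stays away from the internal key is to decode the page into Perl characters before handing it to the scrubber, which is the approach the HTML::Parser documentation suggests for this warning. The following is only a minimal sketch under assumptions: the input is taken to be a UTF-8 byte string, and `$raw_html` and the `allow` list are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use HTML::Scrubber;

# Hypothetical input: UTF-8 encoded bytes, as they might arrive from a request.
my $raw_html = "<p>Fr\xC3\xA9d\xC3\xA9ric &amp; co</p><blink>hi</blink>";

# Decode the bytes into Perl characters, so the parser is never
# fed undecoded UTF-8.
my $html = decode('UTF-8', $raw_html);

# Illustrative rule set: allow only <p>; everything else is stripped.
my $scrubber = HTML::Scrubber->new(allow => [qw(p)]);
my $clean = $scrubber->scrub($html);

# Encode back to UTF-8 bytes before sending the result anywhere.
print encode('UTF-8', $clean), "\n";
```

Whether this fits depends on where the decoding is supposed to happen in the application. If the rest of the code expects byte strings, the result has to be encoded again as shown on the last line.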
Can you tell me a bit more about this warning? Is there a better way to
prevent these messages?
Thanks in advance,
Frédéric Buclin
QA Leader for the Bugzilla project
http://www.bugzilla.org