Subject: HTML::Entities misses at least one Unicode (high bit) Character
I think I've found a problem that causes HTML::Entities to leave a
character unencoded (with both numeric and named entity encoding).
I've attached a .tgz archive that includes a small snippet of malformed
UTF-8 and a small test that demonstrates the problem. To reproduce:
% tar xvf missedentity.tgz
% ./go.pl > out
% vi out
The "out" file will contain:
Einar [Aacute]gú Frið
Of course, the [Aacute] should have been encoded.
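
For comparison, here is a minimal sketch of the same kind of check run on
well-formed input. It is not the attached go.pl (that script is not
reproduced in this report); the sample bytes and the variable names are
assumptions made up for illustration. It uses only encode_entities and
encode_entities_numeric from HTML::Entities and decode from Encode.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);
use HTML::Entities qw(encode_entities encode_entities_numeric);

# Assumed sample input: well-formed UTF-8 bytes for "Einar Ágú Frið".
# The snippet in the attachment is malformed and will differ from this.
my $bytes = "Einar \xC3\x81g\xC3\xBA Fri\xC3\xB0";

# Turn the raw bytes into a Perl character string before encoding.
my $chars = decode('UTF-8', $bytes);

# Named entities where one exists, and numeric entities throughout.
my $named   = encode_entities($chars);
my $numeric = encode_entities_numeric($chars);

# Both lines should be pure ASCII if every high-bit character was encoded.
print "$named\n$numeric\n";
```

If both lines come out as pure ASCII here while the attached test still
leaves a raw character in "out", that would suggest the malformed input,
rather than a missing entry in the entity table, is what triggers the
problem. That is only a guess until the attachment is examined.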
I know this is easy to say, and very annoying, but given this entity is
missing, how many others may also be missing?
My system details:
Redhat Fedora 4
Perl 5.8.6
HTML::Parser 3.50
HTML::Entities 1.32
Subject: missedentity.tgz