Subject: PurePerl not setting the utf8 flag
Two RSS feeds, both encoded in ISO-8859-1.
Feed One contains a literal British pound character:
<title>Extra £2.5m for July bomb victims</title>
Feed Two contains a character entity reference in place of the pound character:
<title>Boots injects £3.6m to help area where it closed down
factory</title>
Feed Two, when parsed, returns a literal pound character (so
encode_entities --> &pound;).
Feed One, when parsed, returns a UTF-8 encoded string which is not marked
as such, so encode_entities --> &Acirc;&pound;
However, if you parse Feed One and then call
Encode::decode('utf8', $item->title), the title is interpreted correctly.
Sorry if that is confusing. Essentially, the parser is returning UTF-8
encoded bytes, but without the utf8 flag set.
The libXML parser works fine.
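
A minimal sketch of the symptom and the workaround described above, without any parser involved. It assumes HTML::Entities is the source of encode_entities, and it builds the Feed One title by hand as the UTF-8 bytes C2 A3 with the utf8 flag off, which is what the PurePerl parser appears to return. The variable names and the utf8::is_utf8 guard are illustrative and not part of the original report.

```perl
use strict;
use warnings;
use Encode ();
use HTML::Entities qw(encode_entities);

# Stand-in for $item->title from Feed One (assumption): the pound sign as
# UTF-8 bytes (C2 A3) in a string whose utf8 flag is not set.
my $raw = "Extra \x{C2}\x{A3}2.5m for July bomb victims";

# Without decoding, each byte is treated as a separate Latin-1 character.
my $broken = encode_entities($raw);

# Workaround from the report: decode the bytes as UTF-8 first. The guard
# skips strings that are already flagged (e.g. from the libXML parser),
# so they are not decoded twice.
my $title = utf8::is_utf8($raw) ? $raw : Encode::decode('utf8', $raw);
my $fixed = encode_entities($title);

print "$broken\n";   # Extra &Acirc;&pound;2.5m for July bomb victims
print "$fixed\n";    # Extra &pound;2.5m for July bomb victims
```

The guard is only a stopgap for calling code; the parser setting the flag itself would make it unnecessary.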