Subject: A workaround to keep the original character encoding (accents, etc.)
This bug report relates mainly to the documentation. If there is not yet a better fix for this, please add the following workaround to the XML::RSS documentation, so that users can save a lot of time when handling international characters (accents, etc.). It took me many tests to find this solution.
Under some circumstances (this problem happened to me on one server, but not on another), XML::RSS 1.05 (the current version) converts our RSS feed text into Unicode, making it difficult to read if you place it on HTML pages with a different encoding such as ISO-8859-1.
It has previously been suggested to use $rss = new XML::RSS (encoding=>"ISO-8859-1"); but I tested it several times and it made no difference for me: the results were just as unreadable.
In short, the very simple solution was to convert every "&" in the feed into its numeric entity &#038; (or the hexadecimal form &#x26;) BEFORE parsing the feed with the XML::RSS module. One way to do this conversion in Perl is:
$feed_content =~ s/\&/\&\#038\;/g;
Only this conversion is needed; XML::RSS does the rest. For example, the Spanish accented "a" (á), written in the feed as the entity &aacute;, is transformed into &#038;aacute;, and XML::RSS then converts the &#038; back into &, which restores &aacute;. That is to say, the entity passes through the parser as literal text, so we keep the original encoding of the Spanish accented characters, etc. In this example, the steps are: &aacute; --> &#038;aacute; --> &aacute;
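To make the round trip concrete, here is a minimal sketch of the pre-escaping step on its own. The helper name escape_ampersands and the sample string are made up for illustration and are not part of XML::RSS. The parse call is shown only as a comment, so the snippet runs without the module installed.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::RSS): escape every "&" so that
# entity references such as &aacute; reach the parser as literal text.
sub escape_ampersands {
    my ($feed_content) = @_;
    $feed_content =~ s/&/&#038;/g;
    return $feed_content;
}

# Made-up sample fragment, for illustration only.
my $sample  = '<title>P&aacute;gina</title>';
my $escaped = escape_ampersands($sample);
print "$escaped\n";    # prints <title>P&#038;aacute;gina</title>

# The escaped text would then be handed to the parser, for example:
#   my $rss = XML::RSS->new;
#   $rss->parse($escaped);
```

Note that this escapes every "&", including those in entities such as &amp; that are already present in the feed. These also come back out of the parser as literal entity text, which is fine when the result goes straight into an HTML page but may not suit other uses.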
This solved the issue completely for me. Hope this helps.