On Oct 23, 2006, at 2:14, Stephen Hall via RT wrote:
> HTML::Entities::encode_entities_numeric should be used for entity
> encoding, instead of HTML::Entities::encode_entities. The reason is
> that XML allows only 5 named entities - &amp; &lt; &gt; &quot;
> &apos; -
> all other entities must be encoded numerically. This means XML::RSS
> v1.11 produces illegal XML when encoding entities other than the 5
> above.
Doh! I'll make a 1.12 with that fixed.
It should be okay to use named entities in CDATA fields, right?
- ask
Index: lib/XML/RSS.pm
===================================================================
--- lib/XML/RSS.pm (revision 7967)
+++ lib/XML/RSS.pm (working copy)
@@ -2,7 +2,7 @@
use strict;
use Carp;
use XML::Parser;
-use HTML::Entities qw(encode_entities);
+use HTML::Entities qw(encode_entities_numeric encode_entities);
use vars qw($VERSION $AUTOLOAD $modules $AUTO_ADD);
use base qw(XML::Parser);
@@ -1684,9 +1684,11 @@
my $encoded_text = '';
while ( $text =~ s/(.*?)(\<\!\[CDATA\[.*?\]\]\>)//s ) {
+ # we use &named; entities here because it's HTML
$encoded_text .= encode_entities($1) . $2;
}
- $encoded_text .= encode_entities($text);
+ # we use numeric entities here because it's XML
+ $encoded_text .= encode_entities_numeric($text);
return $encoded_text;
}
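
To illustrate the difference the patch relies on, here is a minimal sketch comparing the two HTML::Entities functions on the same string. The sample text and the variable names ($sample, $named, $numeric) are assumptions made for illustration and are not taken from XML::RSS.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities encode_entities_numeric);

# Hypothetical sample text with two non-ASCII characters and an ampersand.
my $sample = "caf\x{e9} & cr\x{e8}me";

# Named entities: valid in HTML, but &eacute; and &egrave; are not
# predefined in XML, so a plain XML parser will reject them.
my $named = encode_entities($sample);

# Numeric character references: valid in any XML document.
my $numeric = encode_entities_numeric($sample);

print "$named\n";      # caf&eacute; &amp; cr&egrave;me
print "$numeric\n";    # caf&#xE9; &#x26; cr&#xE8;me
```

Note that encode_entities_numeric also encodes the ampersand numerically (&#x26;) rather than as &amp;. Both forms are legal XML.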