Subject: | regex to remove entities not spec compliant |
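The regex that removes entity declarations from the internal subset is not
spec compliant: it matches entity names with \w+, but XML names may also
contain '.', '-' and ':'. The attached patch adds those characters to the
matched character class. See the Name production in the XML spec: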
http://www.w3.org/TR/2006/REC-xml-20060816/#NT-Name
I found this while trying to delete all the entities from a file: the
output still had all the entities.
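# $xTwig is an XML::Twig object that has already parsed the input file
my $entity_list = $xTwig->entity_list();
foreach my $entity ($entity_list->list()) {
    my $ent_name = $entity->name();
    $entity_list->delete($ent_name);
}
# write the document back out, regenerating the DTD from the entity list
open my $out_fh, '>', $out_file or die "cannot open $out_file: $!";
print {$out_fh} $xTwig->sprint( Update_DTD => 1 );
close $out_fh or die "cannot close $out_file: $!";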
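The mismatch can be reproduced without XML::Twig. The following is a minimal sketch only: the internal subset and its entity names (copy, my-ent.v1, ns:ext) are made up for illustration, and the two substitutions are copied from the original Twig.pm line and from the patched line respectively.

```perl
use strict;
use warnings;

# Hypothetical internal subset; the entity names are invented for this example.
my $internal = <<'DTD';
<!ENTITY copy "(c)">
<!ENTITY my-ent.v1 "hyphen and dot">
<!ENTITY ns:ext SYSTEM "ext.xml">
DTD

# Original substitution: the entity name is matched with \w+ only.
(my $old = $internal) =~ s{<! \s* ENTITY \s+ \w+ \s+ ( ("[^"]*"|'[^']*') \s* | SYSTEM [^>]*) >\s*}{}xg;

# Patched substitution: the name may also contain '.', '-' and ':'.
(my $new = $internal) =~ s{<! \s* ENTITY \s+ [\w\.\-\:]+ \s+ ( ("[^"]*"|'[^']*') \s* | SYSTEM [^>]*) >\s*}{}xg;

print "left by original regex:\n$old";
print "left by patched regex:\n$new";
```

With this input the original regex removes only the copy declaration and leaves the other two in place, while the patched regex removes all three.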
Subject: | Twig.patch |
--- OrigTwig.pm 2006-09-18 14:23:48.000000000 -0400
+++ Twig.pm 2006-11-06 16:55:02.636673500 -0500
@@ -2591,7 +2591,7 @@ sub prolog
# awfull hack, but at least it works a little better that what was there before
if( $internal)
{ # remove entity declarations (they will be re-generated from the updated entity list)
- $internal=~ s{<! \s* ENTITY \s+ \w+ \s+ ( ("[^"]*"|'[^']*') \s* | SYSTEM [^>]*) >\s*}{}xg;
+ $internal=~ s{<! \s* ENTITY \s+ [\w\.\-\:]+ \s+ ( ("[^"]*"|'[^']*') \s* | SYSTEM [^>]*) >\s*}{}xg;
$internal=~ s{^\n}{};
}
$internal .= $t->entity_list->text ||'' if( $t->entity_list);