Subject: Entity escaping fails in specific cases, e.g. "R&D;"
Due to a logic error, certain ampersands are not escaped properly. For example, in the string "R&D;" the ampersand is left unescaped. I propose to fix this by changing the encode_text sub to the following:
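sub encode_text {
    my $text = shift;
    # Replace the known named entities first.
    $text =~ s/&($entities);/$entity{$1}/g;
    # Escape every ampersand that does not start a numeric entity.
    $text =~ s/&(?!(#[0-9]+|#x[0-9a-fA-F]+);)/&amp;/g;
    $text =~ s/</&lt;/g;
    return $text;
}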
As you can see, the known entities are now replaced first. After that step only numeric (decimal or hexadecimal) entities should remain, so all other ampersands can be escaped safely.
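
To illustrate the effect of the new ordering, here is a small self-contained sketch. The %entity table and $entities pattern below are hypothetical stand-ins with just two entries, on the assumption that named entities map to numeric ones. The module's real table is not shown in this report and may differ.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the module's entity table (assumption:
# each known named entity maps to a numeric entity).
my %entity = (
    nbsp   => '&#160;',
    eacute => '&#233;',
);
my $entities = join '|', keys %entity;

sub encode_text {
    my $text = shift;
    # Step 1: rewrite known named entities to their numeric form.
    $text =~ s/&($entities);/$entity{$1}/g;
    # Step 2: escape every ampersand that does not start a numeric entity.
    $text =~ s/&(?!(#[0-9]+|#x[0-9a-fA-F]+);)/&amp;/g;
    # Step 3: escape opening angle brackets.
    $text =~ s/</&lt;/g;
    return $text;
}

print encode_text('R&D; budget'), "\n";                # R&amp;D; budget
print encode_text('caf&eacute; &#233; &#xE9;'), "\n";  # caf&#233; &#233; &#xE9;
print encode_text('a < b & c'), "\n";                  # a &lt; b &amp; c
```

With this ordering "&D;" is not in the table, so step 1 leaves it alone and step 2 escapes its ampersand, while numeric entities pass through unchanged.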
Kind regards,
Matthijs Mullender