Subject: UTF-8 transformation is broken with UTF-8 data
When CGI parameters contain UTF-8 character data, CGI::XML attempts to
encode it to UTF-8 itself. This code probably pre-dates Perl's native
UTF-8 support and is incompatible with it. Example:
perl -MCGI -MCGI::XML -e '$q = CGI->new({ param=>"\x{017D}" }); print CGI::XML::toXML($q), "\n"'
Wide character in print at -e line 1.
<param>Ž</param>
One solution would be to use Encode::encode('utf-8', $data) to encode
all characters to UTF-8 bytes, but I would prefer CGI::XML to keep the
characters untouched and let the caller encode them to the output
encoding, possibly using a character-aware filehandle with a PerlIO
encoding layer.
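To illustrate the two approaches, here is a minimal sketch. It does not call CGI::XML; the $xml string is a stand-in for what toXML would return if it left the characters untouched, and the in-memory filehandle is used only so the result can be inspected.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Stand-in for toXML output that still contains a wide character (U+017D).
my $xml = "<param>\x{017D}</param>\n";

# Option 1: encode explicitly to UTF-8 bytes before printing.
my $bytes = encode('UTF-8', $xml);

# Option 2: let a PerlIO encoding layer encode on output.
# An in-memory filehandle stands in for STDOUT or a file here.
my $buffer = '';
open(my $fh, '>:encoding(UTF-8)', \$buffer) or die "open failed: $!";
print {$fh} $xml;
close($fh) or die "close failed: $!";

# Both routes should yield the same byte string.
print "same bytes\n" if $buffer eq $bytes;
```

For real output the caller would typically apply the layer to STDOUT with binmode(STDOUT, ':encoding(UTF-8)'), which also avoids the "Wide character in print" warning.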
I suggest removing the last substitution in QuoteXMLChars and deleting
the then-unused function XmlUtf8Encode.
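As a rough sketch of what quoting without the UTF-8 step could look like: the function below is hypothetical and is not the module's source. It assumes the remaining job of QuoteXMLChars is to escape XML metacharacters, and it leaves every other character, including wide ones, as-is.

```perl
use strict;
use warnings;

# Hypothetical stand-in for QuoteXMLChars, not the actual CGI::XML code.
# Escapes XML metacharacters only; non-ASCII characters pass through.
sub quote_xml_chars {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must run first so later entities stay intact
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# U+017D is preserved as a single character in the result.
my $quoted = quote_xml_chars("\x{017D} & <b>");
```

The caller then decides how to encode the result, for example with an encoding layer on the output filehandle.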