Subject: HTML::Element's _xml_escape should be left to a filter that knows what the encodings involved are
_xml_escape, as applied by as_XML (which Class::DBI::AsForm calls), was corrupting data on round trips whenever Unicode was involved.
My workaround was to assign an empty sub to _xml_escape.
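For anyone who wants to try the same thing, here is a minimal sketch of that workaround. It assumes HTML::Element is installed and that the installed version still reaches the internal _xml_escape function through the package symbol table at run time; newer releases may have changed these internals, so treat it as an illustration rather than a supported API.

```perl
use strict;
use warnings;
use HTML::Element;

{
    # Replacing an existing sub triggers a "redefined" warning; silence it here only.
    no warnings 'redefine';

    # _xml_escape works in place on its arguments, so a sub that does nothing
    # leaves every string it is handed untouched.
    *HTML::Element::_xml_escape = sub { return };
}
```

Note that this switches off all of the escaping done by that function, not only the numeric references for non-ASCII characters, so the caller most likely becomes responsible for markup characters such as & and < as well.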
My guess is that the data was decoded as Latin-1 or something similar by the browser (despite the meta http-equiv specifying UTF-8, and the server agreeing with it in the Content-Type header).
This data was then sent back to the server, but it was Unicode reinterpreted as Latin-1 and then converted into Unicode again, so wide characters were turned into accented narrow ones from the Latin-1 range.
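The kind of corruption I am guessing at can be reproduced with the core Encode module alone. This is only a sketch of the suspected mechanism, not a trace of what the browser actually did; the sample character and the variable names are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $original = "\x{263A}";                      # one wide character (hypothetical sample)
my $bytes    = encode('UTF-8', $original);      # its UTF-8 form: bytes E2 98 BA
my $misread  = decode('ISO-8859-1', $bytes);    # each byte misread as one Latin-1 character
my $stored   = encode('UTF-8', $misread);       # the misread text converted to UTF-8 again

printf "%d char -> %d bytes -> %d chars -> %d bytes\n",
    length($original), length($bytes), length($misread), length($stored);
```

One wide character goes in, and three narrow characters (starting with an accented "a", U+00E2) come back, which matches the symptom described above.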
Anyway, my point is that HTML::Element has no control over where its output will eventually be fed, so this escaping should be an optional feature that can easily be disabled or replaced.
A separate filter that replaces unprintable characters could then be applied to the string returned by as_XML by the output handler (for example a Catalyst plugin that hooks on output, or a special PerlIO layer).
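As a rough idea of what such an output-side filter might look like, here is a sketch built on the core Encode module and its FB_XMLCREF fallback. The helper name encode_for_output and the sample string are hypothetical, and this is not something HTML::Element provides; the point is only that the code that knows the real output charset decides what gets escaped.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical output-side filter: encode to the charset the response really
# uses, and fall back to &#x...; references only for characters that charset
# cannot represent.
sub encode_for_output {
    my ($xml, $charset) = @_;
    return Encode::encode($charset, $xml, Encode::FB_XMLCREF);
}

my $xml = "<p>caf\x{e9} \x{263A}</p>";          # stand-in for a string returned by as_XML

my $utf8  = encode_for_output($xml, 'UTF-8');   # everything fits, nothing is escaped
my $ascii = encode_for_output($xml, 'ascii');   # both non-ASCII characters become references
```

With a UTF-8 response the characters pass through as plain UTF-8 bytes, while an ASCII-only consumer gets numeric character references instead.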
Ciao, and thanks!