Subject: Possible bug reading XML with escaped brackets reads them as CDATA
Date: Fri, 22 Nov 2013 13:42:29 +0900
To: bug-XML-Smart [...] rt.cpan.org
From: D <dilots [...] gmail.com>
Hello,
I recently tried to parse an xml file that has entries like:
<ANNOTATION_VALUE>&lt;unk&gt;</ANNOTATION_VALUE>
<TIER LINGUISTIC_TYPE_REF="default-lt" TIER_ID="&lt;Target&gt;"/>
Unfortunately, XML::Smart saw these and interpreted them as CDATA sections.
When I save the xml data, it turns into this:
<ANNOTATION_VALUE><![CDATA[<unk>]]></ANNOTATION_VALUE>
<TIER LINGUISTIC_TYPE_REF="default-lt">
<TIER_ID><![CDATA[<Target>]]></TIER_ID>
</TIER>
(Note that TIER_ID also goes from being an attribute to being a child
element; possibly another bug?)
I tried to stop this by calling set_cdata(0) on the affected nodes, but it
didn't change anything:
$xml->{TIER}{TIER_ID}->set_cdata(0);
$xml->{ANNOTATION_VALUE}->set_cdata(0);
In the end I switched to XML::LibXML, where everything worked out of the box.
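A minimal sketch of that round trip, assuming XML::LibXML is installed. The
DOC root element and the inline sample string are hypothetical, built only
from the two entries quoted above, and the variable names are illustrative:

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample input: a made-up DOC root wrapping the two entries
# from this report, with the angle brackets escaped as entities.
my $src = <<'XML';
<DOC>
  <TIER LINGUISTIC_TYPE_REF="default-lt" TIER_ID="&lt;Target&gt;"/>
  <ANNOTATION_VALUE>&lt;unk&gt;</ANNOTATION_VALUE>
</DOC>
XML

my $doc = XML::LibXML->load_xml(string => $src);

# After parsing, the values are the plain unescaped strings.
my ($tier)  = $doc->findnodes('//TIER');
my ($value) = $doc->findnodes('//ANNOTATION_VALUE');
my $tier_id = $tier->getAttribute('TIER_ID');
my $text    = $value->textContent;

# Serialise the document again and print it for inspection.
my $out = $doc->toString;
print $out;
```

With this approach TIER_ID should stay an attribute and the text should be
written back with entity escapes rather than wrapped in a CDATA section.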
Cheers,
David Cummings