
This queue is for tickets about the XML-Bare CPAN distribution.

Report information
The Basics
Id: 75220
Status: resolved
Priority: 0/
Queue: XML-Bare

People
Owner: cpan [...] codechild.com
Requestors: Support [...] RoxSoft.co.uk
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 0.47
Fixed in: 0.48



Subject: Entity Handling
The change in entity handling between 0.45 and 0.47 has introduced a backward compatibility error. The entity &amp; used to be returned unmodified; now it is converted to &. Consequently, a file containing &amp; no longer round-trips as it did in the 0.45 release.
Please could you give me a little detail.

Yes, the handling has changed between 0.45 and 0.47, but the old handling was broken, and the new handling should round trip correctly, i.e.:

0.45
  in: <data><value>&amp;</value></data>
  parse then xml dump
  out: <data><value><![CDATA[&amp;]]></value></data>

  which is equiv to <data><value>&amp;amp;</value></data>

0.47
  in: <data><value>&amp;</value></data>
  parse then xml dump
  out: <data><value><![CDATA[&]]></value></data>

  which is equivalent XML

A test program showing the issue you see would be useful.
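The equivalence claimed above can be checked without XML::Bare itself. What follows is only a minimal sketch in core Perl: `text_value` is a hypothetical helper, not part of the module, and it assumes the element body is either a single CDATA section or plain text using the five predefined XML escapes.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Bare): returns the character data a
# conforming XML parser would report for a simple element body. It handles
# only a single CDATA section or the five predefined XML escapes.
sub text_value {
    my ($body) = @_;

    # CDATA content is taken literally, with no entity expansion.
    return $1 if $body =~ /\A<!\[CDATA\[(.*?)\]\]>\z/s;

    my %predefined = ( amp => '&', lt => '<', gt => '>', quot => '"', apos => "'" );

    # Single pass: replaced text is not rescanned, so '&amp;amp;' yields '&amp;'.
    ( my $text = $body ) =~ s/&(amp|lt|gt|quot|apos);/$predefined{$1}/g;
    return $text;
}

my %bodies = (
    'input'       => '&amp;',
    '0.45 output' => '<![CDATA[&amp;]]>',
    '0.47 output' => '<![CDATA[&]]>',
);

# Compare the character data each body represents.
for my $label ( sort keys %bodies ) {
    printf "%-12s %-20s represents %s\n", $label, $bodies{$label}, text_value( $bodies{$label} );
}
```

Under these assumptions the 0.47 output represents the same character data as the input, while the 0.45 output represents the literal five characters `&amp;`, which is what `&amp;amp;` would also represent.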
See attached, but never mind, I've already coded around it.
Subject: data.xml
<config>
  <fields>
    <sep>&amp;nbsp;&amp;nbsp;&amp;nbsp;</sep>
  </fields>
</config>
Subject: xml_bare.t
use strict;
use warnings;

use File::Copy;
use Test::More;
use Text::Diff;
use XML::Bare;

plan tests => 1;

copy 'data.xml', 'data1.xml';

my $obj  = XML::Bare->new( file => 'data1.xml' );
my $root = $obj->parse();

$obj->save();

my $diff = diff 'data.xml', 'data1.xml';

ok( !$diff, 'Load and dump roundtrips' );

# Local Variables:
# mode: perl
# tab-width: 3
# End:
Subject: Re: [rt.cpan.org #75220] Entity Handling
Date: Fri, 24 Feb 2012 17:58:50 -0500
To: bug-XML-Bare [...] rt.cpan.org
From: David Helkowski <livxtrm [...] codechild.com>
Equivalent is not same. It should be the same. Make the new behavior an option please. It is not desirable to change the behavior.
On Fri Feb 24 17:58:49 2012, livxtrm@codechild.com wrote:
> Equivalent is not same. It should be the same. Make the new behavior
> an option please. It is not desirable to change the behavior.
It has *never* been the same for any XML input containing a character within [<>&;], as that triggers CDATA output in obj2xml.

In versions prior to 0.47 there was no translation on parsing, so the output was neither identical nor equivalent if there were any XML escapes in the input.

In 0.47 output is equivalent for XML escapes, but it does not handle any other entities, mainly because it gets too damn difficult to work out the correct behaviour (i.e. (X)HTML is different to straight XML).

Can certainly make it optional. Wondering if there also needs to be an option for full entity decode (probably by allowing a callout).
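To make the behaviour described in this reply concrete, here is a rough sketch in core Perl. Both functions are hypothetical stand-ins rather than XML::Bare code: `dump_value` only mimics the stated rule that a value containing any of `[<>&;]` is written as CDATA, and `decode_value` illustrates one possible shape for the optional decode-with-callout idea, with the callout signature being an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the dump rule described above; this is not
# XML::Bare's obj2xml, only an illustration of the stated behaviour.
sub dump_value {
    my ($value) = @_;
    return $value =~ /[<>&;]/ ? "<![CDATA[$value]]>" : $value;
}

# Hypothetical parse-side decoding. The five predefined XML escapes are
# translated; any other named entity goes to an optional callout (assumed
# signature: entity name in, replacement text out) or is left untouched.
sub decode_value {
    my ( $raw, $callout ) = @_;
    my %predefined = ( amp => '&', lt => '<', gt => '>', quot => '"', apos => "'" );
    ( my $text = $raw ) =~ s{&(\w+);}{
        exists $predefined{$1} ? $predefined{$1}
      : $callout               ? $callout->($1)
      :                          "&$1;"
    }ge;
    return $text;
}

my $raw = '&amp;nbsp;';

# No translation on parse, as described for versions before 0.47.
print dump_value($raw), "\n";

# XML escapes translated on parse, as described for 0.47.
print dump_value( decode_value($raw) ), "\n";

# Optional callout for entities outside the predefined five.
print decode_value( 'a&nbsp;b', sub { $_[0] eq 'nbsp' ? ' ' : "&$_[0];" } ), "\n";
```

Making the decode step opt-in, as requested in the ticket, would amount to calling something like `decode_value` only when the caller asks for it.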
Subject: Re: [rt.cpan.org #75220] Entity Handling
Date: Sat, 25 Feb 2012 17:45:15 -0500
To: bug-XML-Bare [...] rt.cpan.org
From: David Helkowski <livxtrm [...] codechild.com>
There has been a longstanding issue of entities being handled in a confusing way. I agree the new approach is better; the problem is that people have coded around the old behavior, and the new change will break their old workarounds.

Full entity decoding should be done in some sort of efficient fashion in C. You are free to hack it in some other way though, just make sure to make it optional and thoroughly document how it works.

It is the same if you had CDATA to start with. I don't believe ampersand by itself triggered CDATA. There is no need from my point of view. Note that the specs say otherwise, but the module has never followed specs, nor should it.

Do as you wish though, just don't be surprised if you change default behavior and people complain.
Entity handling has now been reverted to the way it worked in version 0.45. The changes made in 0.46 and 0.47 do not reflect the original intent or design of the module, and they also cause problems for people who do not expect them to occur. I have reverted maintenance of the module back to myself, and will be addressing all the outstanding bugs soon. My apologies for the lack of proper maintenance and the confusing changes made by the co-maintainer.