
This queue is for tickets about the XML-Twig CPAN distribution.

Report information
The Basics
Id: 18944
Status: resolved
Priority: 0/
Queue: XML-Twig

People
Owner: MIROD [...] cpan.org
Requestors: cmccutcheon [...] oneil.com
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 3.23
Fixed in: 3.24



Subject: XML::Twig::Elt->set_cdata() is broken.
The following code will demonstrate the problem:

#!/usr/bin/perl -w

use strict;
use XML::Twig;

my $xml = XML::Twig->new();
my $twig = $xml->safe_parse("<foo><bar><![CDATA[asdf]]></bar><baz>qwer</baz></foo>");
my $root = $twig->root;
print $root->sprint, "\n";

my $broken = 1;
if($broken)
{
    # These lines show the problem.
    my $elt = XML::Twig::Elt->new("qux");
    $elt->set_cdata("test this '<' & this '>'");
    $elt->paste('last_child', $root);
    print $root->sprint, "\n";
}
else
{
    # These lines show what was expected.
    my $elt = XML::Twig::Elt->new('#CDATA' => "test this '<' & this '>'")->wrap_in("qux");
    $elt->paste('last_child', $root);
    print $root->sprint, "\n";
}
Subject: Re: [rt.cpan.org #18944] XML::Twig::Elt->set_cdata() is broken.
Date: Thu, 27 Apr 2006 13:57:01 +0200
To: bug-XML-Twig [...] rt.cpan.org
From: Michel Rodriguez <mirod [...] xmltwig.com>
Guest via RT wrote:
> Wed Apr 26 18:50:12 2006: Request 18944 was acted upon.
> Transaction: Ticket created by guest
> Queue: XML-Twig
> Subject: XML::Twig::Elt->set_cdata() is broken.
> Owner: Nobody
> Requestors: cmccutcheon@oneil.com
> Status: new
> Ticket <URL: http://rt.cpan.org/Ticket/Display.html?id=18944 >
>
> The following code will demonstrate the problem:
>
> #!/usr/bin/perl -w
>
> use strict;
> use XML::Twig;
>
> my $xml = XML::Twig->new();
> my $twig = $xml->safe_parse("<foo><bar><![CDATA[asdf]]></bar><baz>qwer</baz></foo>");
> my $root = $twig->root;
> print $root->sprint, "\n";
>
> my $broken = 1;
> if($broken)
> {
>     # These lines show the problem.
>     my $elt = XML::Twig::Elt->new("qux");
>     $elt->set_cdata("test this '<' & this '>'");
>     $elt->paste('last_child', $root);
>     print $root->sprint, "\n";
> }
> else
> {
>     # These lines show what was expected.
>     my $elt = XML::Twig::Elt->new('#CDATA' => "test this '<' & this '>'")->wrap_in("qux");
>     $elt->paste('last_child', $root);
>     print $root->sprint, "\n";
> }
Hi,

This is normal behavior, you are creating an element with a name, not a text (#PCDATA or #CDATA) one, so the cdata you are setting is just not used. The bug would be that set_cdata should check that and die if you call it on a non-CDATA element.

OTOH what you do looks rather sensible, so in the grand tradition of XML::Twig dwimmery, if you install the development version that's on http://xmltwig.com/xmltwig/ your example will work just fine.

Note that I also added an option to allow you to create directly the element with an included CDATA section:

  $root->insert_new_elt( last_child => qux => { '#CDATA' => 1 }, "test this '<' & this '>'");

(the new thing is that you can use the '#CDATA' => 1 "fake" attribute, the insert_new_elt method is older).

Let me know if this works for you.

Thanks

--
Michel Rodriguez
Perl & XML
xmltwig.com
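For readers who want to try the '#CDATA' => 1 pseudo-attribute described above, here is a minimal, self-contained sketch. It assumes XML::Twig 3.24 or later is installed (the version this ticket lists as containing the fix); the sample document and variable names are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Parse a small sample document (illustrative only) to get a root
# element that the new element can be attached to.
my $twig = XML::Twig->new;
$twig->parse('<foo><baz>qwer</baz></foo>');
my $root = $twig->root;

# Create <qux> as the last child of the root. The '#CDATA' => 1
# pseudo-attribute asks for the content to be stored as a CDATA
# section instead of regular (#PCDATA) text.
my $qux = $root->insert_new_elt(
    last_child => qux => { '#CDATA' => 1 },
    "test this '<' & this '>'",
);

my $out = $root->sprint;
print "$out\n";
```

If the pseudo-attribute is honored, the printed document should contain a qux element whose content is wrapped in a CDATA section rather than escaped.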
Subject: RE: [rt.cpan.org #18944] AutoReply: XML::Twig::Elt->set_cdata() is broken.
Date: Thu, 27 Apr 2006 09:18:38 -0400
To: <bug-XML-Twig [...] rt.cpan.org>
From: "Calvin McCutcheon" <cmccutcheon [...] oneil.com>
Greetings,

Thank-you for the fix! Once you explained #CDATA is a full fledged element, I understood why XML::Twig was behaving the way it was. The misunderstanding was on my part; mainly because I was expecting ->set_cdata to abstract away the details of working with character text in XML (sort of how ->set_text does).

Maybe this could be an extension to how ->set_text works. That is, if ->set_text is called with only a string it behaves as it currently does. But if ->set_text is called with something like ->set_text('#CDATA' => $string), then it would create a #CDATA element instead of a #PCDATA element. On a slight tangent (and out of sheer curiosity), is there a historical reason for ->set_text to cut all the children out of an element and create a single #PCDATA element?

The reason this behavior was/is needed is because I'm creating an element much earlier in a process with no knowledge of what is going to get stuffed into it. Today, armed with my new-found knowledge of #CDATA being a real element, I also tried:

my $elt = XML::Twig::Elt->new("qux");
my $data = XML::Twig::Elt->new('#CDATA' => "test this '<' & this '>'");
$data->paste($elt);
$elt->paste('last_child', $root);
print $root->sprint, "\n";

which works; but the above code seems a little verbose just to declare a specific section of text in an element as being #CDATA. Thank-you for sending DWIM to the rescue; now #CDATA is almost as transparent to work with as #PCDATA :D .

**********************************************************************
Confidentiality Notice
The information contained in this e-mail is confidential and intended for use only by the person(s) or organization listed in the address. If you have received this communication in error, please contact the sender at O'Neil & Associates, Inc., immediately. Any copying, dissemination, or distribution of this communication, other than by the intended recipient, is strictly prohibited.
**********************************************************************
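The "create the element early, fill it later" workflow described in this message can be wrapped in a small helper so the verbose part lives in one place. This is only a sketch: set_cdata_content is a hypothetical name, not part of XML::Twig, and it uses only the new, cut_children and paste calls already discussed in this ticket.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper (NOT an XML::Twig method): replace whatever
# $elt currently contains with a single CDATA section holding $string.
sub set_cdata_content {
    my ($elt, $string) = @_;
    $elt->cut_children;    # drop any existing children
    XML::Twig::Elt->new('#CDATA' => $string)->paste(first_child => $elt);
    return $elt;
}

# The element is created early, before its content is known...
my $elt = XML::Twig::Elt->new('qux');

# ...and filled in later.
set_cdata_content($elt, "test this '<' & this '>'");

my $out = $elt->sprint;
print "$out\n";
```

Because the helper cuts the existing children first, calling it a second time replaces the earlier content instead of appending to it, which mirrors the way ->set_text is described in this thread.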
Subject: Re: [rt.cpan.org #18944] AutoReply: XML::Twig::Elt->set_cdata() is broken.
Date: Thu, 27 Apr 2006 15:49:17 +0200
To: bug-XML-Twig [...] rt.cpan.org
From: Michel Rodriguez <mirod [...] xmltwig.com>
cmccutcheon@oneil.com via RT wrote:
> Queue: XML-Twig
> Ticket <URL: http://rt.cpan.org/Ticket/Display.html?id=18944 >
>
> Greetings,
>
> Thank-you for the fix! Once you explained #CDATA is a full fledged
> element, I understood why XML::Twig was behaving the way it was. The
> misunderstanding was on my part; mainly because I was expecting
> ->set_cdata to abstract away the details of working with character text
> in XML (sort of how ->set_text does).

It does now.

> Maybe this could be an extension to how ->set_text works. That is, if
> ->set_text is called with only a string it behaves as it currently does.
> But if ->set_text is called with something like ->set_text('#CDATA' =>
> $string), then it would create a #CDATA element instead of a #PCDATA
> element. On a slight tangent (and out of sheer curiosity), is there a
> historical reason for ->set_text to cut all the children out of an
> element and create a single #PCDATA element?

With the change, set_cdata works in a way that's very similar to set_text, so that should be OK.

And #CDATA and #PCDATA are separate elements because of mixed content: in XHTML for example, a single p element like '<p>foo <b>bar</b> baz</p>' contains 3 children, 2 #PCDATA (text nodes in the DOM) and 1 b element.
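
The mixed-content point can be checked directly by parsing the p element from the paragraph above and listing the tag of each child. A minimal sketch, assuming XML::Twig is installed; the variable names are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Parse the mixed-content example used in the explanation above.
my $twig = XML::Twig->new;
$twig->parse('<p>foo <b>bar</b> baz</p>');

# Each child of <p> is an element object; text nodes are reported
# with the special tag name '#PCDATA'.
my @tags = map { $_->tag } $twig->root->children;

print join(', ', @tags), "\n";    # prints: #PCDATA, b, #PCDATA
```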

> The reason this behavior was/is needed is because I'm creating an
> element much earlier in a process with no knowledge of what is going to
> get stuffed into it. Today, armed with my new-found knowledge of #CDATA
> being a real element, I also tried:
>
> my $elt = XML::Twig::Elt->new("qux");
> my $data = XML::Twig::Elt->new('#CDATA' => "test this '<' & this '>'");
> $data->paste($elt);
> $elt->paste('last_child', $root);
> print $root->sprint, "\n";
>
> which works; but the above code seems a little verbose just to declare a
> specific section of text in an element as being #CDATA. Thank-you for
> sending DWIM to the rescue; now #CDATA is almost as transparent to work
> with as #PCDATA :D .

I have just one question: do you really need the CDATA section? When you print (or sprint, or flush) the twig, special characters will be automatically escaped, so you can write

  my $elt = XML::Twig::Elt->new(qux => "test this '<' & this '>'");

and when you print the element you will get valid XML:

  <qux>test this '&lt;' &amp; this '>'</qux>

A parser reading this will send nearly the same information to the application as with the CDATA version.

Hope that helps

--
Michel Rodriguez
Perl & XML
xmltwig.com
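
To compare the two forms side by side, the sketch below builds the same string once as regular text and once as a CDATA section, prints both serializations, then re-parses each and reads the content back with ->text. It assumes XML::Twig is installed; the variable names are illustrative and the comparison is a demonstration, not a statement about any particular application's needs.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

my $string = "test this '<' & this '>'";

# Same content, stored as regular text and as a CDATA section.
my $escaped = XML::Twig::Elt->new(qux => $string);
my $cdata   = XML::Twig::Elt->new('#CDATA' => $string)->wrap_in('qux');

my $escaped_xml = $escaped->sprint;   # special characters are escaped
my $cdata_xml   = $cdata->sprint;     # content is wrapped in <![CDATA[...]]>
print "$escaped_xml\n$cdata_xml\n";

# Re-parse each serialization and read the text back. A parser
# un-escapes the entities, so both should yield the original string.
my @round_trip = map { XML::Twig->new->parse($_)->root->text }
                 ($escaped_xml, $cdata_xml);

print( ($round_trip[0] eq $round_trip[1]) ? "same text\n" : "different text\n" );
```

The serialized bytes differ between the two forms, which matters if something other than an XML parser reads the file; the text seen through ->text is the part that stays the same.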
Subject: RE: [rt.cpan.org #18944] AutoReply: XML::Twig::Elt->set_cdata() is broken.
Date: Thu, 27 Apr 2006 10:20:45 -0400
To: <bug-XML-Twig [...] rt.cpan.org>
From: "Calvin McCutcheon" <cmccutcheon [...] oneil.com>
>> The reason this behavior was/is needed is because I'm creating an
>> element much earlier in a process with no knowledge of what is going to
>> get stuffed into it. Today, armed with my new-found knowledge of #CDATA
>> being a real element, I also tried:
>>
>> my $elt = XML::Twig::Elt->new("qux");
>> my $data = XML::Twig::Elt->new('#CDATA' => "test this '<' & this '>'");
>> $data->paste($elt);
>> $elt->paste('last_child', $root);
>> print $root->sprint, "\n";
>>
>> which works; but the above code seems a little verbose just to declare a
>> specific section of text in an element as being #CDATA. Thank-you for
>> sending DWIM to the rescue; now #CDATA is almost as transparent to work
>> with as #PCDATA :D .
>
> I have just one question: do you really need the CDATA section? When you
> print (or sprint, or flush) the twig, special characters will be
> automatically escaped, so you can write
> my $elt = XML::Twig::Elt->new(qux => "test this '<' & this '>'");
>
> and when you print the element you will get valid XML:
>
> <qux>test this '&lt;' &amp; this '>'</qux>
>
> A parser reading this will send nearly the same information to the
> application than with the CDATA version.

Actually, yes I do need the CDATA section because the real enclosed text (as opposed to the above test case) is valid [insert favorite programming language here] and I need XML::Twig to keep its hands off my data. Things would break horribly if I tried feeding automatically escaped content to [insert favorite compiler/interpreter for above language here] (the content never gets passed through an XML parser per se, only read out on a node by node basis with ->text, so the aforementioned compiler/interpreter would see the escaped content).