
This queue is for tickets about the XML-LibXML CPAN distribution.

Report information
The Basics
Id: 51442
Status: rejected
Priority: 0/
Queue: XML-LibXML

People
Owner: Nobody in particular
Requestors: jozef [...] kutej.net
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 1.70
Fixed in: (no value)



Subject: <VALUE/> vs <VALUE></VALUE>
Hi Peter,

$ perl -MXML::LibXML -le 'print XML::LibXML->new()->parse_string("<VALUE></VALUE>")->toString'

output:

<?xml version="1.0"?>
<VALUE/>

$ perl -MXML::LibXML -le 'print XML::LibXML->new()->parse_string("<VALUE/>")->toString'

output:

<?xml version="1.0"?>
<VALUE/>

When generating these are two different cases:

perl -MXML::LibXML -le '$d=XML::LibXML::Document->new("1.0", "UTF-8"); $e=$d->createElement("VALUE"); print $e->toString; $e->addChild($d->createTextNode("")); print $e->toString'

output:

<VALUE/>
<VALUE></VALUE>

Now I'm not sure if this is a bug, but I think the parser should make the difference between <VALUE/> and <VALUE></VALUE>. In the first case there should not be any child element, in the second case there should be one child text node with empty string.

Cheers,
Jozef
Hi Jozef,

this is not a bug. XML parsers are not supposed to distinguish between the two, see:

http://www.w3.org/TR/REC-xml/#sec-starttags

"[Definition: An element with no content is said to be empty.] The representation of an empty element is either a start-tag immediately followed by an end-tag, or an empty-element tag."

As for the serializer: an element with an empty text node is serialized with a pair of start and end tags rather than an empty-element tag, in conformance with this non-normative recommendation (same document):

"For interoperability, the empty-element tag should be used, and should only be used, for elements which are declared EMPTY."

AFAIK, the serializer in libxml2 does not look into the DTD at all, but the presence of a text node (although empty) on the element node suggests that the element is not declared as EMPTY, and therefore it should not be serialized using an empty-element tag. On the other hand, an element with no content at all looks like an EMPTY element, and the serializer therefore chooses to serialize it using an empty-element tag.

Best,
-- Petr

On Wed Nov 11 08:33:37 2009, JKUTEJ wrote:
> Hi Peter,
>
> $ perl -MXML::LibXML -le 'print
> XML::LibXML->new()->parse_string("<VALUE></VALUE>")->toString'
>
> output:
>
> <?xml version="1.0"?>
> <VALUE/>
>
> $ perl -MXML::LibXML -le 'print
> XML::LibXML->new()->parse_string("<VALUE/>")->toString'
>
> output:
>
> <?xml version="1.0"?>
> <VALUE/>
>
> When generating these are two different cases:
>
> perl -MXML::LibXML -le '$d=XML::LibXML::Document->new("1.0", "UTF-8");
> $e=$d->createElement("VALUE"); print $e->toString;
> $e->addChild($d->createTextNode("")); print $e->toString'
>
> output:
>
> <VALUE/>
> <VALUE></VALUE>
>
> Now I'm not sure if this is a bug, but I think the parser should make
> the difference between <VALUE/> and <VALUE></VALUE>. In the first case
> there should not be any child element, in the second case there should
> be one child text node with empty string.
>
> Cheers,
> Jozef
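
The behaviour discussed in this ticket can be checked with a short script. This is only a minimal sketch: the helper names root_child_count and build_value are hypothetical and do not come from the ticket, and the serialized forms are the ones the requestor reported for 1.70, so other versions may differ. It counts the child nodes of the parsed root element for both spellings, then serializes a generated element with and without an empty text node.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper (not from the ticket): parse $xml and return the
# number of child nodes attached to the root element.
sub root_child_count {
    my ($xml) = @_;
    my $root = XML::LibXML->new->parse_string($xml)->documentElement;
    my @children = $root->childNodes;
    return scalar @children;
}

# Hypothetical helper: build a VALUE element by hand, optionally attach an
# empty text node (using addChild, as in the ticket), and serialize it.
sub build_value {
    my ($with_empty_text) = @_;
    my $doc  = XML::LibXML::Document->new('1.0', 'UTF-8');
    my $elem = $doc->createElement('VALUE');
    $elem->addChild($doc->createTextNode('')) if $with_empty_text;
    return $elem->toString;
}

# Both spellings should yield the same child count after parsing.
print root_child_count('<VALUE></VALUE>'), "\n";
print root_child_count('<VALUE/>'), "\n";

# The serializer only differs when an empty text node is present.
print build_value(0), "\n";
print build_value(1), "\n";
```

If the two counts are equal, the parser is treating both forms as the same empty element, as the reply describes. The difference in the last two lines comes only from the serializer reacting to the presence of the hand-added empty text node.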