
This queue is for tickets about the XML-Twig CPAN distribution.

Report information
The Basics
Id: 129783
Status: new
Priority: 0/
Queue: XML-Twig

People
Owner: Nobody in particular
Requestors: chrispitude [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: (no value)
Fixed in: (no value)



Subject: implement support for space-preserved elements (xml:space="preserve")
I am processing DITA documents that contain various space-preserved elements, such as this preformatted block that contains a cube with a boldfaced asterisk inside it:

    <topic><body><pre>+---+
    | <b>*</b> |
    +---+</pre></body></topic>

When I pretty-print a DITA document with this element, XML::Twig has a good built-in heuristic that sees the leading text and suppresses the pretty-printing:

    <topic>
      <body>
        <pre>+---+
    | <b>*</b> |
    +---+</pre>
      </body>
    </topic>

However, this heuristic is broken if the content does not begin with text, such as if the first text line begins with a tag:

    <topic><body><pre><b>+---+</b>
    <i>| <b>*</b> |</i>
    <b>+---+</b></pre></body></topic>

In this case, the lines are pretty-printed as non-space-preserved XML:

    <topic>
      <body>
        <pre>
          <b>+---+</b>
          <i>| <b>*</b> |</i>
          <b>+---+</b>
        </pre>
      </body>
    </topic>

XML::Twig should provide a per-element setting that does the following:

* Prints the entire tag as space-preserved (no indenting or reformatting), whether its content begins with text or another element.
* Exempts the element from a document-wide trim (see #125515: Provide a way to exclude tags from a twig-wide trim()).

This request seems to be analogous to the official XML mechanism provided by xml:space="preserve", so there is precedent for the behavior.

A testcase is included.
Subject: space-preserve.pl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

my $xml = <<EOF;
<topic><body><p>Good cube (spaces are preserved):</p><pre>+---+
| <b>*</b> |
+---+</pre>
<p>Bad cube (should be space-preserved):</p><pre><b>+---+</b>
<i>| <b>*</b> |</i>
<b>+---+</b></pre></body></topic>
EOF

my $twig=XML::Twig->new();
$twig->parse($xml);
#$_->set_pretty_print('none') for $twig->root->children('pre');
$twig->print(pretty_print => 'indented');
I saw some checks in the Twig.pm code for whether the 'xml:space' attribute was set to 'preserve' and I got excited, so I set it on the preformatted blocks in the testcase:

    $_->set_att('xml:space', 'preserve') for $twig->descendants('pre');

but the output in the second example of the testcase is still broken. :(
It looks like setting the "keep_spaces_in" twig setting does what I need:

    my $twig=XML::Twig->new(keep_spaces_in => ['pre']);

However, in my actual script, the problem is that my <pre> blocks are created on-the-fly by processing, not parsed from a source file.

So, it looks like this issue becomes an enhancement request to be able to set the underlying "keep spaces in this thing" flag on a per-element basis.
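
For reference, here is a minimal sketch of the testcase with "keep_spaces_in" applied at parse time. It assumes that each cube spans three lines in the input. It only covers <pre> elements that come from parsed input, so it does not address elements created on the fly. Capturing the output in $out (a name chosen here for illustration) instead of printing directly is just for easy inspection.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Same input as space-preserve.pl (cube line breaks are assumed).
my $xml = <<'EOF';
<topic><body><p>Good cube (spaces are preserved):</p><pre>+---+
| <b>*</b> |
+---+</pre>
<p>Bad cube (should be space-preserved):</p><pre><b>+---+</b>
<i>| <b>*</b> |</i>
<b>+---+</b></pre></body></topic>
EOF

# Ask the parser to keep whitespace inside every <pre> element.
my $twig = XML::Twig->new(keep_spaces_in => ['pre']);
$twig->parse($xml);

# Pretty-print the document; <pre> content should come out untouched.
$twig->set_pretty_print('indented');
my $out = $twig->sprint;
print $out;
```

With this setting, both <pre> blocks should be emitted exactly as they appear in the input, while the surrounding <topic>, <body> and <p> elements are still indented.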