Subject: Bug in XML-Twig handling Control Characters?
Date: Mon, 26 Feb 2007 15:57:20 +0000
To: bug-XML-Twig [...] rt.cpan.org
From: Peter King <P.J.B.King [...] hw.ac.uk>
I think there may be a bug in the handling of control characters by
XML::Twig.
If the input file contains any control character below <space> other than
<tab>, <new line> or <return>, the input is truncated at that character.
Using the 'safe' output filter does not solve this, because the filter does
not translate the characters in question.
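To make the affected range concrete, here is a minimal sketch that finds the first character below <space> other than <tab>, <new line> and <return> in a string. The helper name first_control_char_offset is hypothetical and is not part of XML::Twig. As far as I know, these are also the characters that the XML 1.0 Char production excludes.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Twig): return the offset of the
# first character below <space> other than <tab>, <new line> and <return>,
# or -1 if the string contains none.
sub first_control_char_offset {
    my ($text) = @_;
    return $text =~ /[\x00-\x08\x0B\x0C\x0E-\x1F]/ ? $-[0] : -1;
}

# Same string as the new record added by the attached script.
my $t1_text = "ab Control-C :\cC: Now octal 222 :\222:";
my $offset  = first_control_char_offset($t1_text);

printf "first control character at offset %d\n", $offset if $offset >= 0;
```

A check like this could be run on the input before it is handed to the parser, to see where a truncation is likely to happen.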
I have attached a Perl script that demonstrates this, together with its
input, the output I get (the script writes the file output.xml), and the
version information from perl and uname.
I think this bug has existed for a long time - it was certainly present
in XML::Twig 3.23.
All in all an excellent piece of code for fiddling with XML though -
thanks for your hard work in developing it.
Peter King
<?xml version="1.0"?>
<file>
<text>
<t1>a </t1>
<t2> bbb </t2>
</text>
<text>
<t1>a with CTRL-A :: </t1>
<t2> bbb </t2>
</text>
</file>
<?xml version="1.0"?>
<file>
<text>
<t1>ab Control-C :: Now octal 222 :’:</t1>
<t2>Second text OK</t2>
</text>
<text>
<t1>a </t1>
<t2> bbb </t2>
</text>
<text>
<t1>a with CTRL-A :</t1>
</text>
</file>
Linux lxpjbk 2.6.18-1.2200.fc5smp #1 SMP Sat Oct 14 17:15:35 EDT 2006 i686 i686 i386 GNU/Linux
#!/usr/bin/perl -w
#
# project selection and updating script
# called by CGI
# takes parameters directory
# type (UG|PG|ASE)
# staff (true)
# debug (true)
use lib "/u1/staff/pjbk/perl_libs/XML-Twig-3.29/blib/lib";
# XML processing library
#use strict;
use Fcntl; # for file locking definitions
use XML::Twig;
$input_file = "input.xml";
$output_file = "output.xml";
# add a new record
$t1_text = "ab Control-C :\cC: Now octal 222 :\222:";
$t2_text = "Second text OK";
my $t1 = new XML::Twig::Elt( 't1',$t1_text);
my $t2 = new XML::Twig::Elt( 't2',$t2_text);
my $text= new XML::Twig::Elt( 'text' , ( $t1, $t2 ));
$twig = new XML::Twig( pretty_print => 'indented',
output_filter => 'safe');
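# safe_parsefile returns a false value on failure and leaves the parser
# error in $@; warn rather than die so that output.xml is still written
$twig -> safe_parsefile( $input_file)
or warn "Problems with input file: $input_file: $@\n";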
$root = $twig -> root;
$text -> paste( first_child => $root);
open(OUTPUTFILE, ">" . $output_file)
or die "Can't open OUTPUTFILE: $output_file: $!\n";
$twig->print(\*OUTPUTFILE);
close OUTPUTFILE;
exit;
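A minimal standalone sketch of how the truncation point could be inspected, assuming XML::Twig is installed. It uses safe_parse on an in-memory string rather than safe_parsefile on input.xml; the sample document and the variable names are made up for illustration. The XML::Twig documentation describes safe_parse as returning the twig on success and a false value on failure, with the error message left in $@.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical in-memory document containing a raw CTRL-A (\x01),
# standing in for the attached input.xml.
my $xml = "<file><text><t1>a with CTRL-A :\x01:</t1></text></file>";

my $twig = XML::Twig->new( pretty_print => 'indented' );

# safe_parse wraps the parse in an eval: a false return value means the
# parse stopped early, and $@ holds the parser's message.
my $ok    = $twig->safe_parse($xml);
my $error = $ok ? '' : $@;

if ($ok) {
    print "parsed without error\n";
}
else {
    print "parse failed: $error";
}
```

Checking the return value this way shows whether the parser itself reported a problem at the control character, which may help narrow down where the truncation comes from.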
This is perl, v5.8.8 built for i386-linux-thread-multi
Copyright 1987-2006, Larry Wall
Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.
Complete documentation for Perl, including FAQ lists, should be found on
this system using "man perl" or "perldoc perl". If you have access to the
Internet, point your browser at http://www.perl.org/, the Perl Home Page.