Subject: IgnoreWhitespaces won't work correctly (SOLVED)
Date: Thu, 19 Apr 2012 19:44:27 +0200
To: bug-XML-Mini [...] rt.cpan.org
From: Manfred Lotz <manfred.lotz [...] yahoo.de>
Hi there,
To show what goes wrong, here is a minimal example:
#! /usr/bin/perl
use strict;
use warnings;
use lib qw( /home/manfred/hub/lib );
use Data::Dumper;
use XML::Mini::Document;
my $XMLString = "<book> Learning Perl </book>";
my $xmlDoc = XML::Mini::Document->new();
$XML::Mini::IgnoreWhitespaces = 0;
# init the doc from an XML string
$xmlDoc->parse($XMLString);
my $xmlHash = $xmlDoc->toHash();
print Dumper($xmlHash);
The output of the script is:
$VAR1 = {
'book' => 'Learning Perl '
};
instead of
$VAR1 = {
'book' => ' Learning Perl '
};
The reason seems to be that Document.pm contains this match:
if ($XMLString =~
m/^\s*(<\s*([^\s>]+)([^>]+)\/\s*>| # <unary \/>
<\?\s*([^\s>]+)\s*([^>]*)\?>| # <? headers ?>
<!--(.+?)-->| # <!-- comments -->
<!\[CDATA\s*\[(.*?)\]\]\s*>\s*| # CDATA
<!DOCTYPE\s*([^\[>]*)(\[.*?\])?\s*>\s*| # DOCTYPE
<!ENTITY\s*([^"'>]+)\s*(["'])([^\11]+)\11\s*>\s*| # ENTITY
([^<]+))(.*)/xogsmi) # plain text
This means that the \s* right after the ^ anchor deletes leading white space
in all cases, even when the match is plain text.
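A minimal sketch of the effect, using a cut-down pattern that keeps only the
plain text alternative. The two helper subs and their names are made up for
illustration and are not code from Document.pm; the second one only shows what
happens when the anchored \s* is absent, and is not necessarily the change
that was applied to the module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: simplified stand-in for the parser's pattern,
# reduced to the plain text alternative. The \s* after ^ sits outside
# the capture group, so leading white space is consumed before capture.
sub leading_text_stripped {
    my ($s) = @_;
    return $s =~ m/^\s*([^<]+)(.*)/s ? $1 : undef;
}

# Hypothetical helper: same pattern without the anchored \s*, so the
# capture starts at the very first character of the input.
sub leading_text_kept {
    my ($s) = @_;
    return $s =~ m/^([^<]+)(.*)/s ? $1 : undef;
}

# What is left of the example input once the opening <book> tag is consumed.
my $content = " Learning Perl </book>";

printf "with    ^\\s*: '%s'\n", leading_text_stripped($content);
printf "without ^\\s*: '%s'\n", leading_text_kept($content);
```

With the anchored \s* the captured text is 'Learning Perl ', which is the
value in the Dumper output above; without it the leading blank survives.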
I changed it like this:
if ($XMLString =~
m/^\s*(<\s*([^\s>]+)([^>]+)\/\s*>| # <unary \/>
<\?\s*([^\s>]+)\s*([^>]*)\?>| # <? headers ?>
<!--(.+?)-->| # <!-- comments -->
<!\[CDATA\s*\[(.*?)\]\]\s*>\s*| # CDATA
<!DOCTYPE\s*([^\[>]*)(\[.*?\])?\s*>\s*| # DOCTYPE
<!ENTITY\s*([^"'>]+)\s*(["'])([^\11]+)\11\s*>\s*| # ENTITY
([^<]+))(.*)/xogsmi) # plain text
and now it is working fine.
--
Manfred