
This queue is for tickets about the XML-Mini CPAN distribution.

Report information
The Basics
Id: 76704
Status: new
Priority: 0/
Queue: XML-Mini

People
Owner: Nobody in particular
Requestors: manfred.lotz [...] yahoo.de
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: IgnoreWhitespaces won't work correctly (SOLVED)
Date: Thu, 19 Apr 2012 19:44:27 +0200
To: bug-XML-Mini [...] rt.cpan.org
From: Manfred Lotz <manfred.lotz [...] yahoo.de>
Hi there,

In order to show what goes wrong, here is a minimal example:

    #! /usr/bin/perl

    use strict;
    use warnings;

    use lib qw( /home/manfred/hub/lib );
    use Data::Dumper;
    use XML::Minix::Document;

    my $XMLString = "<book> Learning Perl </book>";

    my $xmlDoc = XML::Minix::Document->new();
    $XML::Minix::IgnoreWhitespaces = 0;

    # init the doc from an XML string
    $xmlDoc->parse($XMLString);

    my $xmlHash = $xmlDoc->toHash();
    print Dumper($xmlHash);

The output of the script is:

    $VAR1 = {
              'book' => 'Learning Perl '
            };

instead of

    $VAR1 = {
              'book' => ' Learning Perl '
            };

The reason seems to be that Document.pm contains:

    if ($XMLString =~ m/^\s*(<\s*([^\s>]+)([^>]+)\/\s*>|            # <unary \/>
            <\?\s*([^\s>]+)\s*([^>]*)\?>|                           # <? headers ?>
            <!--(.+?)-->|                                           # <!-- comments -->
            <!\[CDATA\s*\[(.*?)\]\]\s*>\s*|                         # CDATA
            <!DOCTYPE\s*([^\[>]*)(\[.*?\])?\s*>\s*|                 # DOCTYPE
            <!ENTITY\s*([^"'>]+)\s*(["'])([^\11]+)\11\s*>\s*|       # ENTITY
            ([^<]+))(.*)/xogsmi)                                    # plain text

Because the leading ^\s* sits outside the alternation, leading whitespace is deleted in all cases, even for plain text.

I changed it like this:

    if ($XMLString =~ m/^\s*(<\s*([^\s>]+)([^>]+)\/\s*>|            # <unary \/>
            <\?\s*([^\s>]+)\s*([^>]*)\?>|                           # <? headers ?>
            <!--(.+?)-->|                                           # <!-- comments -->
            <!\[CDATA\s*\[(.*?)\]\]\s*>\s*|                         # CDATA
            <!DOCTYPE\s*([^\[>]*)(\[.*?\])?\s*>\s*|                 # DOCTYPE
            <!ENTITY\s*([^"'>]+)\s*(["'])([^\11]+)\11\s*>\s*|       # ENTITY
            ([^<]+))(.*)/xogsmi)                                    # plain text

and now it is working fine.

--
Manfred
Subject: Re: [rt.cpan.org #76704] AutoReply: IgnoreWhitespaces won't work correctly (SOLVED)
Date: Thu, 19 Apr 2012 20:33:45 +0200
To: bug-XML-Mini [...] rt.cpan.org
From: Manfred Lotz <manfred.lotz [...] yahoo.de>
On Thu, 19 Apr 2012 13:44:41 -0400 "Bugs in XML-Mini via RT" <bug-XML-Mini@rt.cpan.org> wrote:
> [rt.cpan.org #76704]
Oops, sorry. I didn't actually insert the correction in my first message (I copied the original code instead). The correction is this:

    if ($XMLString =~ m/(^\s*<\s*([^\s>]+)([^>]+)\/\s*>|            # <unary \/>
            ^\s*<\?\s*([^\s>]+)\s*([^>]*)\?>|                       # <? headers ?>
            ^\s*<!--(.+?)-->|                                       # <!-- comments -->
            ^\s*<!\[CDATA\s*\[(.*?)\]\]\s*>\s*|                     # CDATA
            ^\s*<!DOCTYPE\s*([^\[>]*)(\[.*?\])?\s*>\s*|             # DOCTYPE
            ^\s*<!ENTITY\s*([^"'>]+)\s*(["'])([^\11]+)\11\s*>\s*|   # ENTITY
            ([^<]+))(.*)/xogsmi)                                    # plain text

Now leading whitespace is deleted in all cases except plain text.

--
Manfred
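
The difference between the two regex shapes can be reproduced in isolation. The following is only a minimal sketch, not the code from Document.pm: the pattern is reduced to a comment branch and the plain-text branch, and the names $shared_anchor, $per_branch_anchor and leading_text are hypothetical, introduced for illustration.

```perl
use strict;
use warnings;

# Simplified stand-ins for the tokenizer regex quoted in the ticket.
# Only a comment branch and the plain-text branch are kept (an assumed
# reduction for illustration; the real pattern has more branches).

# Original shape: ^\s* sits outside the alternation, so it also
# consumes whitespace in front of plain text.
my $shared_anchor = qr/^\s*(<!--(.+?)-->|([^<]+))(.*)/s;

# Patched shape: ^\s* is repeated on the markup branch only, so the
# plain-text branch sees the leading whitespace.
my $per_branch_anchor = qr/(^\s*<!--(.+?)-->|([^<]+))(.*)/s;

# Return what the plain-text branch (capture group 3) matched, or undef.
sub leading_text {
    my ($re, $input) = @_;
    return $input =~ $re ? $3 : undef;
}

# The remainder of the example document once "<book>" has been consumed.
my $input = " Learning Perl </book>";

printf "shared:     [%s]\n", leading_text($shared_anchor, $input);
printf "per-branch: [%s]\n", leading_text($per_branch_anchor, $input);
# shared:     [Learning Perl ]
# per-branch: [ Learning Perl ]
```

Note that in the patched shape the plain-text branch carries no ^ anchor of its own. Whether that matters depends on the surrounding parser code, which is not shown in this ticket.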