Subject: | Problem with XML::SAX::PurePerl - if a comment has its ending two dashes on one side, and the > on the other side, of a 4096 byte boundary, it's not recognized as comment end |
Date: | Tue, 30 Apr 2019 11:42:23 +0200 |
To: | bug-XML-SAX [...] rt.cpan.org |
From: | Guntram Blohm <gbl [...] guntram.de> |
Steps to reproduce:
Create an XML file by running this snippet and redirecting its output to a file:
-------------------------------------------------------------------------------------------------------
my $start="<tag><inner><!--";
my $big="x"x(4096 - length($start) - 2);
my $end="--></inner><second><!--yyyy--></second></tag>";
print $start, $big, $end;
-------------------------------------------------------------------------------------------------------
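The following is a small sanity check, not part of the original reproduction, that the generated document really places the closing "--" and the ">" in different 4096-byte blocks. It rebuilds the same string in memory; the variable names $block, $close_pos, $gt_pos, $dash_block and $gt_block are introduced here for illustration only.

```perl
use strict;
use warnings;

# Rebuild the test document exactly as the generator snippet does.
my $start = "<tag><inner><!--";
my $big   = "x" x (4096 - length($start) - 2);
my $end   = "--></inner><second><!--yyyy--></second></tag>";
my $doc   = $start . $big . $end;

my $block     = 4096;                  # boundary size named in this report
my $close_pos = index($doc, "-->");    # 0-based offset of the first "-->"
my $gt_pos    = $close_pos + 2;        # 0-based offset of its ">"

# Which 4096-byte block holds the second "-" and which holds the ">"?
my $dash_block = int(($close_pos + 1) / $block);
my $gt_block   = int($gt_pos / $block);

printf "first '-->' starts at offset %d\n", $close_pos;
printf "second '-' is in block %d, '>' is in block %d\n", $dash_block, $gt_block;
```

The two dashes end block 0 and the ">" is the first byte of block 1, which is the layout described in the subject line.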
Make sure no other parser is installed, i.e. XML::SAX::PurePerl is the
only parser in ParserDetails.ini
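One way to check which parsers are registered, sketched here as an assumption about the reader's setup rather than as part of the original report, is to ask XML::SAX for its list of known parsers. The helper name registered_sax_parsers is hypothetical; it returns an empty list if XML::SAX cannot be loaded. The XML::SAX::ParserFactory documentation also describes a $XML::SAX::ParserPackage variable for forcing a specific parser, which may be an alternative to editing ParserDetails.ini, but that route was not tested for this report.

```perl
use strict;
use warnings;

# Return the names of the parsers registered with XML::SAX, or an empty
# list if XML::SAX itself cannot be loaded.
sub registered_sax_parsers {
    return () unless eval { require XML::SAX; 1 };
    return map { $_->{Name} } @{ XML::SAX->parsers() };
}

my @names = registered_sax_parsers();
print "Registered SAX parsers: ", (@names ? join(", ", @names) : "(none)"), "\n";
```

For the reproduction to exercise the pure-Perl code path, the list should contain only XML::SAX::PurePerl.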
Then run this on the file:
-------------------------------------------------------------------------------------------------------
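#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple qw(:strict);
# Parse every file named on the command line.
foreach my $anafile (@ARGV) {
    print "--------- $anafile -----------\n";
    # On success XMLin returns a hash reference, printed as HASH(0x...).
    print XMLin($anafile, (ForceArray => [], KeyAttr => {}) );
}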
-------------------------------------------------------------------------------------------------------
Result:
End tag mismatch (second != inner) [Ln: 1, Col: 4132]
This is because the --> that ends the comment within the <inner> tag
isn't found, so the comment extends to the next comment end just before
</second>.
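To illustrate this class of failure, here is a minimal sketch that searches for "-->" in 4096-byte chunks. It is not the code of XML::SAX::PurePerl, and the assumption that the parser scans in independent 4096-byte chunks is inferred from the symptom only. The function names find_naive and find_with_carry are hypothetical.

```perl
use strict;
use warnings;

# Naive scan: look for $needle in each fixed-size chunk independently.
# A needle that straddles a chunk boundary is never seen.
sub find_naive {
    my ($data, $needle, $chunk_size) = @_;
    for (my $off = 0; $off < length $data; $off += $chunk_size) {
        my $pos = index(substr($data, $off, $chunk_size), $needle);
        return $off + $pos if $pos >= 0;
    }
    return -1;
}

# Boundary-safe scan: carry the last length($needle) - 1 bytes of the
# previous buffer over and prepend them to the next chunk.
sub find_with_carry {
    my ($data, $needle, $chunk_size) = @_;
    my $carry = "";
    my $keep  = length($needle) - 1;
    for (my $off = 0; $off < length $data; $off += $chunk_size) {
        my $buf = $carry . substr($data, $off, $chunk_size);
        my $pos = index($buf, $needle);
        return $off - length($carry) + $pos if $pos >= 0;
        $carry = length($buf) > $keep ? substr($buf, length($buf) - $keep) : $buf;
    }
    return -1;
}

# Same document as in the steps to reproduce.
my $doc = "<tag><inner><!--" . ("x" x 4078)
        . "--></inner><second><!--yyyy--></second></tag>";

my $naive = find_naive($doc, "-->", 4096);
my $carry = find_with_carry($doc, "-->", 4096);
printf "naive scan:      %d\n", $naive;   # 4121, the "-->" before </second>
printf "scan with carry: %d\n", $carry;   # 4094, the real end of the first comment
```

The naive scan skips the first "-->" and reports the one just before </second>, which matches the observed "End tag mismatch" error. Carrying the tail of the previous chunk is one common way to avoid this; whether it suits PurePerl's reader is for the maintainers to judge.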
Happens to me on openSUSE Leap 15.0 and on Ubuntu 18.04.2 LTS, with
PurePerl.pm saying $VERSION = '0.99'.