This queue is for tickets about the XML-SAX-PurePerl CPAN distribution.

Report information
The Basics
Id: 57822
Status: new
Priority: 0
Queue: XML-SAX-PurePerl

People
Owner: Nobody in particular
Requestors: triddle [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: (no value)
Fixed in: (no value)

Attachments
10-cvwiki-20091027-pages-articles.xml.bz2



Subject: XML::SAX::PurePerl is unable to handle a seemingly valid Unicode character
XML::SAX::PurePerl chokes on a UTF-8 character that every other parser I'm testing passes through without issue. Specifically, the following warning is generated only by XML::SAX::PurePerl and not by any other tested SAX driver or XML parser:

running test_cases//XML-SAX-PurePerl.t datastore/10-cvwiki-20091027-pages-articles.xml: utf8 "\xBF" does not map to Unicode at /opt/local/lib/perl5/site_perl/5.8.9/XML/SAX/PurePerl/Reader/Stream.pm line 37.

The error causes XML::SAX::PurePerl to produce invalid output, unlike the other parsers.

Steps to reproduce: attempt to parse the attached bzipped XML dump file from the Chuvash language Wikipedia.

Expected results: the "\xBF" byte is decoded properly, without a warning, and the parser generates correct output.
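For anyone triaging this, here is a minimal sketch of one way a stray "\xBF" can turn up in otherwise valid UTF-8 input. It is written in Python rather than Perl, is not taken from the XML::SAX::PurePerl source, and the cause it illustrates is only an assumption that this ticket does not confirm. The byte 0xBF is a UTF-8 continuation byte, so it is invalid on its own but valid as the second byte of a character such as the Cyrillic "п" (U+043F, bytes D0 BF). If a reader decodes fixed-size chunks independently and a chunk boundary happens to fall inside such a character, the second chunk starts with a lone 0xBF. The function names and the chunk split below are hypothetical.

```python
import codecs

# "п" (U+043F, Cyrillic small letter pe) is two bytes in UTF-8: D0 BF.
data = "п".encode("utf-8")

# Hypothetical buffer boundary that falls in the middle of the character.
chunks = [data[:1], data[1:]]


def stray_byte_is_invalid():
    """Return True if a lone 0xBF byte is rejected by a strict UTF-8 decoder."""
    try:
        b"\xbf".decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


def decode_chunks_naively(chunks):
    """Decode every chunk on its own, ignoring sequences split across chunks."""
    return [chunk.decode("utf-8", errors="replace") for chunk in chunks]


def decode_chunks_incrementally(chunks):
    """Decode with a stateful decoder that carries partial sequences over."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    parts = [decoder.decode(chunk) for chunk in chunks]
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)
```

In this sketch the naive approach yields replacement characters for both halves of the split character, while the incremental decoder recovers the original text. Whether Stream.pm actually behaves like the naive version would need to be checked against the code at the line named in the warning.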