
This queue is for tickets about the XML-Parser CPAN distribution.

Report information
The Basics
Id: 40712
Status: resolved
Priority: 0/
Queue: XML-Parser

People
Owner: Nobody in particular
Requestors: pthespis [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 2.36
Fixed in: 2.40



Subject: XML::Parser fails to parse element with the Euro and the Drachma signs
Hi,

I'm having a problem using XML::Parser to parse an ISO-8859-7 XML element that contains the Euro and Drachma signs.

Here's my configuration:

Linux Fedora Core 9
perl, v5.10.0
expat_2.0.1
XML::Parser 2.36

And here's a script that reproduces the problem:

--------------------------------------
#!/usr/bin/perl -w
use XML::Parser;

my $p1 = XML::Parser->new(ProtocolEncoding => 'ISO-8859-7');
my $ed = chr(0xA4) . chr(0xA5); # Euro and drachma signs
my $bg = chr(0xE2) . chr(0xE3); # Greek beta and gamma (0xE1 is alpha)

$p1->parse("<p>$ed</p>");   # This fails
# $p1->parse("<p>$bg</p>"); # This doesn't
--------------------------------------

What I get when running the script is:

not well-formed (invalid token) at line 1, column 3, byte 3 at /usr/lib/perl5/vendor_perl/5.10.0/i386-linux-thread-multi/XML/Parser.pm line 187

Any ideas why this happens?

If it's any help, according to Wikipedia <http://en.wikipedia.org/wiki/ISO_8859-7>, the updated 2003 version of ISO-8859-7 added three characters (euro sign, drachma sign, and Greek Ypogegrammeni) to the standard.

Best,
pthespis
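
For anyone stuck on an affected version, one possible workaround is to replace the three bytes added in the 2003 revision with numeric character references before handing the document to the parser. The sketch below is only an illustration, not necessarily how the issue was resolved in 2.40. The helper name escape_2003_additions is hypothetical, and the substitution is only safe when those bytes occur in character data or attribute values (not in CDATA sections, comments, or element names).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;

# Bytes added in ISO-8859-7:2003, mapped to numeric character references
# for their Unicode code points.
my %added_in_2003 = (
    "\xA4" => '&#x20AC;',    # EURO SIGN
    "\xA5" => '&#x20AF;',    # DRACHMA SIGN
    "\xAA" => '&#x037A;',    # GREEK YPOGEGRAMMENI
);

# Hypothetical helper (not part of XML::Parser): rewrites the three bytes
# so the parser never has to look them up in its ISO-8859-7 encoding map.
sub escape_2003_additions {
    my ($xml) = @_;
    $xml =~ s/([\xA4\xA5\xAA])/$added_in_2003{$1}/g;
    return $xml;
}

# Collect character data; the Char handler may be called more than once.
my $text   = '';
my $parser = XML::Parser->new(
    ProtocolEncoding => 'ISO-8859-7',
    Handlers         => {
        Char => sub { my ($expat, $chunk) = @_; $text .= $chunk },
    },
);

my $ed = chr(0xA4) . chr(0xA5);    # Euro and drachma signs, as in the report
$parser->parse(escape_2003_additions("<p>$ed</p>"));

binmode(STDOUT, ':encoding(UTF-8)');
print "$text\n";
```

All other bytes are left alone, so the rest of the document is still decoded through the ProtocolEncoding setting as before.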