Subject: utf-8 issue
Perl 5.8.4
Kernel: Linux 2.4.26
Distribution: Slackware 10.0
Hi,
I wrote a Perl script that reads a UTF-8 file and displays part of it (definitions) on a web page. The definitions are written in French and encoded in UTF-8. They do not display correctly: some special characters (like e acute) come out wrong. I have found a workaround: if I enclose the definition in <>, it works! I looked in the module to find the cause, but I guess I'm not enough of a Perl expert, because I have trouble figuring out how it works.
Here are some important lines of my Perl script:
#Get the definitions from the XML file
my $definition=$doc->getElementsByTagName('Definition')->item(0)->getFirstChild->getNodeValue;
#remove trailing and leading <>
$definition=~s/^<//;
$definition=~s/>$//;
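To make the workaround easier to reproduce, here is a minimal self-contained sketch of the stripping step. The helper name strip_angle_brackets and the sample string are made up for illustration and are not in my script, and the binmode line is only an assumption about how the output should be encoded.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script):
# remove one leading '<' and one trailing '>' from a string.
sub strip_angle_brackets {
    my ($text) = @_;
    $text =~ s/^<//;
    $text =~ s/>$//;
    return $text;
}

# Sample value standing in for what getNodeValue returns;
# \x{e9} is e acute.
my $definition = strip_angle_brackets("<d\x{e9}finition>");

# Assumption: the page expects UTF-8, so encode characters on output.
binmode(STDOUT, ':encoding(UTF-8)');
print "$definition\n";
```

Only the outermost pair of brackets is removed, so any < or > inside the definition itself is left alone.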
Here is the structure of the XML file:
<Definition>The definition enclosed in <></Definition>
Feel free to ask for more details.
Thanks!