CC: shlomif [...] iglu.org.il
Subject: UTF-8 (and other Unicode encodings?) BOM causes the package to fail
Date: Wed, 07 Jul 2010 09:42:45 +0300
To: bug-Config-IniFiles [...] rt.cpan.org
From: Meir Guttman <meir [...] guttman.co.il>
Dear folks,
The other day I discovered the hard way that the Config::IniFiles package
fails to process UTF-8 encoded INI files when the file also begins with
a BOM (Byte Order Mark) signature.
Attached are two INI files, one with a BOM and one without. Other than
that, the two are identical. As a hex view shows, the 3-byte
BOM at the very beginning of the BOM file is "EF BB BF".
Also attached is a small Perl script to demonstrate the result. (You will of
course have to edit it to switch between the BOM and the no-BOM versions.)
Its output when using the BOM INI file is:
Line 1 in file utf8_bom.ini is mal-formed:
∩╗┐[General]
2: parameter found outside a section
Please note the three "garbage" characters shown in my (Hebrew) cmd window.
As for a patch to correct this, I am afraid I am too much of a newbie to
offer one. But maybe all that is required is a "use encoding 'utf8';"
statement?
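Short of a patch to the package itself, one possible workaround on the
caller's side might be to strip the three signature bytes before the data
reaches the parser. The following is only a minimal sketch of that idea:
strip_utf8_bom is a hypothetical helper, not part of Config::IniFiles, and
the sample contents are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of Config::IniFiles): remove a leading
# UTF-8 BOM (bytes EF BB BF) from a string of raw bytes, if present.
sub strip_utf8_bom {
    my ($bytes) = @_;
    $bytes =~ s/\A\xEF\xBB\xBF//;
    return $bytes;
}

# Made-up stand-in for the raw contents of a BOM-prefixed INI file.
my $with_bom = "\xEF\xBB\xBF[General]\nkey=value\n";
my $clean    = strip_utf8_bom($with_bom);

# The first line is now a plain section header with no leading bytes.
print $clean;
```

The cleaned data would still have to be handed to the package afterwards,
for example by writing it to a temporary file; that step is left out here.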
Regards,
Meir