
This queue is for tickets about the Locale-PO CPAN distribution.

Report information
The Basics
Id: 54064
Status: resolved
Priority: 0/
Queue: Locale-PO

People
Owner: Nobody in particular
Requestors: astricker [...] futurelab.ch
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 0.21
Fixed in: (no value)



Subject: Support UTF-8 and other encodings of PO files
I noticed that the output of a tool using Locale::PO was double-encoded as UTF-8. After digging into the tool, I found the cause: Locale::PO does not decode the PO file when reading it. (My PO file is UTF-8 encoded.)

Simply adding binmode(IN, ":encoding(UTF-8)"); after the open call solved the problem for me, but this would break every case where the PO file is not UTF-8 encoded.

So I found a way to wait until the header entry (msgid "") has been read, extract the encoding from its Content-Type: text/plain; charset=... line, and then apply that encoding to the filehandle with a binmode() call. This works fine, and as I understand it, msgid "" is always the first entry. But there are two possible problems:

1. msgid "" is not the first entry.
2. The header's msgstr itself contains non-ASCII strings (e.g. the author's name), since the header has already been read undecoded by the time binmode() is called.

I'll try to enhance the module so that it restarts parsing the file once a different encoding is found, which would solve both problems. In any case, the attached patch improves the situation.
Subject: locale-po.diff
--- PO.pm 2010-01-28 16:18:15.000000000 +0100
+++ src/libs/gettext/lib/Locale/PO.pm 2010-01-28 16:18:59.000000000 +0100
@@ -336,6 +336,13 @@
         $po->msgid_plural( $buffer{msgid_plural} ) if defined $buffer{msgid_plural};
         $po->msgstr( $buffer{msgstr} ) if defined $buffer{msgstr};
         $po->msgstr_n( $buffer{msgstr_n} ) if defined $buffer{msgstr_n};
+
+        if ($po->msgid eq '""') {
+            my $header = $po->msgstr;
+            $header =~ /Content-Type:.*charset\s*=\s*([^\\\s]+)($|\\n)/ms;
+            my $encoding = $1 || "UTF-8";
+            binmode(IN, ":encoding($encoding)");
+        }
 
         # ashash
         if ($ashash) {
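
The "restart parsing" idea mentioned in the report could be sketched as a two-pass read: scan the raw bytes for the header's charset first, then reopen the file with the matching decoding layer before any entry is parsed. This is only a minimal sketch, not the fix that was applied to Locale::PO. The helper names detect_po_charset and open_po_decoded are hypothetical, and it assumes the file uses an ASCII-compatible encoding, so the header line can be matched before decoding, and that the first quoted line starting with "Content-Type: belongs to the header.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of Locale::PO): scan the raw bytes of a PO
# file for the header's charset declaration, e.g.
#   "Content-Type: text/plain; charset=ISO-8859-1\n"
# Returns the charset name, or the given default if none is usable.
sub detect_po_charset {
    my ($path, $default) = @_;
    $default = 'UTF-8' unless defined $default;

    open(my $in, '<:raw', $path) or die "Cannot open $path: $!";
    my $charset;
    while (my $line = <$in>) {
        # Inside the file the header line ends with a literal backslash-n
        # within the quotes, so stop at whitespace, a backslash or a quote.
        if ($line =~ /^"Content-Type:.*?charset\s*=\s*([^\\\s"]+)/i) {
            $charset = $1;
            last;
        }
    }
    close $in;

    # "CHARSET" is the unfilled placeholder found in xgettext templates.
    return $default if !defined $charset || $charset eq 'CHARSET';

    # Fall back to the default if Encode does not know the name.
    return defined(Encode::find_encoding($charset)) ? $charset : $default;
}

# Second pass: reopen the file with the detected decoding layer, so every
# entry (including the header itself) is decoded consistently.
sub open_po_decoded {
    my ($path) = @_;
    my $charset = detect_po_charset($path);
    open(my $in, "<:encoding($charset)", $path)
        or die "Cannot open $path: $!";
    return ($in, $charset);
}
```

Because the charset is known before parsing starts, the position of the msgid "" entry no longer matters and the header's own msgstr is decoded like every other entry, which covers both problems listed in the report. The cost is that the beginning of the file is read twice.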
From: astricker [...] futurelab.ch
Thanks for fixing this.