This queue is for tickets about the Rose-HTML-Objects CPAN distribution.

Report information
The Basics
Id: 29131
Status: resolved
Priority: 0/
Queue: Rose-HTML-Objects

People
Owner: Nobody in particular
Requestors: cpan [...] funkreich.de
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 0.549
Fixed in: 0.606



Subject: Field type classes must "use utf8"
The various field type classes (Form::Field::Email, Form::Field::DateTime etc) must "use utf8" due to utf-8 encoded translation texts embedded in the __DATA__ block. Without the utf8 pragma those texts are not displayed correctly because perl doesn't recognize them as being utf-8 encoded.
From: JSIRACUSA [...] cpan.org
On Mon Sep 03 09:46:26 2007, TKREMER wrote:
> The various field type classes (Form::Field::Email,
> Form::Field::DateTime etc) must "use utf8" due to utf-8 encoded
> translation texts embedded in the __DATA__ block. Without the utf8
> pragma those texts are not displayed correctly because perl doesn't
> recognize them as being utf-8 encoded.
What version of perl are you using? In 5.8.8, the docs say:

"'use utf8' still needed to enable UTF-8/UTF-EBCDIC in scripts

As a compatibility measure, the 'use utf8' pragma must be explicitly included to enable recognition of UTF-8 in the Perl scripts themselves (in string or regular expression literals, or in identifier names) on ASCII-based machines or to recognize UTF-EBCDIC on EBCDIC-based machines. These are the only times when an explicit 'use utf8' is needed."

The UTF-8 text in those files is not used "in string or regular expression literals, or in identifier names", and appears to work okay for me in perl 5.8.8. Are you using an earlier version?

(Also keep in mind that, while I welcome bug reports, localization is not yet a public feature.)
Subject: Re: [rt.cpan.org #29131] Field type classes must "use utf8"
Date: Tue, 04 Sep 2007 10:36:29 +0200
To: bug-Rose-HTML-Objects [...] rt.cpan.org
From: Tobias Kremer <tobias [...] funkreich.de>
Quoting via RT <bug-Rose-HTML-Objects@rt.cpan.org>:
> What version of perl are you using? In 5.8.8, the docs say:
>
> "'use utf8' still needed to enable UTF-8/UTF-EBCDIC in scripts As a
> compatibility measure, the 'use utf8' pragma must be explicitly included
> to enable recognition of UTF-8 in the Perl scripts themselves (in string
> or regular expression literals, or in identifier names) on ASCII-based
> machines or to recognize UTF-EBCDIC on EBCDIC-based machines. These
> are the only times when an explicit 'use utf8' is needed."
>
> The UTF-8 text in those files is not used "in string or regular
> expression literals, or in identifier names", and appears to work okay
> for me in perl 5.8.8. Are you using an earlier version?
Strange. I'm using 5.8.8 and adding a "use utf8" at the top of the files fixes the issue for me. --Tobias
I have the same issue.

On Mon 03 Sep 2007, 21:15:46, JSIRACUSA wrote:
> On Mon Sep 03 09:46:26 2007, TKREMER wrote:
> What version of perl are you using? In 5.8.8, the docs say:
>
> "'use utf8' still needed to enable UTF-8/UTF-EBCDIC in scripts As a
> compatibility measure, the 'use utf8' pragma must be explicitly included
> to enable recognition of UTF-8 in the Perl scripts themselves (in string
> or regular expression literals, or in identifier names) on ASCII-based
> machines or to recognize UTF-EBCDIC on EBCDIC-based machines. These
> are the only times when an explicit 'use utf8' is needed."
The Rose localizer reads its text strings from below the __DATA__ marker in the Perl source files. Those files are saved as UTF-8, and therefore you _must_ 'use utf8'. Let me give you an example:

    #!/usr/bin/perl
    $text = "blödsinn";
    print length $text, "\n";

Save this as a UTF-8 encoded file and the output is: 9 (!)

    #!/usr/bin/perl
    use utf8;
    $text = "blödsinn";
    print length $text, "\n";

Output: 8

The difference is that $text is now tagged as utf8, so length counts characters instead of bytes ("ö" takes two bytes in UTF-8). This issue is confusing because, if you print $text, the first example may show "correct" output on machines with a wrong Unicode setup ...
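The byte-versus-character difference described above can also be reproduced without depending on how the script file itself is saved. The following is only a sketch: it spells out the UTF-8 bytes of "blödsinn" with hex escapes and uses the core Encode module to turn them into a character string. The variable names are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The UTF-8 bytes of "blödsinn", written with escapes so the result does
# not depend on the encoding this file is saved in.
my $bytes = "bl\xC3\xB6dsinn";

# decode() turns the byte string into a character string.
my $chars = decode('UTF-8', $bytes);

# length() counts bytes for the first string and characters for the second.
printf "bytes: %d, characters: %d\n", length($bytes), length($chars);
```

The first number corresponds to the example without the pragma, the second to the example with "use utf8".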
> (Also keep in mind that, while I welcome bug reports, localization is
> not yet a public feature.)

That's no reason not to fix it :P
On Fri Nov 06 08:17:45 2009, chr wrote:
>> (Also keep in mind that, while I welcome bug reports, localization is not yet
>> a public feature.)
>
> That's no reason not to fix it :P
No, but it's a reason for the person who originally reported this bug to expect that it might not yet work correctly :)

Anyway, localization now *is* a documented, public feature, and despite the specific-sounding text in the utf8 module documentation ("in string or regular expression literals, or in identifier names"), apparently this module must also be loaded in order to properly handle UTF-8 text stored in the __DATA__ section of a module.

The change has been checked into SVN. Please check it out and let me know if it works correctly for you. If so, I will cut a release and close this bug.
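For readers wondering why the pragma matters for __DATA__: text read from a filehandle arrives as raw bytes unless the handle has a decoding layer. As an illustration only (this is not the change that went into SVN), the sketch below sets such a layer explicitly with binmode. The helper name read_utf8_lines is hypothetical, and an in-memory filehandle stands in for the DATA handle so the snippet is self-contained.

```perl
use strict;
use warnings;

# Hypothetical helper: read every line from a handle as UTF-8 text.
# In a real module one might pass \*DATA here instead.
sub read_utf8_lines {
    my ($fh) = @_;
    binmode($fh, ':encoding(UTF-8)') or die "binmode failed: $!";
    my @lines = <$fh>;
    chomp @lines;
    return @lines;
}

# Stand-in for a __DATA__ section: the UTF-8 bytes of "blödsinn" plus newline.
my $raw = "bl\xC3\xB6dsinn\n";
open(my $fh, '<', \$raw) or die "open failed: $!";
my @lines = read_utf8_lines($fh);
close($fh);

# With the layer in place the line is a character string.
print "characters in first line: ", length($lines[0]), "\n";
```

Whether to rely on "use utf8" or on an explicit layer is a design choice; the pragma is the shorter fix when the whole source file is UTF-8 anyway.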
On Fri 06 Nov 2009, 10:00:47, JSIRACUSA wrote:
> On Fri Nov 06 08:17:45 2009, chr wrote:
> > That's no reason not to fix it :P
>
> No, but it's a reason for the person who originally reported this bug to
> expect that it might not yet work correctly :)
>
> Anyway, localization now *is* a documented, public feature, and despite
> the specific-sounding text in the utf8 module documentation ("in string
> or regular expression literals, or in identifier names"), apparently
> this module must also be loaded in order to properly handle UTF-8 text
> stored in the __DATA__ section of a module.
>
> The change has been checked into SVN. Please check it out and let me
> know if it works correctly for you. If so, I will cut a release and
> close this bug.
I just tested: it works! :) Thanks for patching!

PS: rule of thumb: use utf8 if you assign UTF-8 strings from a UTF-8 encoded Perl source file. There is no need for it if your module just receives $text and processes it, provided $text is properly tagged as utf8.
So this bug is fixed in 0.606. Some final notes.

If this release now breaks your application, make sure everything is set up consistently for Unicode (or consistently not). Here is a little checklist:

    Catalyst: use Catalyst qw/Unicode/;
    DBI: pg_enable_utf8 => 1 (see DBD::<driver> for documentation)
    Template Toolkit: config(ENCODING => 'utf-8')

Finally, let's have another look at the code example:

    #!/usr/bin/perl
    $text = "blödsinn";
    print length $text, ": $text\n";

Save this as a UTF-8 encoded file and the output is:

    9: blödsinn

    #!/usr/bin/perl
    use utf8;
    $text = "blödsinn";
    print length $text, ": $text\n";

Output:

    8: bl?dsinn

WTF, you ask? The answer is: perl's STDIN, STDOUT and STDERR are not UTF-8 by default. To enable this, use the -C option or set PERL_UNICODE=SDL (see perlrun). With that, the output of the examples is finally correct:

    9: blödsinn
    8: blödsinn

So much pain for that "nonsense" :P

cu

PS: open should do the right thing when reading text files ... according to the documentation ...
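Besides -C and PERL_UNICODE, the output encoding can also be set from inside the script. The sketch below is one possible way and is not taken from the ticket: it puts an :encoding(UTF-8) layer on STDOUT with binmode, and uses Encode::encode only to show which bytes that layer ends up writing. The character string is written with an escape so the snippet does not depend on the file's own encoding.

```perl
use strict;
use warnings;
use Encode qw(encode);

# A character string: 8 characters, the third one is U+00F6 (o-umlaut).
my $text = "bl\x{F6}dsinn";

# Tell perl to encode everything printed to STDOUT as UTF-8.
binmode(STDOUT, ':encoding(UTF-8)') or die "binmode failed: $!";
print length($text), ": $text\n";

# For illustration: the bytes the layer writes for $text.
my $encoded = encode('UTF-8', $text);
printf "%d characters become %d bytes on output\n",
    length($text), length($encoded);
```

This keeps the string as characters inside the program and encodes only at the output boundary, which is the same idea as the checklist above applied to STDOUT.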