Bug #46832 for Encode: Encode::MIME::Header and Russian

Wed Jun 10 14:55:19 2009 dimedrol [...] sviblovo.ru - Ticket created

Subject: Encode::MIME::Header and Russian

It seems that encode("MIME-Header", ...) works incorrectly with Russian text.

I sent a test message with the Novell Evolution mail client. The subject of the letter contained the Russian word 'тест' (test). Evolution converted it to this:

    Subject: =?UTF-8?Q?=D1=82=D0=B5=D1=81=D1=82?=

Then I used a custom routine (sub rfc2047conv, source attached) and got the same good result:

    =?UTF-8?Q?=D1=82=D0=B5=D1=81=D1=82?=

But if I use Encode::MIME::Header

    encode("MIME-Header", 'тест')

the result looks different, and the Subject header is displayed incorrectly in the Evolution mail client. I also tried the 'MIME-B' and 'MIME-Q' options, but without success:

    orig_str=тест
    encode with 'MIME-Header' =?UTF-8?B?w5HCgsOQwrXDkcKBw5HCgg==?=
    encode with 'MIME-B' =?UTF-8?B?w5HCgsOQwrXDkcKBw5HCgg==?=
    encode with 'MIME-Q' =?UTF-?Q?=C3=91=C2=82=C3=90=C2=B5=C3=91=C2=81=C3=91=C2=82?=
    encode with 'rfc2047conv' =?UTF-8?Q?=D1=82=D0=B5=D1=81=D1=82?=

Does anybody know where the error is?

My OS is:

    # cat /etc/fedora-release
    Fedora release 9 (Sulphur)

    # rpm -qf /usr/lib/perl5/5.10.0/i386-linux-thread-multi/Encode/MIME/Header.pm
    perl-5.10.0-40.fc9.i386

    # grep -i version /usr/lib/perl5/5.10.0/i386-linux-thread-multi/Encode/MIME/Header.pm
    our $VERSION = do { my @r = ( q$Revision: 2.5 $ =~ /\d+/g ); sprintf "%d." . "%02d" x $#r, @r };

    # locale
    LANG=en_US.UTF-8
    LC_CTYPE="en_US.UTF-8"
    LC_NUMERIC="en_US.UTF-8"
    LC_TIME="en_US.UTF-8"
    LC_COLLATE="en_US.UTF-8"
    LC_MONETARY="en_US.UTF-8"
    LC_MESSAGES="en_US.UTF-8"
    LC_PAPER="en_US.UTF-8"
    LC_NAME="en_US.UTF-8"
    LC_ADDRESS="en_US.UTF-8"
    LC_TELEPHONE="en_US.UTF-8"
    LC_MEASUREMENT="en_US.UTF-8"
    LC_IDENTIFICATION="en_US.UTF-8"
    LC_ALL=

I use this script for tests:

    #!/usr/bin/perl -w
    use Encode qw(encode decode);

    my $orig_str = 'тест';
    print "orig_str=$orig_str\n";

    my $res = encode("MIME-Header", $orig_str);
    print "encode with MIME-Header $res \n";

    $res = encode("MIME-B", $orig_str);
    print "encode with MIME-B $res \n";

    $res = encode("MIME-Q", $orig_str);
    print "encode with MIME-Q $res \n";

    $res = rfc2047conv($orig_str, 'UTF-8');
    print "encode with rfc2047conv $res \n";

    # rfc2047conv (string, charset, prefix size);
    # Q-encodes the bytes of the string and folds the result into
    # several encoded-words once a line would exceed 72 characters.
    sub rfc2047conv {
        my $str      = shift;
        my $charset  = uc(shift);
        my $init_len = shift || 0;

        my $len = length($str);
        return '' unless ($len);

        my $begin = "=?$charset?Q?";
        my $res   = $begin;
        my $count = $init_len + length($begin);

        foreach my $c (split(//, $str)) {
            my ($repl, $repl_len);
            if ($c eq '?' || $c eq '_' || $c eq '=' || $c lt ' ' || $c gt '~') {
                # %02X keeps two hex digits even for bytes below 0x10
                $repl     = sprintf("=%02X", ord($c));
                $repl_len = 3;
            }
            elsif ($c eq ' ') {
                $repl     = '_';
                $repl_len = 1;
            }
            else {
                $repl     = $c;
                $repl_len = 1;
            }

            # start a new encoded-word on a continuation line
            if ($count + $repl_len > 72) {
                $res .= "?=\r\n " . $begin;
                $count = 1 + length($begin);
            }

            $res   .= $repl;
            $count += $repl_len;
        }

        $res .= '?=';
        return $res;
    }

Wed Jul 08 09:22:38 2009 DANKOGAI [...] cpan.org - Correspondence added

On Wed Jun 10 14:55:19 2009, dimedrol wrote:

> It seems that encode("MIME-Header", ...) works incorrectly with
> Russian text.
> I sent a test message with the Novell Evolution mail client.
> The subject of the letter contained the Russian word 'тест' (test).
> Evolution converted it to this:
> Subject: =?UTF-8?Q?=D1=82=D0=B5=D1=81=D1=82?=
> Then I used a custom routine (sub rfc2047conv, source attached) and got
> the same good result:
> =?UTF-8?Q?=D1=82=D0=B5=D1=81=D1=82?=
>
> But if I use Encode::MIME::Header
> encode("MIME-Header", 'тест')
> the result looks different, and the Subject header is displayed
> incorrectly in the Evolution mail client.
> I also tried the 'MIME-B' and 'MIME-Q' options, but without success:
>
> orig_str=тест
> encode with 'MIME-Header' =?UTF-8?B?w5HCgsOQwrXDkcKBw5HCgg==?=
> encode with 'MIME-B' =?UTF-8?B?w5HCgsOQwrXDkcKBw5HCgg==?=
> encode with 'MIME-Q' =?UTF-?Q?=C3=91=C2=82=C3=90=C2=B5=C3=91=C2=81=C3=91=C2=82?=
> encode with 'rfc2047conv' =?UTF-8?Q?=D1=82=D0=B5=D1=81=D1=82?=
>
> Does anybody know where the error is?

Because you are treating 'тест' as bytes. Try the snippet below. Also read perldoc perluniintro.

Dan the Maintainer Thereof

    #!/usr/local/bin/perl
    use strict;
    use warnings;
    use Encode;
    print encode("MIME-Header", 'тест'), "\n";
    print encode("MIME-Header", decode_utf8('тест')), "\n";
    __END__
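The difference between the two calls can be made visible with a small sketch. It is an illustration only, not part of the original reply. The variable names ($bytes, $chars, $wrong, $right) are made up for this example, and the UTF-8 bytes of 'тест' are written out as hex escapes so the result does not depend on how the source file is saved.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode decode);

# The UTF-8 byte sequence of 'тест', spelled out explicitly.
# This is what a plain string literal holds when "use utf8" is absent.
my $bytes = "\xD1\x82\xD0\xB5\xD1\x81\xD1\x82";

# Turn the bytes into a Perl character string (four Cyrillic letters).
my $chars = decode('UTF-8', $bytes);

printf "bytes: length %d\n", length $bytes;    # 8 bytes
printf "chars: length %d\n", length $chars;    # 4 characters

# Passing the byte string makes encode() treat each byte as a separate
# Latin-1 character, so the payload ends up UTF-8 encoded a second time.
my $wrong = encode('MIME-Header', $bytes);

# Passing the character string encodes each letter exactly once.
my $right = encode('MIME-Header', $chars);

print "from bytes: $wrong\n";
print "from chars: $right\n";

# Only the header built from characters decodes back to the original text.
print "round trip ok\n" if decode('MIME-Header', $right) eq $chars;
```

As an alternative to calling decode, putting "use utf8;" at the top of a source file saved as UTF-8 makes string literals such as 'тест' character strings to begin with.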

Wed Jul 08 09:22:38 2009 The RT System itself - Status changed from 'new' to 'open'

Wed Jul 08 09:22:39 2009 DANKOGAI [...] cpan.org - Status changed from 'open' to 'resolved'