
This queue is for tickets about the Encode CPAN distribution.

Report information
The Basics
Id: 52103
Status: resolved
Priority: 0/
Queue: Encode

People
Owner: Nobody in particular
Requestors: NWELLNHOF [...] cpan.org
Cc: JMEHNLE [...] cpan.org
AdminCc:

Bug Information
Severity: Important
Broken in:
  • 2.37
  • 2.39
Fixed in: (no value)



Subject: Encode::MIME::Header encoded words not separated by white space
RFC 2047 states, in section 5.(1):

    "Ordinary ASCII text and 'encoded-word's may appear together in the same header field. However, an 'encoded-word' that appears in a header field defined as '*text' MUST be separated from any adjacent 'encoded-word' or 'text' by 'linear-white-space'."

Similarly in subsections (2) and (3). But I get the following result:

    $ perl -MEncode -Mutf8 -e 'print encode("MIME-Q", "found art, objet trouvé"), "\n"'
    found art,=?UTF-8?Q?=20objet=20trouv=C3=A9?=

The decoding is also wrong. The following example shouldn't decode anything.

    $ perl -MEncode -e 'binmode(STDOUT, ":utf8"); print decode("MIME-Q", "found art,=?UTF-8?Q?=20objet=20trouv=C3=A9?="), "\n"'
    found art, objet trouvé

See section 6.1.(1) in the RFC.
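For anyone who wants to check generated headers against the rule quoted above, here is a minimal sketch. The helper name unseparated_encoded_words and the simplified encoded-word pattern are assumptions made for illustration; they are not part of Encode. The check only covers the '*text' case from section 5.(1), so it would also flag encoded-words delimited by parentheses inside comments, which subsection (2) permits.

```perl
use strict;
use warnings;

# Simplified pattern for an RFC 2047 encoded-word: =?charset?encoding?text?=
# (assumption: it ignores the 75-character limit and the exact token grammar).
my $encoded_word = qr/=\?[^?\s]+\?[BbQq]\?[^?\s]*\?=/;

# Hypothetical helper: returns every encoded-word in $header that touches
# a non-whitespace character on its left or right side.
sub unseparated_encoded_words {
    my ($header) = @_;
    my @bad;
    while ( $header =~ /($encoded_word)/g ) {
        # Copy the capture and offsets first; later matches reset them.
        my ( $word, $start, $end ) = ( $1, $-[1], $+[1] );
        my $before = $start > 0               ? substr( $header, $start - 1, 1 ) : '';
        my $after  = $end < length($header)   ? substr( $header, $end, 1 )       : '';
        push @bad, $word if $before =~ /\S/ or $after =~ /\S/;
    }
    return @bad;
}

# The encoder output reported above: the encoded-word touches the comma.
print "$_\n" for unseparated_encoded_words('found art,=?UTF-8?Q?=20objet=20trouv=C3=A9?=');
```

A header in which every encoded-word is surrounded by white space (or by the start or end of the field body) yields an empty list.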
This bug has bitten me several times, recently. Here's another demonstrative example:

    use strict;
    use JSON 2;
    use Encode;

    my $json = q[{"subject":"Les Cr\u00e9ations de La Plata VOUS invite \u00e0 l'exposition de design Puro Dise\u00f1o"}];

    my $struct = JSON->new->decode($json);
    print Encode::encode('utf-8', $struct->{subject}), "\n";

    my $mime = Encode::encode('MIME-Header', $struct->{subject});
    print "$mime\n";

It ends up with this substring in it: "==?=' "

Surely there should never be a case where ?= is followed by nonspace!

Here is another:

    use utf8;
    use strict;
    use Encode;

    my $string = <<'END';
    xxxxxxxxxx xxxxxxxxxxx xx xxxxxxxxx xxxxxx xx xxx xxxx xx xxx xxxxxxxxx x.x. xxxxxxx xxxxx xxxxxxxx
    END

    chomp $string;

    my $encoded = encode('MIME-Header', $string);
    my $decoded = decode('MIME-Header', $encoded);

    use Data::Dumper;
    print Dumper([ $encoded, $decoded, ]);

The word "x.x." is broken between "x." and "x." Encode itself will properly round trip this, but other readers will see that the "encoded" form actually contains no escaped elements and present it verbatim -- breaking the content by leaving that spurious space in place!

-- rjbs
From: henrik.pauli [...] gmail.com
On Fri 2010. May 14 15:06:20, RJBS wrote:
> my $json = q[{"subject":"Les Cr\u00e9ations de La Plata VOUS invite
> \u00e0 l'exposition de
> design Puro Dise\u00f1o"}];
We’ve had a patch for this in UHU-Linux in our Perl package for a while, I think. I’ve just ported it to 5.12.2 and I hope it still does what it should. I took your string as an example and here’s the result:

    $ perl -MEncode -Mutf8 -e 'print encode("MIME-Q", "Les Cr\x{00e9}ations de La Plata VOUS invite \x{00e0} l'"'"'exposition de design Puro Dise\x{00f1}o"), "\n"'
    =?UTF-8?Q?Les=20Cr=C3=A9ations=20de=20La=20P?=
    =?UTF-8?Q?lata=20VOUS=20invite=20=C3=A0=20l?='
    =?UTF-8?Q?exposition=20de=20design=20?= =?UTF-8?Q?Puro=20Dise=C3=B1o?=

(Gotta love shell escapes for that apostrophe.)

I’ll attach the patch, and I hope it’s of some use.
Subject: space-in-mime-encoded-header.patch
diff -Naurdp perl-5.12.2/cpan/Encode/lib/Encode/MIME/Header.pm perl-5.12.2-ืmime-header/cpan/Encode/lib/Encode/MIME/Header.pm
--- perl-5.12.2/cpan/Encode/lib/Encode/MIME/Header.pm 2010-09-05 17:14:32.000000000 +0200
+++ perl-5.12.2-ืmime-header/cpan/Encode/lib/Encode/MIME/Header.pm 2010-09-15 22:29:54.000000000 +0200
@@ -127,11 +127,12 @@ sub encode($$;$) {
             for my $word (@word) {
                 use bytes ();
                 if ( bytes::length($subline) + bytes::length($word) >
-                    $obj->{bpl} )
+                    $obj->{bpl} - 1 )
                 {
                     push @subline, $subline;
                     $subline = '';
                 }
+                $subline .= ' ' if ($subline =~ /\?=$/ and $word =~ /^=\?/);
                 $subline .= $word;
             }
             $subline and push @subline, $subline;
diff -Naurdp perl-5.12.2/cpan/Encode/t/mime-header.t perl-5.12.2-ืmime-header/cpan/Encode/t/mime-header.t
--- perl-5.12.2/cpan/Encode/t/mime-header.t 2010-09-05 17:14:32.000000000 +0200
+++ perl-5.12.2-ืmime-header/cpan/Encode/t/mime-header.t 2010-09-15 22:31:58.000000000 +0200
@@ -74,8 +74,8 @@ EOS
 my $bheader =<<'EOS';
 From:=?UTF-8?B?IOWwj+mjvCDlvL4g?=<dankogai@dan.co.jp>
-To: dankogai@dan.co.jp (=?UTF-8?B?5bCP6aO8?==Kogai,=?UTF-8?B?IOW8vg==?==Dan
- )
+To: dankogai@dan.co.jp (=?UTF-8?B?5bCP6aO8?==Kogai,=?UTF-8?B?IOW8vg==?==
+ Dan)
 Subject: =?UTF-8?B?IOa8ouWtl+OAgeOCq+OCv+OCq+ODiuOAgeOBsuOCieOBjOOBquOCkuWQq+OCgA==?=
  =?UTF-8?B?44CB6Z2e5bi444Gr6ZW344GE44K/44Kk44OI44Or6KGM44GM5LiA5L2T5YWo?=
@@ -123,6 +123,6 @@ is(Encode::encode('MIME-Q', "\x{fc}"), '
 my $rt42627 = Encode::decode_utf8("\x{c2}\x{a3}xxxxxxxxxxxxxxxxxxx0");
 is(Encode::encode('MIME-Q', $rt42627),
-    '=?UTF-8?Q?=C2=A3xxxxxxxxxxxxxxxxxxx?==?UTF-8?Q?0?=',
+    '=?UTF-8?Q?=C2=A3xxxxxxxxxxxxxxxxxxx?= =?UTF-8?Q?0?=',
     'MIME-Q encoding does not truncate trailing zeros');
 __END__;
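To make the effect of the first hunk easier to try in isolation, the patched loop can be sketched as a stand-alone function. The name join_words and the $bpl argument (standing in for $obj->{bpl}) are made up for this illustration, and the final push uses a length check instead of the original truth test; the rest of Encode::MIME::Header is not reproduced here.

```perl
use strict;
use warnings;

# Hypothetical stand-alone version of the patched loop.
# $bpl stands in for $obj->{bpl} (bytes per line).
sub join_words {
    my ( $bpl, @word ) = @_;
    my @subline;
    my $subline = '';
    for my $word (@word) {
        use bytes ();
        # One byte is held back for the space that may be inserted below.
        if ( bytes::length($subline) + bytes::length($word) > $bpl - 1 ) {
            push @subline, $subline;
            $subline = '';
        }
        # Two adjacent encoded-words must be separated by white space.
        $subline .= ' ' if $subline =~ /\?=$/ and $word =~ /^=\?/;
        $subline .= $word;
    }
    push @subline, $subline if length $subline;
    return @subline;
}

# Two short encoded-words that fit on one line end up separated by a space.
print "$_\n" for join_words( 76, '=?UTF-8?Q?a?=', '=?UTF-8?Q?b?=' );
```

When the second encoded-word does not fit, it starts a new subline and no space is added, since the line break already separates the two.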
Thank you. Your patch is applied in my repository.

Dan the Maintainer Thereof

On Wed Sep 15 17:57:12 2010, ralesk wrote:
> On Fri 2010. May 14 15:06:20, RJBS wrote:
> > my $json = q[{"subject":"Les Cr\u00e9ations de La Plata VOUS invite
> > \u00e0 l'exposition de
> > design Puro Dise\u00f1o"}];
>
> We’ve had a patch for this in UHU-Linux in our Perl package for a while,
> I think. I’ve just ported it to 5.12.2 and I hope it still does what it
> should. I took your string as an example and here’s the result:
>
> $ perl -MEncode -Mutf8 -e 'print encode("MIME-Q", "Les Cr\x{00e9}ations
> de La Plata VOUS invite \x{00e0} l'"'"'exposition de design Puro
> Dise\x{00f1}o"), "\n"'
> =?UTF-8?Q?Les=20Cr=C3=A9ations=20de=20La=20P?=
> =?UTF-8?Q?lata=20VOUS=20invite=20=C3=A0=20l?='
> =?UTF-8?Q?exposition=20de=20design=20?= =?UTF-8?Q?Puro=20Dise=C3=B1o?=
>
> (Gotta love shell escapes for that apostrophe.)
>
> I’ll attach the patch, and I hope it’s of some use.