Bug #24836 for Encode: MIME-Q splits multibyte chars across encoded words

Wed Feb 07 15:36:37 2007 nick [...] aevum.de - Ticket created

The following code

    use Encode;
    print encode('MIME-Q', encode_utf8('12345678901234567890ä')), "\n";

produces

    =?UTF-8?Q?12345678901234567890=C3?==?UTF-8?Q?=A4?=

Note that the multibyte character 'ä' is split across the two encoded words.

Fri Apr 06 07:19:30 2007 DANKOGAI [...] cpan.org - Correspondence added

This is not a bug. You are merely doubly encoding.

Dan the Encode Maintainer

Fri Apr 06 07:19:57 2007 The RT System itself - Status changed from 'new' to 'open'

Fri Apr 06 07:20:14 2007 DANKOGAI [...] cpan.org - Status changed from 'open' to 'resolved'

Sun Apr 08 11:56:16 2007 nick [...] aevum.de - Correspondence added

I'm not doubly encoding. The encode_utf8 is only because of bug #24418. And this has nothing to do with the fact that multibyte characters are split across encoded words.

To quote RFC 2047:

"Each 'encoded-word' MUST represent an integral number of characters. A multi-octet character may not be split across adjacent 'encoded-word's."

Nick
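The RFC 2047 rule quoted above can be checked mechanically: each encoded word, decoded on its own, has to yield complete characters. The following is a minimal sketch of such a check, not part of Encode. The helper name words_hold_whole_chars is hypothetical, and the sketch assumes UTF-8 encoded words using the Q encoding only.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not provided by Encode): returns 1 if every
# UTF-8 Q-encoded word in $header decodes to whole characters.
sub words_hold_whole_chars {
    my ($header) = @_;
    # Visit each encoded word of the form =?UTF-8?Q?payload?=
    while ($header =~ /=\?UTF-8\?Q\?([^?]*)\?=/gi) {
        my $bytes = $1;
        $bytes =~ tr/_/ /;                              # "_" stands for a space
        $bytes =~ s/=([0-9A-Fa-f]{2})/chr(hex($1))/ge;  # "=XX" is one raw octet
        # FB_CROAK makes decode() die on malformed or truncated UTF-8.
        my $ok = eval { decode('UTF-8', $bytes, Encode::FB_CROAK()); 1 };
        return 0 unless $ok;
    }
    return 1;
}

# The two outputs quoted in this ticket.
my $split = '=?UTF-8?Q?12345678901234567890=C3?==?UTF-8?Q?=A4?=';
my $whole = '=?UTF-8?Q?12345678901234567890?==?UTF-8?Q?=C3=A4?=';

print words_hold_whole_chars($split) ? "ok\n" : "split character\n";
print words_hold_whole_chars($whole) ? "ok\n" : "split character\n";
```

Run against the two strings from this ticket, the first is reported as containing a split character and the second passes.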

Sun Apr 08 11:56:18 2007 The RT System itself - Status changed from 'resolved' to 'open'

Sun Apr 08 14:40:18 2007 DANKOGAI [...] cpan.org - Correspondence added

On Wed Feb 07 15:36:37 2007, nick.aevum.de wrote:

> The following code
>
>     use Encode;
>     print encode('MIME-Q', encode_utf8('12345678901234567890ä')), "\n";
                        WRONG^^^^^^^^^^^

The right way to do it is:

    # source is in latin1
    use Encode;
    print encode('MIME-Q', decode_utf8('12345678901234567890ä')), "\n";

or

    # source is in utf-8
    use Encode;
    use utf8; # this makes sure literals are treated as UTF-8 strings
    print encode('MIME-Q', '12345678901234567890ä'), "\n";

And you will get

    =?UTF-8?Q?12345678901234567890?==?UTF-8?Q?=C3=A4?=

Remember, encode() takes a UTF-8 string as its source string.

Dan the Encode Maintainer
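The difference between a character string and its UTF-8 octets, which the advice above turns on, can be seen by comparing lengths. This is a small sketch under the assumption that a current Encode is installed. The variable names are illustrative, and the character is written as an escape so the result does not depend on how the source file is saved.

```perl
use strict;
use warnings;
use Encode qw(encode encode_utf8 decode_utf8);

# Twenty digits plus one character, U+00E4 (ä), written as an escape.
my $chars = "12345678901234567890\x{e4}";

# encode_utf8() turns characters into octets: ä becomes 0xC3 0xA4.
my $bytes = encode_utf8($chars);

# Encoding the octets again treats 0xC3 and 0xA4 as two separate
# characters, which is what double encoding means.
my $twice = encode_utf8($bytes);

printf "characters: %d, octets: %d, double-encoded octets: %d\n",
    length($chars), length($bytes), length($twice);

# decode_utf8() goes the other way, from octets back to characters.
my $roundtrip = decode_utf8($bytes);
print "round trip ", ($roundtrip eq $chars ? "matches" : "differs"), "\n";

# Pass the character string, not the octets, to the MIME-Q encoder.
# How the output is folded depends on the installed Encode version.
print encode('MIME-Q', $chars), "\n";
```

The first line printed reports 21 characters, 22 octets, and 24 octets after encoding twice, and the round trip matches.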

Sun Apr 08 14:41:03 2007 DANKOGAI [...] cpan.org - Status changed from 'open' to 'resolved'

Mon Apr 09 08:41:13 2007 nick [...] aevum.de - Correspondence added

I just upgraded to Encode 2.19 and everything works as expected.

Thanks,
Nick

Mon Apr 09 08:41:54 2007 The RT System itself - Status changed from 'resolved' to 'open'

Sat Apr 21 16:56:16 2007 DANKOGAI [...] cpan.org - Correspondence added

On Mon Apr 09 08:41:13 2007, nick.aevum.de wrote:

> I just upgraded to Encode 2.19 and everything works as expected.
>
> Thanks,
> Nick

Closing RT.

Dan the Encode Maintainer

Sat Apr 21 16:56:18 2007 DANKOGAI [...] cpan.org - Status changed from 'open' to 'resolved'