
This queue is for tickets about the MIME-tools CPAN distribution.

Report information
The Basics
Id: 123341
Status: rejected
Priority: 0/
Queue: MIME-tools

People
Owner: dfs+pause [...] roaringpenguin.com
Requestors: tg [...] mirbsd.de
Cc: gregoa [...] cpan.org
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



CC: 879205 [...] bugs.debian.org, bug-MIME-tools [...] rt.cpan.org
Subject: Re: Bug#879205: MIME::Words::encode_mimewords: double-encodes (produces Mojibake), produces too long lines
Date: Fri, 20 Oct 2017 20:19:42 +0000 (UTC)
To: gregor herrmann <gregoa [...] debian.org>, Dianne Skoll <dfs [...] roaringpenguin.com>
From: Thorsten Glaser <tg [...] mirbsd.de>
gregor herrmann dixit:
Dianne Skoll dixit:
>Below is my test program:
[…]

Hi,

I believe your test program is not correct.

perl -MEncode -MMIME::Words -e 'print MIME::Words::encode_mimewords(Encode::encode("UTF-8", "Re: Bildungsurlaub für CCC-Fahrt? [THD#1424195]"), Charset => "UTF-8", Field => "Subject") . "\n";'

This will do it. Alternatively (semi-tested) with yours:

my $sample = "Re: Bildungsurlaub f\x{FC}r CCC-Fahrt? [THD#1424195]";

You were missing the “f” and “r” there. This is extremely sensitive to context.

This looks to me as if the old code (with the bug from Debian #879204) was called, *then* things are re-read and re-encoded.

bye,
//mirabilos
--
FWIW, I'm quite impressed with mksh interactively. I thought it was much *much* more bare bones. But it turns out it beats the living hell out of ksh93 in that respect. I'd even consider it for my daily use if I hadn't wasted half my life on my zsh setup. :-)
-- Frank Terbeck in #!/bin/mksh
Subject: Re: [rt.cpan.org #123341] Re: Bug#879205: MIME::Words::encode_mimewords: double-encodes (produces Mojibake), produces too long lines
Date: Fri, 20 Oct 2017 16:30:53 -0400
To: bug-MIME-tools [...] rt.cpan.org
From: Dianne Skoll <dfs [...] roaringpenguin.com>
On Fri, 20 Oct 2017 16:25:44 -0400 "Thorsten Glaser via RT" <bug-MIME-tools@rt.cpan.org> wrote:

> You were missing the “f” and “r” there. This is extremely sensitive
> to context.

I realized that after I sent the reply. However, even with:

my $sample = "Re: Bildungsurlaub f\x{FC}r CCC-Fahrt? [THD#1424195]";

The output is:

Out: Re: Bildungsurlaub =?UTF-8?Q?f=C3=BCr=20?=CCC-Fahrt? [THD#1424195]

which is correct.

> This looks to me as if the old code (with the bug from Debian
> #879204) was called, *then* things are re-read and re-encoded.

No, probably the original input was in UTF-8 already and was re-encoded by the call to Encode::encode. That's why I always run tests using \x{..} Unicode escapes rather than typing Unicode characters > \x{07f} directly into source code.

Regards,

Dianne.
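
The re-encoding described above can be seen at the byte level with the core Encode module alone. The following is only a minimal sketch, not code from this ticket: the helper name hex_bytes and the variable names are made up for illustration, and it assumes nothing beyond Encode::encode as shipped with Perl.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: render each byte of a string as two uppercase hex digits.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf('%02X', ord($_)) } split //, $bytes;
}

# One character, U+00FC, written as an escape so the encoding of the
# source file does not matter.
my $char = "\x{FC}";

# Encode the character string once: this is the intended use.
my $once = Encode::encode('UTF-8', $char);

# Feed the already-encoded bytes to encode() again: each byte is now
# treated as a character of its own and encoded a second time.
my $twice = Encode::encode('UTF-8', $once);

print "once:  ", hex_bytes($once),  "\n";   # C3 BC
print "twice: ", hex_bytes($twice), "\n";   # C3 83 C2 BC
```

The second byte sequence is the same C3 83 C2 BC that shows up as =C3=83=C2=BC in the Mojibake output quoted in this ticket.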
Subject: Re: [rt.cpan.org #123341] Re: Bug#879205: MIME::Words::encode_mimewords: double-encodes (produces Mojibake), produces too long lines
Date: Fri, 20 Oct 2017 16:37:35 -0400
To: bug-MIME-tools [...] rt.cpan.org, 879204 [...] bugs.debian.org
From: Dianne Skoll <dfs [...] roaringpenguin.com>
Hi,

This is not a bug in MIME::tools. The OP misunderstands how Perl works. He typed UTF-8 source code in and is double encoding it.

Here's a test program:

#===================================================================
use MIME::Words;
use Encode;

my $sample = "Re: Bildungsurlaub für CCC-Fahrt? [THD#1424195]";
my $utf8 = Encode::encode('UTF-8', $sample);
my $out = MIME::Words::encode_mimewords($utf8, Charset => 'UTF-8');
print "Out: $out\n";
#===================================================================

If I run:

perl test-utf8.pl

Output is:

Out: Re: Bildungsurlaub =?UTF-8?Q?f=C3=83=C2=BCr=20?=CCC-Fahrt? [THD#1424195]

But that's because the word "für" is *already* UTF-8. If I tell Perl to convert UTF-8 in the source code to native Perl Unicode, the result is very different:

perl -Mutf8 test-utf8.pl

Output is:

Out: Re: Bildungsurlaub =?UTF-8?Q?f=C3=BCr=20?=CCC-Fahrt? [THD#1424195]

The OP should read "perldoc utf8" and should also not use UTF-8 directly as Perl source code; use \x{FC} rather than ü, etc.

Regards,

Dianne.
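
The difference between the two runs comes down to whether the literal holds bytes or characters. As a rough sketch (not part of the test program above, with made-up variable names), the bytes a UTF-8 source file contributes without "use utf8" can be written out as escapes and decoded by hand with Encode::decode, which gives the same character string the pragma would produce for that literal:

```perl
use strict;
use warnings;
use Encode ();

# What the literal "für" holds when the source file is UTF-8 but
# "use utf8" is not in effect: four bytes, written here as escapes.
my $raw = "f\xC3\xBCr";

# Decoding those bytes yields the three-character string that the same
# literal would hold under "use utf8".
my $chars = Encode::decode('UTF-8', $raw);

printf "raw length:   %d\n", length $raw;     # 4
printf "chars length: %d\n", length $chars;   # 3

# Encoding the character string exactly once gives back the original bytes.
my $encoded = Encode::encode('UTF-8', $chars);
print "round trip ok\n" if $encoded eq $raw;
```

Calling Encode::encode on $raw instead of $chars is the double encoding shown in the first run.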
CC: 879205 [...] bugs.debian.org
Subject: Re: [rt.cpan.org #123341] Re: Bug#879205: MIME::Words::encode_mimewords: double-encodes (produces Mojibake), produces too long lines
Date: Fri, 20 Oct 2017 21:54:19 +0000 (UTC)
To: Dianne Skoll via RT <bug-MIME-tools [...] rt.cpan.org>
From: Thorsten Glaser <tg [...] mirbsd.de>
Dianne Skoll via RT dixit:

>No, probably the original input was in UTF-8 already and was re-encoded
>by the call to Encode::encode. That's why I always run tests using

Hm, probably. I’d say you just found a bug in OTRS then ;-)

(Just now it’s going to be another tricky thing to figure out where exactly and how to fix that. Might report this to the OTRS developers.)

One thing I don’t understand is how this was *not* double-encoded in the old version of MIME tools?

Thanks,
//mirabilos
--
18:47⎜<mirabilos:#!/bin/mksh> well channels… you see, I see everything in the same window anyway
18:48⎜<xpt:#!/bin/mksh> i know, you have some kind of telnet with automatic pong
18:48⎜<mirabilos:#!/bin/mksh> haha, yes :D
18:49⎜<mirabilos:#!/bin/mksh> though that's more tinyirc – sirc is more comfy
Subject: Re: [rt.cpan.org #123341] Re: Bug#879205: MIME::Words::encode_mimewords: double-encodes (produces Mojibake), produces too long lines
Date: Fri, 20 Oct 2017 19:18:15 -0400
To: bug-MIME-tools [...] rt.cpan.org
From: Dianne Skoll <dfs [...] roaringpenguin.com>
On Fri, 20 Oct 2017 18:01:45 -0400 "Thorsten Glaser via RT" <bug-MIME-tools@rt.cpan.org> wrote:

> One thing I don't understand is how this was *not* double-encoded
> in the old version of MIME tools?

I don't understand that either. Maybe it was also an older version of Perl? Perl's UTF-8 handling underwent extensive changes a few years ago.

Regards,

Dianne.
CC: 879205 [...] bugs.debian.org
Subject: Re: [rt.cpan.org #123341] Re: Bug#879205: MIME::Words::encode_mimewords: double-encodes (produces Mojibake), produces too long lines
Date: Fri, 20 Oct 2017 23:59:29 +0000 (UTC)
To: Dianne Skoll via RT <bug-MIME-tools [...] rt.cpan.org>
From: Thorsten Glaser <tg [...] mirbsd.de>
Dianne Skoll via RT dixit:

>I don't understand that either. Maybe it was also an older version of
>Perl? Perl's UTF-8 handling underwent extensive changes a few years ago.

Yes, that was on Debian wheezy. I reported this as bug in Debian against wheezy (which is still supported-ish) first, then as a separate bug against sid because I tried to see if it was still reproducible, and got a different result.

Let me dig out version numbers…

Original system:
otrs2 3.3.18-1~deb7u1
perl 5.14.2-21+deb7u5
libmime-tools-perl 5.503-1

New system:
perl 5.26.0-8
libmime-tools-perl 5.508-1

In addition to that, OTRS would have the original subject in a Perl string already, whereas I tried¹ to draft a testcase until I succeeded reproducing the original bug.

¹ tried, because I don’t really know Perl — I just can program

bye,
//mirabilos
--
(gnutls can also be used, but if you are compiling lynx for your own use, there is no reason to consider using that package)
-- Thomas E. Dickey on the Lynx mailing list, about OpenSSL
Not a bug in MIME::tools. Closing.