Bug #5462 for MIME-tools: MIME::Words::encode_mimewords strips spaces

Thu Feb 26 08:18:16 2004 jonas [...] paranormal.se - Ticket created

Subject: MIME::Words::encode_mimewords strips spaces

MIME::Words::encode_mimewords removes the space between words in cases where two words in a row are MIME-encoded.

I'm currently working around this problem by doing:

    sub encode_mimewords
    {
        use MIME::Words;
        my $string = MIME::Words::encode_mimewords($_[0]);
        $string =~ s/\?= =\?ISO-8859-1\?Q\?/?= =?ISO-8859-1?Q?_/g;
        return $string;
    }

Thu Aug 05 20:34:36 2004 Guest - Correspondence added

From: christian.jaeger-rtcpanorg [...] ethlife.ethz.ch

[JONAS - Thu Feb 26 08:18:16 2004]:

> MIME::Words::encode_mimewords removes the spaces between words in the
> cases there two words in a row is mime encoded.

..

> $string =~ s/\?= =\?ISO-8859-1\?Q\?/?= =?ISO-8859-1?Q?_/g;

Hm, I've also seen that two MIME-encoded words (two encoded tokens separated only by a space, as in ...?= =?...) are displayed without the space when read in Eudora. But then I noticed that SquirrelMail displays them with a space, so I concluded that it was a bug in Eudora. I haven't read the specs, so I can't decide what's correct; I just wanted to let you know about the differing client behaviour.

Cheers
Christian

Thu Aug 12 04:55:09 2004 Guest - Correspondence added

From: Olivier Salaun [...] cru.fr

Mozilla (1.6) interprets encoded subjects the same way. RFC 1522 seems to indicate that spaces between encoded-words are ignored (ftp://ftp.cru.fr/pub/reseau/RFCs/rfc1522.txt, chapter 5):

    ... an encoded-word that appears in a header field defined as **text** MUST be separated from any adjacent encoded-word or **text** by linear-white-space.

Therefore a string like "word1 word2" should be encoded as "=?charset?Q?encoded_word1_?= =?charset?Q?encoded_word2?=". Note that the '_' at the end of encoded_word1 is needed to preserve the white-space between the words.

[guest - Thu Aug 5 20:34:36 2004]:

> Hm, I've seen too that two mime encoded words (two encoded tokens, only
> separated by a space like in ...?= =?...) would be viewed without the
> space when read in Eudora. But then I've noticed that squirrelmail
> displays that with a space. So I then concluded that it was a bug in
> Eudora. I've not read the specs, so can't decide what's correct, but
> just wanted to let you know about the differing client behaviour.
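The decoding rule discussed above can be illustrated with a minimal sketch of a Q decoder. This is not MIME::Words code: `decode_q_header` is a hypothetical helper, it handles only the Q encoding, and it returns raw bytes without any charset conversion. It shows why a space between two encoded-words is lost and why a trailing `_` inside the first word preserves it.

```perl
use strict;
use warnings;

# Hypothetical helper: decode the Q encoded-words in a header value.
# Returns raw bytes; the charset label is ignored in this sketch.
sub decode_q_header {
    my ($header) = @_;
    my $ew = qr/=\?[^?\s]+\?[Qq]\?[^?\s]*\?=/;

    # Whitespace that separates two adjacent encoded-words is ignored.
    $header =~ s/($ew)\s+(?=$ew)/$1/g;

    # Replace every encoded-word by its decoded payload.
    $header =~ s{=\?[^?\s]+\?[Qq]\?([^?\s]*)\?=}{
        my $payload = $1;
        $payload =~ tr/_/ /;                               # "_" stands for a space
        $payload =~ s/=([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # "=XX" is one byte
        $payload;
    }ge;

    return $header;
}

for my $header (
    '=?ISO-8859-1?Q?h=E9?= =?ISO-8859-1?Q?h=E9?=',     # space is dropped
    '=?ISO-8859-1?Q?h=E9_?= =?ISO-8859-1?Q?h=E9?=',    # "_" keeps the space
) {
    # Show non-ASCII bytes as \xNN so the output does not depend on the terminal.
    (my $shown = decode_q_header($header)) =~ s/([^\x20-\x7E])/sprintf('\\x%02X', ord $1)/ge;
    print "$header\n  -> $shown\n";
}
```

The first header decodes to `h\xE9h\xE9`, the second to `h\xE9 h\xE9`. Whitespace between an encoded-word and ordinary text is left alone, which is why only runs of adjacent encoded-words are affected.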

Thu Oct 28 14:29:49 2004 Guest - Correspondence added

From: Alexey Mahotkin <alexm [...] hsys.msk.ru>

[JONAS - Thu Feb 26 08:18:16 2004]:

> MIME::Words::encode_mimewords removes the spaces between words in the
> cases there two words in a row is mime encoded.

Jonas, please try the attached patch. Thank you.

>
> I'm currently working around this problem by doing:
>
> sub encode_mimewords
> {
>     use MIME::Words;
>     my $string = MIME::Words::encode_mimewords($_[0]);
>     $string =~ s/\?= =\?ISO-8859-1\?Q\?/?= =?ISO-8859-1?Q?_/g;
>     return $string;
> }

? Makefile
? blib
? encode_mimewords.patch
? pm_to_blib
? testout
Index: ChangeLog
===================================================================
RCS file: /home/cvs/src/perl-MIME-tools/ChangeLog,v
retrieving revision 1.1.1.1
retrieving revision 1.2
diff -u -r1.1.1.1 -r1.2
--- ChangeLog 2004/10/28 17:12:16 1.1.1.1
+++ ChangeLog 2004/10/28 18:05:22 1.2
@@ -1,3 +1,7 @@
+2004-10-28 Alexey Mahotkin <alexm:eternal-eval.com>
+
+    * Made encode_mimewords fully compliant to RFC1522
+
 2004-10-27 David F. Skoll <dfs@roaringpenguin.com>

     * VERSION 5.415 RELEASED
Index: lib/MIME/Words.pm
===================================================================
RCS file: /home/cvs/src/perl-MIME-tools/lib/MIME/Words.pm,v
retrieving revision 1.1.1.1
retrieving revision 1.4
diff -u -r1.1.1.1 -r1.4
--- lib/MIME/Words.pm 2004/10/28 17:12:16 1.1.1.1
+++ lib/MIME/Words.pm 2004/10/28 18:05:22 1.4
@@ -267,7 +267,7 @@

 I<Function.>
 Given a RAW string, try to find and encode all "unsafe" sequences
-of characters:
+of characters, according to RFC1522:

     ### Encode a string with some unsafe "words":
     $encoded = encode_mimewords("Me and \xABFran\xE7ois\xBB");
@@ -292,13 +292,6 @@

 =back

-B<Warning:> this is a quick-and-dirty solution, intended for character
-sets which overlap ASCII. B<It does not comply with the RFC-1522
-rules regarding the use of encoded words in message headers>.
-You may want to roll your own variant,
-using C<encoded_mimeword()>, for your application.
-I<Thanks to Jan Kasprzak for reminding me about this problem.>
-
 =cut

 sub encode_mimewords {
@@ -306,17 +299,60 @@
     my $charset = $params{Charset} || 'ISO-8859-1';
     my $encoding = lc($params{Encoding} || 'q');

-    ### Encode any "words" with unsafe characters.
-    ### We limit such words to 18 characters, to guarantee that the
-    ### worst-case encoding give us no more than 54 + ~10 < 75 characters
-    my $word;
-    $rawstr =~ s{([a-zA-Z0-9\x7F-\xFF]{1,18})}{ ### get next "word"
-        $word = $1;
-        (($word !~ /[$NONPRINT]/o)
-         ? $word ### no unsafe chars
-         : encode_mimeword($word, $encoding, $charset)); ### has unsafe chars
-    }xeg;
-    $rawstr;
+    my $safe_chars = "-+*/=_!A-Za-z0-9";
+    my $re = "[$safe_chars]";
+    my $nre = "[^$safe_chars]";
+
+    my $result = "";
+    my $current = $rawstr;
+
+    while ($current ne "") {
+        if ($current =~ s/^(([$safe_chars]|\s)+)//) {
+            # safe chars (w/spaces) are handled as-is
+            $result .= $1;
+            next;
+        } elsif ($current =~ s/^(([^$safe_chars]|\s)+)//) {
+            # unsafe chars (w/spaces) are encoded
+            my $unsafe_chars = $1;
+          CHUNK75:
+            while ($unsafe_chars ne "") {
+
+                my $full_len = length($unsafe_chars);
+                my $len = 1;
+                my $prev_encoded = "";
+
+                while ($len <= $full_len) {
+                    # we try to encode next beginning of unsafe string
+                    my $possible = substr $unsafe_chars, 0, $len;
+                    my $encoded = encode_mimeword($possible, $encoding, $charset);
+
+                    if (length($encoded) < 75) {
+                        # if it could be encoded in specified maximum length, try
+                        # bigger beginning...
+                        $prev_encoded = $encoded;
+                    } else {
+                        #
+                        # ...otherwise, add encoded chunk which still fits, and
+                        # restart with rest of unsafe string
+                        $result .= $prev_encoded;
+                        $prev_encoded = "";
+                        substr $unsafe_chars, 0, $len - 1, "";
+                        next CHUNK75;
+                    }
+
+                    # if we have reached the end of the string, add final
+                    # encoded chunk
+                    if ($len == $full_len) {
+                        $result .= $encoded;
+                        last CHUNK75;
+                    }
+
+                    $len++;
+                }
+            }
+        }
+    }
+    return $result;
 }

 1;
@@ -331,10 +367,11 @@
 MIME::Base64 and MIME::QuotedPrint.


-=head1 AUTHOR
+=head1 AUTHORS

 Eryq (F<eryq@zeegee.com>), ZeeGee Software Inc (F<http://www.zeegee.com>).
 David F. Skoll (dfs@roaringpenguin.com) http://www.roaringpenguin.com
+Alexey Mahotkin (alexm:eternal-eval.com) http://eternal-eval.com/

 All rights reserved. This program is free software; you can redistribute
 it and/or modify it under the same terms as Perl itself.
Index: t/Words.t
===================================================================
RCS file: /home/cvs/src/perl-MIME-tools/t/Words.t,v
retrieving revision 1.1.1.1
retrieving revision 1.4
diff -u -r1.1.1.1 -r1.4
--- t/Words.t 2004/10/28 17:12:16 1.1.1.1
+++ t/Words.t 2004/10/28 18:08:52 1.4
@@ -4,7 +4,7 @@
 use ExtUtils::TBone;

 use MIME::QuotedPrint qw(decode_qp);
-use MIME::Words qw(decode_mimewords);
+use MIME::Words qw(decode_mimewords encode_mimewords);

 #------------------------------------------------------------
 # BEGIN
@@ -12,8 +12,22 @@

 # Create checker:
 my $T = typical ExtUtils::TBone;
-$T->begin(10);
+# we test each non-empty line in subjects.txt
+open WORDS, "<testin/subjects.txt" or die "open: $!";
+my $subjects_count = 0;
+while (my $line = <WORDS>) {
+    next if ($line =~ /^\s*$/);
+    $subjects_count++;
+}
+close WORDS;
+
+
+# for each line we do 4 tests:
+# whether each line correctly encodes/decodes, twice for each encoding
+# whether encoded chunks are smaller than 75 bytes
+$T->begin(10 + $subjects_count * 4);
+

 {
     local($/) = '';
     open WORDS, "<testin/words.txt" or die "open: $!";
@@ -36,6 +50,47 @@
     }
     close WORDS;
 }
+
+{
+    open WORDS, "<testin/subjects.txt" or die "open: $!";
+    while (my $line = <WORDS>) {
+        chomp $line;
+        next if ($line =~ /^\s*$/);
+
+        foreach my $encoding (qw(q b)) {
+            my $encoded = encode_mimewords($line,
+                                           Encoding => $encoding,
+                                          );
+            my $decoded = decode_mimewords($encoded,
+                                           Encoding => $encoding,
+                                          );
+            if ($line eq $decoded) {
+                # warn "ok: $line\nencoded: $encoded\ndecoded: $decoded\n";
+                $T->ok( 1 );
+            } else {
+
+                warn "in: $line\nencoded: $encoded\ndecoded: $decoded\n";
+
+                $T->ok( 0 );
+            }
+
+            my $failed_token = "";
+            while ($encoded =~ /(=\?[^\?]+\?[bq]\?[^\?]+\?=)/ig) {
+                if (length($1) > 75) {
+                    $failed_token = $1;
+                }
+            }
+            if ($failed_token ne "") {
+                warn "failed_token: '$failed_token'";
+                $T->ok(0);
+            } else {
+                $T->ok(1);
+            }
+        }
+    }
+
+    close WORDS;
+}

 # Done!
 $T->end;
Index: testin/subjects.txt
===================================================================
RCS file: subjects.txt
diff -N subjects.txt
--- /dev/null Wed May 6 00:32:27 1998
+++ /tmp/cvsfOgeLU Thu Oct 28 22:08:59 2004
@@ -0,0 +1,19 @@
+Á
+ÁÂ
+ÁÂ×
+ÁÂ×Ç
+ÁÂ×ÇÄ
+ÍÁÍÁ ÍÙÌÁ ÒÁÍÕ
+
+a
+ab
+abc
+abcd
+hello world
+
+hello ÒÕÓÓËÉÊ
+hello ÒÕÓÓËÉÊ hello
+
+ÒÕÓÓËÉÊ a ÒÕÓÓËÉÊ b ÒÕÓÓËÉÊ c ÒÕÓÓËÉÊ d ÒÕÓÓËÉÊ e ÒÕÓÓËÉÊ
+ËÁÖÄÙÊ ÏÈÏÔÎÉË ÖÅÌÁÅÔ ÚÎÁÔØ, ÇÄÅ ÓÉÄÉÔ ÆÁÚÁÎ ÓßÅÛØ ÅÝ£ ÜÔÉÈ ÍÑÇËÉÈ ÆÒÁÎÃÕÚÓËÉÊ ÂÕÌÏÞÅË, ÄÁ ×ÙÐÅÊ ÞÁÀ
+
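The chunking idea in the patch above (emit as many bytes as fit into one encoded-word, then start the next one) can be sketched in isolation. The block below is a simplified illustration, not the patch's code: `q_word` and `q_words` are hypothetical helpers, only the Q encoding is handled, every non-alphanumeric byte is escaped, and the input is assumed to be in a single-byte charset (splitting byte-wise is not safe for multi-byte charsets). It shrinks a prefix until it fits rather than growing it, which gives the same greedy result.

```perl
use strict;
use warnings;

my $MAX_WORD = 75;    # maximum length of one encoded-word

# Hypothetical helper: Q-encode one chunk of bytes as a single encoded-word.
# Spaces become "_"; every other non-alphanumeric byte becomes "=XX".
sub q_word {
    my ($bytes, $charset) = @_;
    (my $q = $bytes) =~ s/([^A-Za-z0-9])/$1 eq ' ' ? '_' : sprintf('=%02X', ord $1)/ge;
    return "=?$charset?Q?$q?=";
}

# Hypothetical helper: split $bytes greedily into encoded-words of at most
# $MAX_WORD characters each, joined by a single space.
sub q_words {
    my ($bytes, $charset) = @_;
    my @words;
    while (length $bytes) {
        my $len = length $bytes;
        # Shrink the prefix until its encoded form fits into one word.
        $len-- while $len > 1
            && length(q_word(substr($bytes, 0, $len), $charset)) > $MAX_WORD;
        # 4-argument substr removes the prefix from $bytes and returns it.
        push @words, q_word(substr($bytes, 0, $len, ''), $charset);
    }
    return join ' ', @words;
}

print q_words("h\xE9 h\xE9", 'ISO-8859-1'), "\n";
print q_words("\xE9" x 30,   'ISO-8859-1'), "\n";
```

Because the spaces travel inside the encoded-words as `_`, the single space that joins two words carries no information and a conforming decoder can drop it without changing the text.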

Wed Sep 27 04:18:53 2006 os [...] cru.fr - Correspondence added

From: os [...] cru.fr

We are using MIME-tools and MIME::Words in the Sympa software (http://www.sympa.org). This problem of stripped spaces was also reported to us, because Sympa needs to decode and then re-encode Subject header fields, and that round trip removes the spaces. Here is a short script that demonstrates the problem:

    #!/usr/bin/perl
    use MIME::Words qw(:all);

    $s1 = 'hé hé';
    $s2 = encode_mimewords($s1, ('Encoding' => 'Q', 'Charset' => 'iso-8859-1'));
    $s3 = decode_mimewords($s2);
    printf "S1: %s\nS2: %s\nS3: %s\n", $s1, $s2, $s3;

Here is the output:

    S1: hé hé
    S2: =?ISO-8859-1?Q?h=E9?= =?ISO-8859-1?Q?h=E9?=
    S3: héhé

We look forward to getting a patch for this problem.

Wed Sep 27 04:18:54 2006 The RT System itself - Status changed from 'new' to 'open'

Thu Nov 23 11:16:27 2006 octopus [...] nospam.verplant.org - Correspondence added

Hi,

we stumbled over this bug today, too. I've prepared a patch for this problem myself, because I was impulsive enough not to look into rt.cpan.org before solving it on my own. The attached patch also fixes the problem of quoted-printable-encoded strings still containing spaces, which is invalid according to RFC1521 and RFC2047.

Since this problem appears to be quite old and the fix is simple, I'd appreciate it if you could pick a solution and apply it. Thank you :)

Regards,
-octo

diff -pur a/lib/MIME/Words.pm b/lib/MIME/Words.pm
--- a/lib/MIME/Words.pm 2006-03-17 22:03:23.000000000 +0100
+++ b/lib/MIME/Words.pm 2006-11-23 16:28:21.000000000 +0100
@@ -117,7 +117,8 @@ sub _decode_Q {
 # almost, but not exactly, quoted-printable. :-P
 sub _encode_Q {
     my $str = shift;
-    $str =~ s{([_\?\=$NONPRINT])}{sprintf("=%02X", ord($1))}eog;
+    $str =~ s{([_\?\=\s$NONPRINT])}{sprintf("=%02X", ord($1))}eog;
+    $str =~ s/=20/_/g;
     $str;
 }

@@ -306,17 +307,41 @@ sub encode_mimewords {
     my $charset = $params{Charset} || 'ISO-8859-1';
     my $encoding = lc($params{Encoding} || 'q');

-    ### Encode any "words" with unsafe characters.
-    ### We limit such words to 18 characters, to guarantee that the
-    ### worst-case encoding give us no more than 54 + ~10 < 75 characters
-    my $word;
-    $rawstr =~ s{([a-zA-Z0-9\x7F-\xFF]{1,18})}{ ### get next "word"
-        $word = $1;
-        (($word !~ /[$NONPRINT]/o)
-         ? $word ### no unsafe chars
-         : encode_mimeword($word, $encoding, $charset)); ### has unsafe chars
-    }xeg;
-    $rawstr;
+    my $return = '';
+
+    while ($rawstr =~ m!
+        ^([^$NONPRINT]*\s+)?            # Words that don't need quoting
+        (
+            \S*[$NONPRINT]\S*           # Word that needs quoting
+            (?:\s+\S*[$NONPRINT]\S*)*   # More words that need quoting
+        )
+        (.*)$                           # Rest of the string, to get around using $'.
+        !x)                             # look, no /g modifier!
+    {
+        my $match = $2;
+
+        $return .= $1;
+        $rawstr = $3;
+
+        while ($match)
+        {
+            my $i = length ($match);
+            $i = 68 if ($i > 68);
+
+            # While there is no limit to the length of a multiple-line
+            # header field, each line of a header field that contains
+            # one or more 'encoded-word's is limited to 76 characters.
+            # -- RFC2047
+            while (length (encode_mimeword (substr ($match, 0, $i))) > 74)
+            {
+                $i--;
+            }
+            $return .= encode_mimeword (substr ($match, 0, $i));
+            $match = substr ($match, $i);
+            if ($match) { $return .= ' '; }
+        }
+    }
+    return ($return);
 }

 1;

Mon Dec 10 18:46:36 2007 me+bitcard [...] bogen.net - Correspondence added

From: martini [...] cpan.org

Hi,

I also got this problem! :-(((

Anyway, there is another two-line fix for that:

http://bugs.otrs.org/show_bug.cgi?id=1428#c4

Please fix it in future releases!!!

Thx,

-Martin

Mon Dec 10 19:04:35 2007 me+bitcard [...] bogen.net - Correspondence added

From: martini [...] cpan.org

Just the patch for the Words.pm:

-----------------------------------------------------
-----------------------------------------------------
--- Words.pm Fri Aug 10 12:29:35 2007
+++ Words.pm Fri Aug 10 13:45:57 2007
@@ -117,7 +117,7 @@
 # almost, but not exactly, quoted-printable. :-P
 sub _encode_Q {
     my $str = shift;
-    $str =~ s{([_\?\=$NONPRINT])}{sprintf("=%02X", ord($1))}eog;
+    $str =~ s{([ _\?\=$NONPRINT])}{sprintf("=%02X", ord($1))}eog;
     $str;
 }

@@ -310,7 +310,7 @@
     ### We limit such words to 18 characters, to guarantee that the
     ### worst-case encoding give us no more than 54 + ~10 < 75 characters
     my $word;
-    $rawstr =~ s{([a-zA-Z0-9\x7F-\xFF]{1,18})}{ ### get next "word"
+    $rawstr =~ s{([a-zA-Z0-9\x7F-\xFF]+\s*)}{ ### get next "word"
         $word = $1;
         (($word !~ /[$NONPRINT]/o)
          ? $word ### no unsafe chars

On Mon Dec 10 18:46:36 2007, martini2 wrote:

> Hi,
>
> I also got this problem! :-(((
>
> Anyway, there is a other 2 line fix for that:
>
> http://bugs.otrs.org/show_bug.cgi?id=1428#c4
>
>
> Please fix it in further releases!!!
>
> Thx,
>
> -Martin
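To see what the second hunk of this patch changes, the sketch below runs the old and the new "word" pattern over the Sympa example string and prints the tokens each one produces. The two patterns are copied from the patch; `old_tokens` and `new_tokens` are hypothetical names used only for this illustration, and nothing from MIME::Words is called.

```perl
use strict;
use warnings;

# Pattern before the patch: runs of at most 18 "word" bytes, never whitespace.
sub old_tokens { my ($s) = @_; return $s =~ /([a-zA-Z0-9\x7F-\xFF]{1,18})/g }

# Pattern after the patch: a whole word plus any whitespace that follows it.
sub new_tokens { my ($s) = @_; return $s =~ /([a-zA-Z0-9\x7F-\xFF]+\s*)/g }

my $input = "h\xE9 h\xE9";    # "hé hé" as ISO-8859-1 bytes

for my $case ([old => old_tokens($input)], [new => new_tokens($input)]) {
    my ($name, @tokens) = @$case;
    # Show non-ASCII bytes as \xNN so the output does not depend on the terminal.
    s/([^\x20-\x7E])/sprintf('\\x%02X', ord $1)/ge for @tokens;
    print "$name: ", join(', ', map { "[$_]" } @tokens), "\n";
}
```

With the old pattern the space falls between two tokens and stays outside the encoded-words, where decoders ignore it. With the new pattern the space is part of the first token, so it is encoded together with the word. Note that the new pattern no longer has the 18-character bound that the original comment relies on for the length limit.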

Tue Dec 11 10:17:02 2007 dmo [...] dmo.ca - Correspondence added

Subject:	Re: [rt.cpan.org #5462] MIME::Words::encode_mimewords strips spaces
Date:	Tue, 11 Dec 2007 10:16:21 -0500
To:	Martin Edenhofer via RT <bug-MIME-tools [...] rt.cpan.org>
From:	"Dave O'Neill" <dmo [...] dmo.ca>

On Mon, Dec 10, 2007 at 06:46:38PM -0500, Martin Edenhofer via RT wrote:

>
> Please fix it in further releases!!!
>

Thanks for the patch. It will appear in the next release.

Dave

Fri Mar 07 11:09:34 2008 dmo+pause [...] dmo.ca - Status changed from 'open' to 'stalled'

Tue Mar 18 16:45:15 2008 dmo+pause [...] dmo.ca - Correspondence added

Patch is in 5.426

Tue Mar 18 16:45:17 2008 The RT System itself - Status changed from 'stalled' to 'open'

Tue Mar 18 16:45:18 2008 dmo+pause [...] dmo.ca - Status changed from 'open' to 'resolved'

Sat Aug 02 09:00:10 2008 me+bitcard [...] bogen.net - Correspondence added

From: me+bitcard [...] bogen.net

On Tue Mar 18 16:45:15 2008, DONEILL wrote:

> Patch is in 5.426

JFI, the patch is not in v5.427. :( After applying this patch to v5.427 it works fine again.

Just create a UTF-8 mail using MIME::Tools with the subject "это специальныйсабжект для теста системы тикетов" => the generated subject is broken and not readable.

See also http://bugs.otrs.org/show_bug.cgi?id=3121 for more information.

Feel free to ask further questions.

-Martin

Sat Aug 02 09:00:15 2008 The RT System itself - Status changed from 'resolved' to 'open'

Mon Sep 15 13:39:43 2008 dmo+pause [...] dmo.ca - Correspondence added

Can you provide a short testcase that triggers the problem you're seeing?

Cheers,
Dave

Wed Apr 28 13:29:16 2010 dmo+pause [...] dmo.ca - Correspondence added

Appears to have been fixed long ago.

Wed Apr 28 13:29:19 2010 dmo+pause [...] dmo.ca - Status changed from 'open' to 'resolved'

Tue Jan 29 10:10:47 2013 mg.pub [...] gmx.net - Correspondence added

From: mg.pub [...] gmx.net

On Wed Apr 28 2010, 13:29:16, DONEILL wrote:

> Appears to have been fixed long ago.

This bug is still open. Here is a small snippet to reproduce the problem:

    use MIME::Words;
    use MIME::WordDecoder;
    use Encode;

    my $String = "Служба поддержки";
    my $Encoded = MIME::Words::encode_mimewords(Encode::encode('utf-8', $String,), Charset => 'utf-8');
    my $Decoded = MIME::WordDecoder::mime_to_perl_string($Encoded);
    print "$String, $Encoded, $Decoded, " . ($String eq $Decoded ? 'equal' : 'not equal') . "\n";

With the current version 5.503 this will print:

    Служба поддержки, =?UTF-8?Q?=D0=A1=D0=BB=D1=83=D0=B6=D0=B1=D0=B0=20=D0=BF=D0=BE=D0?= =?UTF-8?Q?=B4=D0=B4=D0=B5=D1=80=D0=B6=D0=BA=D0=B8?=, Служба по\xD0\xB4держки, not equal

We worked around this problem as follows:

    sub encode_mimewords {
        my ($rawstr, %params) = @_;
        my $charset = $params{Charset} || 'ISO-8859-1';
        my $encoding = lc($params{Encoding} || 'q');

        ### Encode any "words" with unsafe characters.
        ### We limit such words to 18 characters, to guarantee that the
        ### worst-case encoding give us no more than 54 + ~10 < 75 characters
        my $word;
        local $1;
    # ---
    # OTRS
    # ---
    # 2008-08-02 added patch/workaround for bug in MIME::Words (v5.428, maybe
    # also higher)
    # see also: http://rt.cpan.org/Public/Bug/Display.html?id=5462
    # http://bugs.otrs.org/show_bug.cgi?id=3121
    #    $rawstr =~ s{([a-zA-Z0-9\x7F-\xFF]+\s*)}{ ### get next "word"
    # ---
        $rawstr =~ s{([ a-zA-Z0-9\x7F-\xFF]{1,18})}{ ### get next "word"
            $word = $1;
            (($word !~ /(?:[$NONPRINT])|(?:^\s+$)/o)
             ? $word ### no unsafe chars
             : encode_mimeword($word, $encoding, $charset)); ### has unsafe chars
        }xeg;
        $rawstr =~ s/\?==\?/?= =?/g;
        $rawstr;
    }
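The failure in the output above can be checked mechanically: the 18-byte limit cuts the string in the middle of a two-byte UTF-8 character, so neither encoded-word holds valid UTF-8 on its own. The sketch below is an illustration only, using the core Encode module; `q_payload_bytes` and `is_complete_utf8` are hypothetical helpers, and the two encoded-words are taken from the reported output.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: return the raw bytes carried by one Q encoded-word.
sub q_payload_bytes {
    my ($word) = @_;
    my ($payload) = $word =~ /^=\?[^?]+\?[Qq]\?([^?]*)\?=$/
        or die "not a Q encoded-word: $word";
    $payload =~ tr/_/ /;                                # "_" stands for a space
    $payload =~ s/=([0-9A-Fa-f]{2})/chr(hex($1))/ge;    # "=XX" is one byte
    return $payload;
}

# Hypothetical helper: true if the bytes are valid, complete UTF-8 by themselves.
sub is_complete_utf8 {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() may modify its argument when CHECK is set
    return eval { Encode::decode('UTF-8', $copy, Encode::FB_CROAK); 1 } ? 1 : 0;
}

my @words = (
    '=?UTF-8?Q?=D0=A1=D0=BB=D1=83=D0=B6=D0=B1=D0=B0=20=D0=BF=D0=BE=D0?=',
    '=?UTF-8?Q?=B4=D0=B4=D0=B5=D1=80=D0=B6=D0=BA=D0=B8?=',
);

for my $word (@words) {
    my $status = is_complete_utf8(q_payload_bytes($word)) ? 'ok ' : 'BAD';
    print "$status $word\n";
}
```

The first word ends with a lone lead byte (`=D0`) and the second starts with a lone continuation byte (`=B4`), so both are flagged. A decoder that converts each encoded-word separately cannot reassemble that character, which matches the `\xD0\xB4` left in the decoded string.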

Tue Jan 29 10:10:49 2013 The RT System itself - Status changed from 'resolved' to 'open'

Tue Jan 29 10:12:44 2013 mg.pub [...] gmx.net - Correspondence added

From:

mg.pub [...] gmx.net

(Activate the line in the commented part instead of the original line, and it works fine.)

Wed Jan 30 15:23:48 2013 dfs [...] roaringpenguin.com - Correspondence added

Subject:	Re: [rt.cpan.org #5462] MIME::Words::encode_mimewords strips spaces
Date:	Wed, 30 Jan 2013 15:23:36 -0500
To:	bug-MIME-tools [...] rt.cpan.org
From:	"David F. Skoll" <dfs [...] roaringpenguin.com>

On Tue, 29 Jan 2013 10:12:44 -0500 " via RT" <bug-MIME-tools@rt.cpan.org> wrote:

> (Activate the line in the commented part instead of the original
> line, and it works fine.)

Thanks. I have applied your patch and it will be in the next release of MIME::tools.

Regards,

David.

Wed Jan 30 16:07:09 2013 dfs+pause [...] roaringpenguin.com - Taken

Wed Jan 30 16:07:26 2013 dfs+pause [...] roaringpenguin.com - Correspondence added

Hi,

I've uploaded MIME-tools 5.504 to CPAN. It should appear soon in the module index, and it fixes this bug.

Regards,

David.

Wed Jan 30 16:07:28 2013 dfs+pause [...] roaringpenguin.com - Status changed from 'open' to 'resolved'
