
This queue is for tickets about the Encode CPAN distribution.

Report information
The Basics
Id: 66713
Status: resolved
Priority: 0/
Queue: Encode

People
Owner: Nobody in particular
Requestors: trs [...] bestpractical.com
Cc: pali [...] cpan.org
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



CC: bug-Encode [...] rt.cpan.org
Subject: Re: [rt-users] Email Subject Header creating fragmented strings when decoded
Date: Fri, 18 Mar 2011 10:38:56 -0400
To: rt-users [...] lists.bestpractical.com
From: Thomas Sibley <trs [...] bestpractical.com>
On 18 Mar 2011 10:14, Lars Reimann wrote:
> Hi all,
>
> the following problem is very annoying:
>
> RT encodes Subject lines using the following concept:
>
> Original example header
>
> Subject:
>  =?UTF-8?B?W3NlcnZpY2UubWV0YXdheXMubmV0ICM2NzAyOF0gU3BlaWNoZXJwbGF0eiBF?=
>  =?UTF-8?B?cmjDtmh1bmcgd2FzbWFpbjogNTAwIEdC?=
>
> The header is split into 2 parts:
>
> 1st part decoded: "[Queue Name #Ticket number] First part of subject line"
> 2nd part decoded: "Second part of subject line"
>
> Completely decoded string: "[Queue Name #Ticket number] First part of
> subject line"_"Second part of subject line"
>
> The underscore (_) marks an additional space character which is
> introduced into ALL emails on decoding the two UTF parts.
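To make the report concrete, here is a minimal Python sketch that decodes the two encoded-words from the quoted header by hand. It is not RT's or Encode's code, and the helper name `decode_b_word` is made up for illustration. RFC 2047 says that whitespace separating two adjacent encoded-words is ignored on display, so the parts should be joined with nothing in between; joining them with a space reproduces the symptom described above.

```python
import base64
import re

# The two encoded-words quoted in the report.
PARTS = [
    "=?UTF-8?B?W3NlcnZpY2UubWV0YXdheXMubmV0ICM2NzAyOF0gU3BlaWNoZXJwbGF0eiBF?=",
    "=?UTF-8?B?cmjDtmh1bmcgd2FzbWFpbjogNTAwIEdC?=",
]

# =?charset?encoding?payload?=
ENCODED_WORD = re.compile(r"=\?([^?]+)\?([BbQq])\?([^?]*)\?=")


def decode_b_word(word):
    """Hypothetical helper: return (charset, raw bytes) for one "B" encoded-word."""
    match = ENCODED_WORD.fullmatch(word)
    if match is None or match.group(2).upper() != "B":
        raise ValueError(f"not a base64 encoded-word: {word!r}")
    charset, payload = match.group(1), match.group(3)
    return charset, base64.b64decode(payload)


decoded = [decode_b_word(part) for part in PARTS]

# RFC 2047 behaviour: adjacent encoded-words are concatenated directly.
# Joining the raw bytes first assumes both parts share a charset (they do here).
correct = b"".join(raw for _, raw in decoded).decode("utf-8")

# The behaviour described in the report: a space sneaks in between the parts.
with_space = " ".join(raw.decode(charset) for charset, raw in decoded)

print(correct)
print(with_space)
```

In this header the split falls in the middle of the word "Erhöhung", which is why the stray space is so visible.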
I think this is actually a bug in Encode::MIME::Header's parsing/generation of the encoded header lines. I tracked it down when it broke a test in other code. I believe it was introduced with the fix for https://rt.cpan.org/Public/Bug/Display.html?id=40027.

I've copied this mail to the bug tracker for Encode.
> I double checked with decoding UTF in python. Results: When using 2 UTF
> parts, a decode introduces an additional space. When using only ONE
> UTF-string (the above subject w/o padding and UTF header) the decode is
> done correctly!
>
> I would be very glad to resolve this problem. If RT could use only one
> UTF string, the problem would go away.
> How can we do that?
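For anyone who wants to repeat a Python cross-check, a sketch using the standard library's `email.header` module follows. This is an assumption about how such a check could be done, not the script the reporter ran. In current Python 3 releases `decode_header` drops whitespace between adjacent encoded-words, so the result may differ from what a 2011-era interpreter produced.

```python
from email.header import decode_header, make_header

# The folded Subject value from the report: two encoded-words separated
# by a line break plus a leading space (header folding).
raw_subject = (
    "=?UTF-8?B?W3NlcnZpY2UubWV0YXdheXMubmV0ICM2NzAyOF0gU3BlaWNoZXJwbGF0eiBF?=\r\n"
    " =?UTF-8?B?cmjDtmh1bmcgd2FzbWFpbjogNTAwIEdC?="
)

# decode_header returns (bytes, charset) pairs; make_header turns them
# back into a Header object whose str() is the decoded Unicode text.
pairs = decode_header(raw_subject)
subject = str(make_header(pairs))

print(pairs)
print(subject)
```

If the decoded subject contains "E rhöhung" instead of "Erhöhung", the decoder in use is inserting the extra space.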
If you're really, really annoyed by it, I believe you can downgrade to an older Encode. But you'll regain other bugs that have been fixed as well, and I can't suggest it.
> And: does anyone have the same problem with email clients (we use
> evolution and thunderbird, but most likely other clients are also
> affected).
>
> p.s. It's unclear to me when UTF encoding is used. Sometimes the Subject
> line is not UTF encoded and uses ASCII. Perhaps it depends on non-ASCII
> characters within the subject.
It's used when there are characters other than ASCII in a mail header.

Thomas
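The rule is easy to see with Python's `email.header.Header`, used here only as a stand-in to illustrate the principle; RT itself generates headers through Perl's Encode, and the sample subjects below are made up. With the default charset, a pure-ASCII value passes through unchanged, while a value with non-ASCII characters comes out as an RFC 2047 encoded-word.

```python
from email.header import Header

# Hypothetical sample subjects, one pure ASCII and one with an umlaut.
ascii_subject = "Disk quota increase"
non_ascii_subject = "Speicherplatz Erhöhung"

# With no explicit charset, Header keeps ASCII text as-is and falls back
# to UTF-8 encoded-words when the text cannot be represented in ASCII.
encoded_ascii = Header(ascii_subject).encode()
encoded_non_ascii = Header(non_ascii_subject).encode()

print(encoded_ascii)
print(encoded_non_ascii)
```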
Hi! This problem should be fixed in Encode 2.83.
On Fri Apr 01 15:03:12 2016, PALI wrote:
> Hi! This problem should be fixed in Encode 2.83.
So, please close this bug.