Bug #111419 for Encode: incorrect unfolding and other decoding bugs

Mon Jan 25 11:40:29 2016 florz [...] florz.de - Ticket created

Subject:	incorrect unfolding and other decoding bugs
Date:	Mon, 25 Jan 2016 17:40:08 +0100
To:	bug-Encode [...] rt.cpan.org
From:	Florian Zumbiehl <florz [...] florz.de>

Bug #67569 has been marked as resolved even though the bug obviously hasn't been fixed. For explanation and patch, see there: https://rt.cpan.org/Public/Bug/Display.html?id=67569

Mon Jan 25 12:31:03 2016 DANKOGAI [...] cpan.org - Correspondence added

On Mon Jan 25 11:40:29 2016, florz@florz.de wrote:

> Bug #67569 has been marked as resolved even though the bug obviously hasn't
> been fixed. For explanation and patch, see there:
>
> https://rt.cpan.org/Public/Bug/Display.html?id=67569

I've added all test vectors in RFC2047, pp.11-12

https://github.com/dankogai/p5-encode/commit/b0d30fbf504b69caad5cc43b5c25c9a3b0166fa1

And Encode 2.80 passes all that. Plus so far as I see some of your test vectors are wrong. For instance

"=?us-ascii?q?foo?=\r\n bar" => "foo bar"

Should be "foobar", not "foo bar". That is a single line when decoded and in such cases the leading white spaces immediately after CRLF is ignored.

Dan

Mon Jan 25 12:31:03 2016 The RT System itself - Status changed from 'new' to 'open'

Mon Jan 25 12:57:28 2016 florz [...] florz.de - Correspondence added

Subject:	Re: [rt.cpan.org #111419] incorrect unfolding and other decoding bugs
Date:	Mon, 25 Jan 2016 18:57:14 +0100
To:	Dan Kogai via RT <bug-Encode [...] rt.cpan.org>
From:	Florian Zumbiehl <florz [...] florz.de>

Hi,

> Plus so far as I see some of your test vectors are wrong. For instance

Yes, I think some are, and as I wrote a long time ago, I'd be happy to fix those if there is any chance that the fix will actually get merged.

> "=?us-ascii?q?foo?=\r\n bar" => "foo bar"
>
> Should be "foobar", not "foo bar". That is a single line when decoded and in
> such cases the leading white spaces immediately after CRLF is ignored.

That one is actually correct. Could you point to where in the relevant RFCs you think the behavior you suggest is specified? You might want to start reading at sections 2.2.3 and 3.2.2 of RFC 5322, the former of which this quote is from:

| The process of moving from this folded multiple-line representation
| of a header field to its single line representation is called
| "unfolding". Unfolding is accomplished by simply removing any CRLF
| that is immediately followed by WSP. Each header field should be
| treated in its unfolded form for further syntactic and semantic
| evaluation. An unfolded header field has no length restriction and
| therefore may be indeterminately long.

Regards, Florian
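For illustration, the unfolding rule quoted above can be sketched in a few lines of Python. This is only a minimal sketch, not Encode's implementation: the names `unfold`, `decode_q` and `decode_header` are hypothetical, only the Q encoding is handled, and the handling of whitespace between adjacent encoded-words follows the examples in RFC 2047 in a deliberately crude way.

```python
import re

# Hypothetical, simplified pattern for a Q-encoded word: =?charset?q?payload?=
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?[qQ]\?([^?\s]*)\?=")


def unfold(header: str) -> str:
    """RFC 5322, 2.2.3: remove any CRLF that is immediately followed by WSP.

    Only the CRLF is removed; the whitespace that follows it is kept.
    """
    return re.sub(r"\r\n(?=[ \t])", "", header)


def decode_q(payload: str, charset: str) -> str:
    """Decode a Q-encoded payload: "_" is a space, "=XX" is a hex byte."""
    raw = payload.replace("_", " ").encode("ascii")
    raw = re.sub(rb"=([0-9A-Fa-f]{2})",
                 lambda m: bytes([int(m.group(1), 16)]), raw)
    return raw.decode(charset)


def decode_header(header: str) -> str:
    """Unfold first, then decode encoded-words (sketch, Q encoding only)."""
    text = unfold(header)
    # Whitespace between two adjacent encoded-words is dropped,
    # but whitespace between an encoded-word and plain text is kept.
    text = re.sub(r"(\?=)[ \t]+(?==\?)", r"\1", text)
    return ENCODED_WORD.sub(lambda m: decode_q(m.group(2), m.group(1)), text)


print(repr(unfold("=?us-ascii?q?foo?=\r\n bar")))
print(repr(decode_header("=?us-ascii?q?foo?=\r\n bar")))
print(repr(decode_header("=?ISO-8859-1?Q?a?=\r\n =?ISO-8859-1?Q?b?=")))
```

Under these assumptions, the space after the CRLF survives unfolding, so the disputed test vector decodes to "foo bar", while two encoded-words separated only by folding whitespace are joined.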

Thu Jan 28 20:54:51 2016 florz [...] florz.de - Correspondence added

Subject:	Re: [rt.cpan.org #111419] incorrect unfolding and other decoding bugs
Date:	Fri, 29 Jan 2016 02:54:30 +0100
To:	Dan Kogai via RT <bug-Encode [...] rt.cpan.org>
From:	Florian Zumbiehl <florz [...] florz.de>

Hi,

BTW, I just checked, these are the two that arguably are wrong:

"=?us-ascii?q?foo?==?us-ascii?q?bar?=" => "foo=?us-ascii?q?bar?="
"foo =?us-ascii?q?=20?==?us-ascii?q?bar?=" => "foo =?us-ascii?q?bar?="

Though it's not really clear whether they are indeed wrong, as RFC2047 is somewhat ambiguous there. All the other examples are definitely correct.

So, is there any chance you will finally merge the fix?

Regards, Florian

Tue Mar 29 14:35:02 2016 pali [...] cpan.org - Cc PALI added

Tue Mar 29 14:36:15 2016 pali [...] cpan.org - Correspondence added

I suppose that this bug is finally fixed in version 2.83

Thu Apr 14 08:37:30 2016 DANKOGAI [...] cpan.org - Correspondence added

On Tue Mar 29 14:36:15 2016, PALI wrote:

> I suppose that this bug is finally fixed in version 2.83

Thu Apr 14 08:37:33 2016 DANKOGAI [...] cpan.org - Status changed from 'open' to 'resolved'