Skip Menu |

This queue is for tickets about the MIME-tools CPAN distribution.

Report information
The Basics
Id: 41632
Status: resolved
Priority: 0/
Queue: MIME-tools

People
Owner: Nobody in particular
Requestors: bitcard [...] t.themuffin.net
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 5.412
Fixed in: (no value)



Subject: MIME::Field::ParamVal mishandles mixed RFC2231 and non-RFC2231 parameters
When long parameter values are split per RFC2231 and a non-RFC2231 version is also present, MIME::Field::ParamVal::parse_params will concatenate both versions together. Apple Mail, at least, uses both parameter styles; for example: Content-Disposition: inline; filename="Once upon a time, there was a really long filename who lived in the enchanted Internet"; filename*0="Once upon a time, there was a really long filename who l"; filename*1="ived in the enchanted Internet" parse_params implicitly assigns the non-RFC2231 version to part 0, which is fine, except that any parts with the same sequence number are concatenated together. The resulting parsed filename is "Once upon a time, there was a really long filename who lived in the enchanted InternetOnce upon a time, there was a really long filename who lived in the enchanted Internet" The easiest fix is to have parts with the same sequence number replace each other. RFC2231 doesn't say what should be done when the same number is seen again, but there's no reason anyone reading the RFC should want to use the same sequence number for more than one part either. The attached patch (against 5.427) does exactly this, which fixes the reported bug by preferring the RFC2231-style header when it's present. Test case: use MIME::Field::ParamVal; my ($res, $headerval); $headerval = q{inline; filename="Once upon a time, there was a really long filename who lived in the enchanted Internet"; filename*0="Once upon a time, there was a really long filename who l"; filename*1="ived in the enchanted Internet" }; $res = MIME::Field::ParamVal::parse_params("", $headerval); printf("Filename: %s\n", $res->{filename});
Subject: ParamVal.pm.diff
--- MIME/Field/ParamVal.pm.orig 2008-12-10 16:35:51.488762758 -0500 +++ MIME/Field/ParamVal.pm 2008-12-10 16:35:56.965546083 -0500 @@ -266,7 +266,7 @@ if (!defined($qstr)) { $val = rfc2231decode($val); } - $rfc2231params{$name}{$num} .= $val; + $rfc2231params{$name}{$num} = $val; } else { # Make a fake "part zero" for non-RFC2231 params $rfc2231params{$param}{"0"} = $val;
Ben Winslow via RT wrote: Show quoted text
> When long parameter values are split per RFC2231 and a non-RFC2231 > version is also present, MIME::Field::ParamVal::parse_params will > concatenate both versions together. Apple Mail, at least, uses both > parameter styles; for example: > Content-Disposition: inline; > filename="Once upon a time, there was a really long filename who lived > in the enchanted Internet"; > filename*0="Once upon a time, there was a really long filename who l"; > filename*1="ived in the enchanted Internet"
I would call that a bug in Apple Mail. The (incredibly awful and badly-written RFC 2231) is silent about what to do when you have mixed-style parameters like that. Apple is therefore relying on unspecified behaviour. Show quoted text
> The easiest fix is to have parts with the same sequence number replace > each other. RFC2231 doesn't say what should be done when the same > number is seen again, but there's no reason anyone reading the RFC > should want to use the same sequence number for more than one part > either.
The RFC does explicitly say: "...the mechanism MUST NOT depend on parameter ordering since MIME states that parameters are not order sensitive." So how would we handle this: Content-Disposition: inline; filename*1=" blech"; filename="foo bar"; filename*0="foo"; It so happens that with Apple Mail, filename matches the concatenation of filename*n. But I don't think MIME::Tools should incorporate code to work around broken MIME generators. Regards, David.
From: bitcard [...] t.themuffin.net
On Fri Dec 12 10:36:36 2008, DSKOLL wrote: Show quoted text
> Ben Winslow via RT wrote:
> > When long parameter values are split per RFC2231 and a non-RFC2231 > > version is also present, MIME::Field::ParamVal::parse_params will > > concatenate both versions together. Apple Mail, at least, uses both > > parameter styles; for example:
Show quoted text
> I would call that a bug in Apple Mail. The (incredibly awful and > badly-written RFC 2231) is silent about what to do when you have > mixed-style parameters like that. Apple is therefore relying on
unspecified Show quoted text
> behaviour.
Well, the RFC explicitly states that parameter value continuations are meant to prevent damage to the value by header wrapping. Leading whitespace is fine when wrapping headers, but it might not be fine when gluing together a parameter -- one of the examples in the RFC is an URI (for the rarely used message/external-body type), where whitespace in the URI is obviously wrong. It seems logical to me, then, that the RFC2231-style values should always be preferred when present -- the non-RFC2231 value is only useful for reverse compatibility. While the RFC should definitely have stated this explicitly, I think it's reasonable to treat the non-RFC2231 value this way (and I'm sure this probably seemed obvious when RFC2231 was written and nothing supported it yet.) Show quoted text
> > The easiest fix is to have parts with the same sequence number replace > > each other. RFC2231 doesn't say what should be done when the same > > number is seen again, but there's no reason anyone reading the RFC > > should want to use the same sequence number for more than one part > > either.
> > The RFC does explicitly say: > > "...the mechanism MUST NOT depend on parameter ordering > since MIME states that parameters are not order > sensitive."
Good point. Show quoted text
> So how would we handle this: > > Content-Disposition: inline; > filename*1=" blech"; > filename="foo bar"; > filename*0="foo";
Okay, perhaps simply assigning the non-RFC2231 value to part 0 is naïve -- that's easy enough to fix, and I'll post an updated patch later today. I still think the RFC2231 value should be used ("foo bleh"), as the non-RFC2231 filename may simply be a reduced version of the RFC2231 filename; for example, a file named "genetisch geänderte Mörderkaninchen.jpg" could be sent as: Content-Disposition: inline; filename*0*=iso-8859-1'de'genetisch%20ge%E4nderte%20M%F6rderkan; filename*1="inchen.jpg"; filename="genetisch geaenderte Moerderkaninchen.jpg" In this case, I'd want to use the RFC2231 version if I knew how to parse it. Show quoted text
> It so happens that with Apple Mail, filename matches the concatenation > of filename*n. But I don't think MIME::Tools should incorporate code > to work around broken MIME generators.
We could argue all day about whether Apple Mail's behavior is correct or not, but Apple Mail does behave this way right now, and real people are sending real messages with it. Failure to cope with these messages (especially in a gray area between standards) isn't very robust, and parsing the message differently than MUAs are parsing it doesn't benefit anyone. Show quoted text
> Regards, > > David.
Just for reference, my (very quick) testing with RFC2231-style attachment filenames before, after, and surrounding the non-RFC2231 filename turned up the following: Thunderbird - Always uses the non-RFC2231 header when both are present Claws-mail - Always uses the RFC2231 header when both are present Opera Mail - Uses whichever version appears first (even when mixed) Squirrelmail - No support for RFC2231; no support for filename in Content-Disposition; attachment is untitled unless non-RFC2231 version is present Outlook Express 6.00.2800.1933 - No support for RFC2231; no support for filename in Content-Disposition; attachment is untitled unless non-RFC2231 version is present Outlook 2000 - No support for RFC2231; no support for filename in Content-Disposition; attachment is untitled unless non-RFC2231 version is present It's clear that RFC2231's ambiguity has lead to inconsistent support, but MIME::Field::ParamVal's current behavior isn't in line with the way anyone else seems to be supporting it.
On Fri Dec 12 14:29:05 2008, raineth wrote: Show quoted text
> It seems logical to me, then, that the > RFC2231-style values should always be preferred when present -- the > non-RFC2231 value is only useful for reverse compatibility.
OK. That does seem reasonable (though it's not what your patch does.) We'll look at implementing this behaviour. Regards, David.
Hi, Please try this patch... it ignores the non-RFC-2231 parameter if at least one RFC-2231 parameter is present. It should apply cleanly to MIME-tools-5.427 diff --git a/MANIFEST b/MANIFEST index 0d79335..c93adb3 100644 --- a/MANIFEST +++ b/MANIFEST @@ -73,6 +73,7 @@ t/Ref.t t/Smtpsend.t t/ticket-11901.t t/ticket-37139.t +t/ticket-41632.t t/WordDecoder.t t/WordEncoder.t t/Words.t diff --git a/lib/MIME/Field/ParamVal.pm b/lib/MIME/Field/ParamVal.pm index 715c628..527d0db 100644 --- a/lib/MIME/Field/ParamVal.pm +++ b/lib/MIME/Field/ParamVal.pm @@ -268,13 +268,18 @@ sub parse_params { } $rfc2231params{$name}{$num} .= $val; } else { - # Make a fake "part zero" for non-RFC2231 params - $rfc2231params{$param}{"0"} = $val; + # Make a special part for non-RFC2231 params + $rfc2231params{$param}{"-1"} = $val; } } # Extract reconstructed parameters foreach $param (keys %rfc2231params) { + # If we have RFC2231 parts and a non-RFC2231 part, don't + # use the non-RFC2231 part. + if (exists($rfc2231params{$param}{"-1"}) && scalar(keys(%{$rfc2231params{$param}})) > 1) { + delete($rfc2231params{$param}{"-1"}); + } foreach $part (sort { $a <=> $b } keys %{$rfc2231params{$param}}) { $params{$param} .= $rfc2231params{$param}{$part}; } diff --git a/t/ticket-41632.t b/t/ticket-41632.t new file mode 100644 index 0000000..122f211 --- /dev/null +++ b/t/ticket-41632.t @@ -0,0 +1,22 @@ +use strict; +use warnings; + +use Test::More tests => 4; + +# Handle the case of both RFC-2231 and non-RFC-2231 parameter values. +# In this case, we ignore the non-RFC-2231 parameters + +use MIME::Field::ParamVal; + +my $params; + +$params = MIME::Field::ParamVal->parse_params('inline; filename="foo"; filename*1="ar"; filename*0="b"'); + +is($params->{'_'}, 'inline', 'Got the "inline" right'); +is($params->{'filename'}, 'bar', 'Ignored non-RFC-2331 value if RFC-2231 parameters are present'); + +$params = MIME::Field::ParamVal->parse_params('inline; filename="foo"'); +is($params->{'filename'}, 'foo', 'Got filename if RFC-2231 parameters are absent'); + +$params = MIME::Field::ParamVal->parse_params('inline; filename*1="ar"; filename*0="b"'); +is($params->{'filename'}, 'bar', 'Got RFC-2231 value');
Hi, Sorry... the patch was mangled by my browser. Please try the attached version. Regards, David.
diff --git a/MANIFEST b/MANIFEST index 0d79335..c93adb3 100644 --- a/MANIFEST +++ b/MANIFEST @@ -73,6 +73,7 @@ t/Ref.t t/Smtpsend.t t/ticket-11901.t t/ticket-37139.t +t/ticket-41632.t t/WordDecoder.t t/WordEncoder.t t/Words.t diff --git a/lib/MIME/Field/ParamVal.pm b/lib/MIME/Field/ParamVal.pm index 715c628..527d0db 100644 --- a/lib/MIME/Field/ParamVal.pm +++ b/lib/MIME/Field/ParamVal.pm @@ -268,13 +268,18 @@ sub parse_params { } $rfc2231params{$name}{$num} .= $val; } else { - # Make a fake "part zero" for non-RFC2231 params - $rfc2231params{$param}{"0"} = $val; + # Make a special part for non-RFC2231 params + $rfc2231params{$param}{"-1"} = $val; } } # Extract reconstructed parameters foreach $param (keys %rfc2231params) { + # If we have RFC2231 parts and a non-RFC2231 part, don't + # use the non-RFC2231 part. + if (exists($rfc2231params{$param}{"-1"}) && scalar(keys(%{$rfc2231params{$param}})) > 1) { + delete($rfc2231params{$param}{"-1"}); + } foreach $part (sort { $a <=> $b } keys %{$rfc2231params{$param}}) { $params{$param} .= $rfc2231params{$param}{$part}; } diff --git a/t/ticket-41632.t b/t/ticket-41632.t new file mode 100644 index 0000000..122f211 --- /dev/null +++ b/t/ticket-41632.t @@ -0,0 +1,22 @@ +use strict; +use warnings; + +use Test::More tests => 4; + +# Handle the case of both RFC-2231 and non-RFC-2231 parameter values. +# In this case, we ignore the non-RFC-2231 parameters + +use MIME::Field::ParamVal; + +my $params; + +$params = MIME::Field::ParamVal->parse_params('inline; filename="foo"; filename*1="ar"; filename*0="b"'); + +is($params->{'_'}, 'inline', 'Got the "inline" right'); +is($params->{'filename'}, 'bar', 'Ignored non-RFC-2331 value if RFC-2231 parameters are present'); + +$params = MIME::Field::ParamVal->parse_params('inline; filename="foo"'); +is($params->{'filename'}, 'foo', 'Got filename if RFC-2231 parameters are absent'); + +$params = MIME::Field::ParamVal->parse_params('inline; filename*1="ar"; filename*0="b"'); +is($params->{'filename'}, 'bar', 'Got RFC-2231 value');
From: bitcard [...] t.themuffin.net
On Fri Dec 12 14:51:44 2008, DSKOLL wrote: Show quoted text
> Hi,
Hello, Show quoted text
> Sorry... the patch was mangled by my browser. Please try the attached > version.
I tried your patch, and it applied cleanly and worked as expected. I had already fixed it locally in a different way, so I'm attaching a patch for that as well. It passes all your testcases, but it doesn't check to ensure the RFC2231 value has at least 1 character, which is probably a good idea. Show quoted text
> Regards, > > David.
Thank you! Ben
--- ParamVal.pm.orig 2008-12-10 16:35:51.488762758 -0500 +++ ParamVal.pm 2008-12-12 15:46:48.920758498 -0500 @@ -268,13 +268,14 @@ } $rfc2231params{$name}{$num} .= $val; } else { - # Make a fake "part zero" for non-RFC2231 params - $rfc2231params{$param}{"0"} = $val; + # Assign it directly if it's not a RFC 2231 value + $params{$param} = $val; } } # Extract reconstructed parameters foreach $param (keys %rfc2231params) { + $params{$param} = ""; foreach $part (sort { $a <=> $b } keys %{$rfc2231params{$param}}) { $params{$param} .= $rfc2231params{$param}{$part}; }
Hi, Ben, I like your patch better; it's more efficient. Attached is my final version (which is basically your patch plus some comments and additional test cases.) Regards, David.
diff --git a/MANIFEST b/MANIFEST index 0d79335..c93adb3 100644 --- a/MANIFEST +++ b/MANIFEST @@ -73,6 +73,7 @@ t/Ref.t t/Smtpsend.t t/ticket-11901.t t/ticket-37139.t +t/ticket-41632.t t/WordDecoder.t t/WordEncoder.t t/Words.t diff --git a/lib/MIME/Field/ParamVal.pm b/lib/MIME/Field/ParamVal.pm index 715c628..ca12825 100644 --- a/lib/MIME/Field/ParamVal.pm +++ b/lib/MIME/Field/ParamVal.pm @@ -268,13 +268,19 @@ sub parse_params { } $rfc2231params{$name}{$num} .= $val; } else { - # Make a fake "part zero" for non-RFC2231 params - $rfc2231params{$param}{"0"} = $val; + # Assign non-rfc2231 value directly. If we + # did get a mix of rfc2231 and non-rfc2231 values, + # the non-rfc2231 will be blown away in the + # "extract reconstructed parameters" loop. + $params{$param} = $val; } } # Extract reconstructed parameters foreach $param (keys %rfc2231params) { + # If we got any rfc-2231 parameters, then + # blow away any potential non-rfc-2231 parameter. + $params{$param} = ''; foreach $part (sort { $a <=> $b } keys %{$rfc2231params{$param}}) { $params{$param} .= $rfc2231params{$param}{$part}; } diff --git a/t/ticket-41632.t b/t/ticket-41632.t new file mode 100644 index 0000000..1ef6f04 --- /dev/null +++ b/t/ticket-41632.t @@ -0,0 +1,28 @@ +use strict; +use warnings; + +use Test::More tests => 6; + +# Handle the case of both RFC-2231 and non-RFC-2231 parameter values. +# In this case, we ignore the non-RFC-2231 parameters + +use MIME::Field::ParamVal; + +my $params; + +$params = MIME::Field::ParamVal->parse_params('inline; filename="foo"; filename*1="ar"; filename*0="b"'); + +is($params->{'_'}, 'inline', 'Got the "inline" right'); +is($params->{'filename'}, 'bar', 'Ignored non-RFC-2231 value if RFC-2231 parameters are present'); + +$params = MIME::Field::ParamVal->parse_params('inline; filename="foo"'); +is($params->{'filename'}, 'foo', 'Got filename if RFC-2231 parameters are absent'); + +$params = MIME::Field::ParamVal->parse_params('inline; filename*1="ar"; filename*0="b"'); +is($params->{'filename'}, 'bar', 'Got RFC-2231 value'); + +$params = MIME::Field::ParamVal->parse_params('inline; filename*1="ar"; filename="foo"; filename*0="b"'); +is($params->{'filename'}, 'bar', 'Ignored non-RFC-2231 value if RFC-2231 parameters are present'); + +$params = MIME::Field::ParamVal->parse_params('inline; filename*1="ar"; filename*0="b"; filename="foo"'); +is($params->{'filename'}, 'bar', 'Ignored non-RFC-2231 value if RFC-2231 parameters are present');
From: bitcard [...] t.themuffin.net
On Fri Dec 12 16:19:56 2008, DSKOLL wrote: Show quoted text
> Hi, Ben, > > I like your patch better; it's more efficient. Attached is my > final version (which is basically your patch plus some comments and > additional test cases.)
Excellent! This works for me. Show quoted text
> Regards, > > David.
Thanks again, Ben
Fixed in 5.428 release