
This queue is for tickets about the Mail-Box CPAN distribution.

Report information
The Basics
Id: 90342
Status: resolved
Priority: 0/
Queue: Mail-Box

People
Owner: Nobody in particular
Requestors: claus [...] soonr.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Problem getting Content-Type and dispositionFilename.
Date: Thu, 14 Nov 2013 00:37:28 +0100
To: bug-Mail-Box [...] rt.cpan.org
From: Claus Jeppesen <claus [...] soonr.com>
Hi Mail::Box maintainers,

I have an email with an attachment where $SUBJECT happens. The relevant part of the email source is:

    Content-Type: multipart/mixed;
        boundary=Apple-Mail-F456E080-E1F2-4159-9A65-88BEAFDC5715
    Content-Transfer-Encoding: 7bit
    Subject: Very Long Name of a Document.doc
    From: claus@soonr.com
    Date: Wed, 13 Nov 2013 17:20:09 -0500
    To: inbox@soonr.com
    Mime-Version: 1.0 (1.0)
    X-Mailer: iPhone Mail (11B511)

    --Apple-Mail-F456E080-E1F2-4159-9A65-88BEAFDC5715
    Content-Type: text/plain; charset=us-ascii
    Content-Transfer-Encoding: 7bit

    --Apple-Mail-F456E080-E1F2-4159-9A65-88BEAFDC5715
    Content-Type: application/msword;
        name*0="Very Long Name of a Document.d";
        name*1=oc
    Content-Disposition: attachment;
        filename*0="Very Long Name of a Document.d";
        filename*1=oc
    Content-Transfer-Encoding: base64

    0M8R4KGxGuEAAAAAAAAAAAAAAAAAAAAAPgADAP7/CQAGAAAAAAAAAAAAAAAFAAAAOAIAAAAAAAAA
    ...
    ...
    ...

    --Apple-Mail-F456E080-E1F2-4159-9A65-88BEAFDC5715
    Content-Type: text/plain; charset=us-ascii
    Content-Transfer-Encoding: 7bit

    Sent from my iPhone

    --Apple-Mail-F456E080-E1F2-4159-9A65-88BEAFDC5715--

Upon recursing over the email, e.g.

    foreach my $part ($msg->parts('RECURSE')) {
        print $part->body->type(),"\n";
        print $part->body->dispositionFilename,"\n";
    }

I get:

    text/plain; charset=us-ascii
    Use of uninitialized value in print at ./test.perl line 210.

    application/msword; name*0="Very Long Name of a Document.d"; name*1=oc
    Use of uninitialized value in print at ./test.perl line 210.

    text/plain; charset=us-ascii
    Use of uninitialized value in print at ./test.perl line 210.

As far as I understand, the syntax in the header implies that filename*0 and filename*1 should be collapsed into filename="Very Long Name of a Document.doc" (the same goes for the Content-Type).

Thanx, Claus.

-- 
*Claus Jeppesen* | Director of Network Services
claus@soonr.com | www.soonr.com
c +45 6170 5901
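The collapsing described in the report can be sketched in a few lines of plain Perl. This is only an illustration of the RFC 2231 continuation rule, not how Mail::Box implements it: merge_continuations is a hypothetical helper, it does not unescape backslashes inside quoted strings, and it skips the charset-tagged form (filename*0*=...).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Mail::Box): merge RFC 2231 parameter
# continuations such as filename*0, filename*1 into a single value.
sub merge_continuations {
    my ($body) = @_;
    my (%plain, %parts);

    # Each parameter looks like ;name=value or ;name*N=value, where the
    # value is either a quoted string or a bare token.
    while ($body =~ /;\s*([^\s=;*]+)(?:\*(\d+))?\s*=\s*(?:"((?:[^"\\]|\\.)*)"|([^;\s]*))/g) {
        my ($name, $index) = (lc $1, $2);
        my $value = defined $3 ? $3 : $4;
        if (defined $index) { $parts{$name}[$index] = $value }
        else                { $plain{$name}         = $value }
    }

    # Concatenate the numbered sections in index order.
    for my $name (keys %parts) {
        $plain{$name} = join '', map { defined $_ ? $_ : '' } @{ $parts{$name} };
    }
    return \%plain;
}

my $attrs = merge_continuations(
    qq{attachment;\n\tfilename*0="Very Long Name of a Document.d";\n\tfilename*1=oc}
);
print $attrs->{filename}, "\n";
```

With the Content-Disposition body from the report as input, the two sections are joined into one filename attribute.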
Subject: Re: [rt.cpan.org #90342] Problem getting Content-Type and dispositionFilename.
Date: Thu, 14 Nov 2013 00:56:45 +0100
To: Claus Jeppesen via RT <bug-Mail-Box [...] rt.cpan.org>
From: Mark Overmeer <solutions [...] overmeer.net>
* Claus Jeppesen via RT (bug-Mail-Box@rt.cpan.org) [131113 23:37]:
> I have an email with an attachment where $SUBJECT happens:
> Relevant part of the email source is:
>
> --Apple-Mail-F456E080-E1F2-4159-9A65-88BEAFDC5715
> Content-Type: application/msword;
>     name*0="Very Long Name of a Document.d";
>     name*1=oc
> Content-Disposition: attachment;
>     filename*0="Very Long Name of a Document.d";
>     filename*1=oc

Those specially encoded fields are not supported when you use "$msg->get()" or "$header->get()", but it should work when you use the "study()" alternatives. Then you get the Mail::Message::Field::Full headers, which are slower but complete in their implementation.

Can you try that?
-- 
Regards,
MarkOv

------------------------------------------------------------------------
Mark Overmeer MSc                                MARKOV Solutions
Mark@Overmeer.net                          solutions@overmeer.net
http://Mark.Overmeer.net                   http://solutions.overmeer.net
Subject: Re: [rt.cpan.org #90342] Problem getting Content-Type and dispositionFilename.
Date: Thu, 14 Nov 2013 09:28:01 +0100
To: bug-Mail-Box [...] rt.cpan.org
From: Claus Jeppesen <claus [...] soonr.com>
Hi Mark,

Thanx for your response!

If I change the code to

    foreach my $part ($msg->parts('RECURSE')) {
        if (defined($part->study('Content-Disposition'))) {
            print "Content-Disposition := ",$part->study('Content-Disposition'),"\n";
            my $f = $part->study('Content-Disposition');
            print "Attribute filename := ",$f->attribute('filename'),"\n\n";
        }
    }

I get this:

    Content-Disposition := attachment; filename*0="Very Long Name of a Document.d"; filename*1=oc
    Attribute filename := [continuation missing]oc

Thanx, Claus.

P.S. I'm using this version:

    package Mail::Box::Manager;
    use vars '$VERSION';
    $VERSION = '2.107';

The headers in the email are using TAB (result from cat -tv):

    --Apple-Mail-F456E080-E1F2-4159-9A65-88BEAFDC5715
    Content-Type: application/msword;
    ^Iname*0="Very Long Name of a Document.d";
    ^Iname*1=oc
    Content-Disposition: attachment;
    ^Ifilename*0="Very Long Name of a Document.d";
    ^Ifilename*1=oc
    Content-Transfer-Encoding: base64
Subject: Re: [rt.cpan.org #90342] Problem getting Content-Type and dispositionFilename.
Date: Thu, 14 Nov 2013 09:56:40 +0100
To: Claus Jeppesen via RT <bug-Mail-Box [...] rt.cpan.org>
From: Mark Overmeer <solutions [...] overmeer.net>
* Claus Jeppesen via RT (bug-Mail-Box@rt.cpan.org) [131114 08:28]:
> if (defined($part->study('Content-Disposition'))) {
>     print "Content-Disposition := ",$part->study('Content-Disposition'),"\n";
>     my $f = $part->study('Content-Disposition');
>     print "Attribute filename := ",$f->attribute('filename'),"\n\n";
> }

study() is expensive, so call it once and keep the result:

    if(my $cd = $part->study('Content-Disposition'))
    {   print $cd->attribute('filename');
    }

But what should work is (see Mail::Message::Body::Encode)

    my $fn = $message->body->dispositionFilename;

These attribute continuations are implemented but so rarely used that it may very well show some new problems with the code when you use it.
-- 
Regards,
MarkOv
Subject: Re: [rt.cpan.org #90342] Problem getting Content-Type and dispositionFilename.
Date: Thu, 14 Nov 2013 10:15:35 +0100
To: bug-Mail-Box [...] rt.cpan.org
From: Claus Jeppesen <claus [...] soonr.com>
Hi Mark,

Unfortunately, if I do this in the loop over the parts (of $msg):

    if (defined($part->study('Content-Disposition'))) {
        print "Content-Disposition := ",$part->study('Content-Disposition'),"\n";
        my $f = $part->study('Content-Disposition');
        print "Attribute filename := ",$f->attribute('filename'),"\n";
    }
    if (defined($msg->body->dispositionFilename)) {
        my $fn = $msg->body->dispositionFilename;
        print "dispositionFilename := ",$fn,"\n\n";
    } else {
        print "dispositionFilename not defined\n\n";
    }

I get:

    dispositionFilename not defined

    Content-Disposition := attachment; filename*0="Very Long Name of a Document.d"; filename*1=oc
    Attribute filename := [continuation missing]oc
    *dispositionFilename not defined*

    dispositionFilename not defined

But for the middle part the filename SHOULD be defined.

Thanx, Claus.
Subject: Re: [rt.cpan.org #90342] Problem getting Content-Type and dispositionFilename.
Date: Thu, 14 Nov 2013 10:20:30 +0100
To: Claus Jeppesen via RT <bug-Mail-Box [...] rt.cpan.org>
From: Mark Overmeer <solutions [...] overmeer.net>
* Claus Jeppesen via RT (bug-Mail-Box@rt.cpan.org) [131114 09:16]:
> if (defined($msg->body->dispositionFilename)) {
>     my $fn = $msg->body->dispositionFilename;
>     print "dispositionFilename := ",$fn,"\n\n";
> } else {
>     print "dispositionFilename not defined\n\n";
> }
>
> I get:
>
> dispositionFilename not defined
>
> Content-Disposition := attachment; filename*0="Very Long Name of a Document.d"; filename*1=oc
> Attribute filename := [continuation missing]oc
> *dispositionFilename not defined*
>
> dispositionFilename not defined
>
> But in middle the filename SHOULD be defined.

Of course: in your case it is "$part->body->dispositionFilename".
My example was for single-part messages.
-- 
Regards,
MarkOv
Subject: Re: [rt.cpan.org #90342] Problem getting Content-Type and dispositionFilename.
Date: Thu, 5 Dec 2013 11:42:38 +0100
To: bug-Mail-Box [...] rt.cpan.org
From: Claus Jeppesen <claus [...] soonr.com>
One possible solution is for Mail::Box to use "MIME::EcoEncode::Param", which can decode those RFC2231 headers (see http://search.cpan.org/~murataya/MIME-EcoEncode-0.95/lib/MIME/EcoEncode/Param.pod ).

Somewhat like this:

    use Encode;
    use MIME::EcoEncode::Param;

    $str="name*0*=ISO-8859-15''R%FCckstellung%20DB%2C%20DZ%20u.%20KommSt%202001-2004;\n name*1*=.xls";
    ($decoded, $param, $charset, $lang, $value) = mime_deco_param($str);

    print "decoded :: $decoded\n";
    print "param :: $param\n";
    print "charset :: $charset\n";
    print "lang :: $lang\n";
    print "value :: $value\n\n";

    # Just because my linux is UTF8.
    if ($charset !~ /utf8/i) {
        print Encode::encode_utf8($decoded),"\n";
    }

Thanx, Claus.
Subject: Re: [rt.cpan.org #90342] Problem getting Content-Type and dispositionFilename.
Date: Thu, 5 Dec 2013 11:47:32 +0100
To: Claus Jeppesen via RT <bug-Mail-Box [...] rt.cpan.org>
From: Mark Overmeer <mark [...] overmeer.net>
* Claus Jeppesen via RT (bug-Mail-Box@rt.cpan.org) [131205 10:43]:
> One Possible solution is for Mail::Box to use "MIME::EcoEncode::Param" -
> which can decode those RFC2231 headers.

???

MailBox does fully support these headers, so why would I need to use another module? The only problem you had with my example was that it has to be called on the message part, not on the main message.

> > Of course: in your case it is "$part->body->dispositionFilename"
> > My example was for single-part messages.

(Although the parsing of these headers is correct, there may still be a problem with using that in dispositionFilename(), because I never encountered such an encoded filename in real life.)

MarkOv
Subject: Re: [rt.cpan.org #90342] Problem getting Content-Type and dispositionFilename.
Date: Thu, 5 Dec 2013 11:53:48 +0100
To: bug-Mail-Box [...] rt.cpan.org
From: Claus Jeppesen <claus [...] soonr.com>
OK - let me see if I understand your point of view.

The parsing of the email is done correctly according to RFC822 (and descendants) - however it's the responsibility of the consumer of the "header->study" output to reconstruct the "name" in a case where the headers look like:

    Content-Type: application/msword;
        name*0="Very Long Name of a Document.d";
        name*1=oc

Thanx, Claus.
Subject: Re: [rt.cpan.org #90342] Problem getting Content-Type and dispositionFilename.
Date: Thu, 5 Dec 2013 12:28:49 +0100
To: Claus Jeppesen via RT <bug-Mail-Box [...] rt.cpan.org>
From: Mark Overmeer <mark [...] overmeer.net>
* Claus Jeppesen via RT (bug-Mail-Box@rt.cpan.org) [131205 10:54]:
> The passing of the email is done correctly according to RFC822 (and
> descendants) - however it's the responsibility of the consumer of the
> "header->study" output to reconstruct the "name"

No, that is not my intention. It should work "out of the box".

It has been a long time since I looked at the code, and I am very busy at the moment...

Can you try this in lib/Mail/Message/Body/Encode.pm

 sub dispositionFilename(;$)
 {   my $self = shift;
     my $raw;

     my $field;
     if($field = $self->disposition)
-    {   $raw = $field->attribute('filename')
+    {   $field = $field->study;
+        $raw = $field->attribute('filename')
             || $field->attribute('file')
             || $field->attribute('name');
     }

     if(!defined $raw && ($field = $self->type))
-    {   $raw = $field->attribute('filename')
+    {   $field = $field->study;
+        $raw = $field->attribute('filename')
             || $field->attribute('file')

-- 
Regards,
MarkOv
Subject: Re: [rt.cpan.org #90342] Problem getting Content-Type and dispositionFilename.
Date: Thu, 5 Dec 2013 14:03:25 +0100
To: bug-Mail-Box [...] rt.cpan.org
From: Claus Jeppesen <claus [...] soonr.com>
Yep - that helped!

If I have an email with:

    Content-Disposition: inline;
        filename*0*=ISO-8859-15''R%FCckstellung%20DB%2C%20DZ%20u.%20KommSt%202001-;
        filename*1*=2004.xls

Then

    if (defined($part->study('Content-Disposition'))) {
        print "Content-Disposition := ",$part->study('Content-Disposition'),"\n";
        my $f = $part->study('Content-Disposition');
        print "Attribute filename := ",$f->attribute('filename'),"\n";
    }
    if (defined($part->body->dispositionFilename)) {
        my $fn = $part->body->dispositionFilename;
        print "dispositionFilename := ",$fn,"\n\n";
    }

Gives me the output:

    Content-Disposition := inline; filename="ISO-8859-15''R%FCckstellung%20DB%2C%20DZ%20u.%20KommSt%202001-2004.xls"
    Attribute filename := ISO-8859-15''R%FCckstellung%20DB%2C%20DZ%20u.%20KommSt%202001-2004.xls
    dispositionFilename := ISO-8859-15''R%FCckstellung%20DB%2C%20DZ%20u.%20KommSt%202001-2004.xls

Of course now we have to look at the dispositionFilename and get it massaged into a UTF8 string (or whatever target we need) :)

Thanx, Claus.
Subject: Re: [rt.cpan.org #90342] Problem getting Content-Type and dispositionFilename.
Date: Thu, 5 Dec 2013 15:54:20 +0100
To: Claus Jeppesen via RT <bug-Mail-Box [...] rt.cpan.org>
From: Mark Overmeer <solutions [...] overmeer.net>
* Claus Jeppesen via RT (bug-Mail-Box@rt.cpan.org) [131205 13:03]:
> Yep - that helped !

> Content-Disposition := inline;
> filename="ISO-8859-15''R%FCckstellung%20DB%2C%20DZ%20u.%20KommSt%202001-2004.xls"

The decoding of the %FC encoding should happen automatically as well. It is tested in tests/14fieldu/20attr.t

When I look at this:

    my $base;
    if(!defined $raw || !length $raw) {}
    elsif(index($raw, '?') >= 0)
    {   eval 'require Mail::Message::Field::Full';
        $base = Mail::Message::Field::Full->decode($raw);
    }
    else
    {   $base = $raw;
    }

Probably we should always call decode(). Can you try

-   elsif(index($raw, '?') >= 0)
-   {   eval 'require Mail::Message::Field::Full';
        $base = Mail::Message::Field::Full->decode($raw);
-   else
-   {   $base = $raw;
-   }
-- 
Regards,
MarkOv
Subject: Re: [rt.cpan.org #90342] Problem getting Content-Type and dispositionFilename.
Date: Fri, 6 Dec 2013 09:15:34 +0100
To: bug-Mail-Box [...] rt.cpan.org
From: Claus Jeppesen <claus [...] soonr.com>
Unfortunately it looks like Mail::Message::Field::Full->decode() cannot take care of those "rfc2231" style encoded strings. It looks to me like it can only handle rfc2047 style encoded strings, via the code snippet:

    shift(@encoded) =~ /\=\?([^?\s]*)\?([^?\s]*)\?([^?]*)\?\=/;

i.e.

    =?iso-8859-1?q?this=20is=20some=20text?=

works ... but not

    ISO-8859-15''R%FCckstellung%20DB%2C%20DZ%20u.%20KommSt%202001-2004.xls

Thanx, Claus.
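To make the difference between the two encodings concrete, here is a minimal sketch using only the core Encode module. It is not the Mail::Box code: decode_rfc2231_value is a hypothetical helper that assumes the continuation sections have already been joined into a single charset'language'octets string, while the RFC 2047 encoded-word is handled by Encode's existing 'MIME-Header' encoding.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode an RFC 2231 extended value of the form
#   charset'language'percent-encoded-octets
# into a Perl character string.
sub decode_rfc2231_value {
    my ($value) = @_;
    my ($charset, $lang, $octets) = $value =~ /\A([^']*)'([^']*)'(.*)\z/s
        or return $value;                 # no charset tag: leave untouched

    # Undo the %XX escapes first; the result is still a byte string.
    $octets =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;

    # Then interpret those bytes in the declared charset, if one was given.
    return length $charset ? decode($charset, $octets) : $octets;
}

# RFC 2231 style, as seen in the Content-Disposition attribute.
my $rfc2231 = decode_rfc2231_value(
    "ISO-8859-15''R%FCckstellung%20DB%2C%20DZ%20u.%20KommSt%202001-2004.xls"
);

# RFC 2047 style encoded-word, decoded by Encode itself.
my $rfc2047 = decode('MIME-Header', '=?iso-8859-1?q?this=20is=20some=20text?=');

binmode STDOUT, ':encoding(UTF-8)';
print "$rfc2231\n$rfc2047\n";
```

The order matters: percent-unescaping yields bytes, and only the charset decode turns them into characters.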
Subject: Re: [rt.cpan.org #90342] Problem getting Content-Type and dispositionFilename.
Date: Fri, 6 Dec 2013 10:27:42 +0100
To: Claus Jeppesen via RT <bug-Mail-Box [...] rt.cpan.org>
From: Mark Overmeer <mark [...] overmeer.net>
* Claus Jeppesen via RT (bug-Mail-Box@rt.cpan.org) [131206 08:15]:
> Unfortunately it looks like Mail::Message::Field::Full->decode() cannot
> take care of those "rfc2231" style encoded strings.

Ah... I start to remember...

> i.e. =?iso-8859-1?q?this=20is=20some=20text?= works ... but not
> ISO-8859-15''R%FCckstellung%20DB%2C%20DZ%20u.%20KommSt%202001-2004.xls

The content of attribute fields needs to be encoded with =???= and the other parts of the mime lines with ''.

But I do not remember if I ever understood why there are two different encodings.
-- 
Regards,
MarkOv
From: unknown.vagrant [...] gmail.com
Some clients give the filename in parts, with or without an index:

------------------------
Content-Type: application/pdf;
    name="=?KOI8-R?Q?22=F2=C5=DA=C5=D2=D7_=D0=C9=D3=D8=CD=CF_-=F4=FA=2Epdf?="
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
    filename*0*=koi8-r''111%F2%C5%DA%C5%D2%D7%20%D0%C9%D3%D8%CD%CF%20%2D%F4%FA%2E;
    filename*1*=%70%64%66
------------------------

I get a wrong filename as the result, because you don't have a mechanism for understanding this header. I wrote an implementation; you can modify it if you want:

    sub dispositionFilename(;$)
    {
        my $self = shift;
        my $raw;
        my $field;

        if ($field = $self->disposition())
        {
            use Data::Dumper;

            # Filename has more than one part - filename*
            # rfc6266 section-4.3
            # Warning!
            # Some clients give name for part of filename with index - filename*0*, filename*1* etc
            # See also http://stackoverflow.com/questions/93551/how-to-encode-the-filename-parameter-of-content-disposition-header-in-http
            if ($field =~ /filename\*/)
            {
                my @raws;
                my $charset;

                # can be filename*0* + filename*1* or filename* + filename*0*
                while($field =~ /filename\*\d{0,}\*?\=\s?([^\']*?'?'?[^?;]*)/g)
                {
                    push @raws, $1;
                }

                # parts
                foreach my $name (@raws)
                {
                    # decode url encoded data
                    $name =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/ge;

                    # filename part has charset
                    if (index($name, "''") > 0)
                    {
                        $name =~ /([^\']+)''(.+)/;
                        $charset = $1;
                        $raw .= Encode::decode($charset, $2);
                    }
                    else
                    {
                        $raw .= $name;
                    }
                }

                # For debug
                #die(Encode::encode('cp866', $filename));
            }
            else
            {
                $field = $field->study if $field->can('study');
                $raw = $field->attribute('filename');
                $raw ||= $field->attribute('file');
                $raw ||= $field->attribute('name');
            }
        }

        if(!defined $raw && ($field = $self->type))
        {
            $field = $field->study if $field->can('study');
            $raw = $field->attribute('filename')
                || $field->attribute('file')
                || $field->attribute('name');
        }

        .... remainder unchanged ...
    }
Subject: Re: [rt.cpan.org #90342] Problem getting Content-Type and dispositionFilename.
Date: Wed, 27 Aug 2014 22:54:38 +0200
To: "https://www.google.com/accounts/o8/id?id=AItOawm_jIKbpgLwnIMuSXsD6UZv_dqlnUmwaqg via RT" <bug-Mail-Box [...] rt.cpan.org>
From: Mark Overmeer <solutions [...] overmeer.net>
* https://www.google.com/accounts/o8/id?id=AItOawm_jIKbpgLwnIMuSXsD6UZv_dqlnUmwaqg via RT (bug-Mail-Box@rt.cpan.org) [140822 15:38]:
> Some clients give name for part of filename with index or without index
> i have wrong filename in result because you dont have mechanism for
> understand this header.

lib/Mail/Message/Field/Attribute.pm implements this, tested in tests/14fieldu/20attr.t

Why does my implementation not work in your case?
-- 
Regards,
MarkOv
Subject: Re: [rt.cpan.org #90342] Problem getting Content-Type and dispositionFilename.
Date: Wed, 25 Feb 2015 21:36:45 +0100
To: bug-Mail-Box [...] rt.cpan.org
From: Claus Jeppesen <claus [...] soonr.com>
Hi Mark,

I can see that Mail/Message/Body/Encode.pm (v2.117) is using:

    $field = $field->study if $field->can('study')

in "sub dispositionFilename".

However, I still see that when emails come in with headers like:

    Content-Disposition: inline;
        filename*0="Selling #1 (signed) - 11-13.p";
        filename*1=df

I get this back from dispositionFilename:

    [continuation missing]df

"continuation missing" seems to come from:

    sub decode() in Mail/Message/Field/Attribute.pm

Thanx, Claus.
Subject: Re: [rt.cpan.org #90342] Problem getting Content-Type and dispositionFilename.
Date: Thu, 26 Feb 2015 12:56:14 +0100
To: Claus Jeppesen via RT <bug-Mail-Box [...] rt.cpan.org>
From: Mark Overmeer <solutions [...] overmeer.net>
* Claus Jeppesen via RT (bug-Mail-Box@rt.cpan.org) [150225 20:37]:
> However - I still see that when emails come in with headers like:
> Content-Disposition: inline;
>     filename*0="Selling #1 (signed) - 11-13.p";
>     filename*1=df
>
> I get this back from dispositionFilename:
> - [continuation missing]df

Fixed. On the ::Field::Structured level, each field attribute was processed separately. So, these continuations were not merged into one ::Field::Attribute object.

Now it works. Just released 2.118 to CPAN.
-- 
thanks for the report,
MarkOv
Subject: Re: [rt.cpan.org #90342] Problem getting Content-Type and dispositionFilename.
Date: Thu, 26 Feb 2015 13:46:41 +0100
To: bug-Mail-Box [...] rt.cpan.org
From: Claus Jeppesen <claus [...] soonr.com>
Hi Mark,

Thanx. The collapse of the multiline filename is now working.

However, there is still the issue of decoding the strings (RFC 2231), e.g.

  Content-Disposition: inline;
    filename*0*="ISO-8859-15''R%FCckstellung%20DB%2C%20DZ%20u.%20KommSt%202001-";
    filename*1*="2004.xls"

which with "body->dispositionFilename" gives us:

  - R�ckstellung DB, DZ u. KommSt 2001-2004.xls

But the correct decode (German ü) is:

  - Rückstellung DB, DZ u. KommSt 2001-2004.xls

I think one can borrow inspiration from the implementation of
"sub mime_deco_param {}" in

  http://cpansearch.perl.org/src/MURATAYA/MIME-EcoEncode-0.95/lib/MIME/EcoEncode/Param.pm

which also seems to take care of RFC 2047 style encoding.

Thanx,
Claus.
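For readers following along, the two steps under discussion (merging the numbered continuations, then decoding the charset-tagged, percent-encoded value) can be sketched in plain Perl. This is only an illustration under stated assumptions, not Mail::Box's actual implementation: the helper name decode_rfc2231_param and its calling convention (a parameter name plus a hash of raw attribute names and values) are hypothetical, and it deliberately ignores the unnumbered "filename*=" form and the language tag.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper, not part of Mail::Box: merge RFC 2231 segments
# (name*0, name*1, ..., each optionally ending in '*' when the segment is
# percent-encoded) and return the value as a Perl character string.
sub decode_rfc2231_param {
    my ($name, %attr) = @_;    # %attr maps raw attribute name => raw value

    my (@segments, $charset);
    for my $key (keys %attr) {
        next unless $key =~ /^\Q$name\E\*(\d+)(\*?)$/;
        my ($index, $encoded) = ($1, $2);

        my $value = $attr{$key};
        $value =~ s/^"(.*)"$/$1/s;    # tolerate quoted segments

        # Only the first encoded segment carries charset'language'data.
        if ($encoded && $index == 0 && $value =~ s/^([^']*)'[^']*'//) {
            $charset = $1;
        }

        # Undo percent-encoding on encoded segments only.
        $value =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge if $encoded;

        $segments[$index] = $value;
    }

    # Join the segments in index order, then decode bytes to characters.
    my $bytes = join '', grep { defined } @segments;
    return defined $charset && length $charset
        ? decode($charset, $bytes)
        : $bytes;
}
```

Called with the two segments from the message above, the helper should return a character string whose second character is U+00FC. Whether that character then displays correctly depends on how the output handle is encoded.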
Subject: Re: [rt.cpan.org #90342] Problem getting Content-Type and dispositionFilename.
Date: Thu, 26 Feb 2015 14:03:36 +0100
To: Claus Jeppesen via RT <bug-Mail-Box [...] rt.cpan.org>
From: Mark Overmeer <mark [...] overmeer.net>
* Claus Jeppesen via RT (bug-Mail-Box@rt.cpan.org) [150226 12:47]:
> Queue: Mail-Box
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=90342 >
>
>     filename*0*="ISO-8859-15''R%FCckstellung%20DB%2C%20DZ%20u.%20KommSt%202001-";
>     filename*1*="2004.xls"
>
> Which with "body->dispositionFilename" gives us:
>   - R�ckstellung DB, DZ u. KommSt 2001-2004.xls

I have added this to the test-script

   use utf8;
   my $h2 = Mail::Message::Field::Full->new('Content-Disposition' =>
     q{inline; filename*0*="ISO-8859-15''R%FCckstellung%20DB%2C%20DZ%20u.%20KommSt%202001-"; filename*1*="2004.xls"});
   is($h2->attribute('filename'), 'Rückstellung DB, DZ u. KommSt 2001-2004.xls');

... and that produces a success. It may have something to do with the
charset of your stdout. Or am I wrong?

You can see that I already support decoding, so why refer to that other
module?
--
Regards,
MarkOv
Subject: Re: [rt.cpan.org #90342] Problem getting Content-Type and dispositionFilename.
Date: Thu, 26 Feb 2015 14:28:16 +0100
To: bug-Mail-Box [...] rt.cpan.org
From: Claus Jeppesen <claus [...] soonr.com>
Hi Mark,

Trying this:

  my $dispositionFilename = $part->body->dispositionFilename;
  print "dispositionFilename :: ",$dispositionFilename,"\n";
  print "encoding ...........:: ",encode("utf8",$dispositionFilename),"\n";

results in:

  dispositionFilename :: R�ckstellung DB, DZ u. KommSt 2001-2004.xls
  encoding ...........:: Rückstellung DB, DZ u. KommSt 2001-2004.xls

Using CentOS 6, Perl 5.10.1, and ENV has LC_CTYPE=en_US.UTF-8.

So there is something with utf8 (adding "use utf8;" changes nothing).

Thanx,
Claus.
Subject: Re: [rt.cpan.org #90342] Problem getting Content-Type and dispositionFilename.
Date: Thu, 26 Feb 2015 15:19:17 +0100
To: Claus Jeppesen via RT <bug-Mail-Box [...] rt.cpan.org>
From: Mark Overmeer <solutions [...] overmeer.net>
* Claus Jeppesen via RT (bug-Mail-Box@rt.cpan.org) [150226 13:28]:
> Queue: Mail-Box
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=90342 >
>
>   my $dispositionFilename = $part->body->dispositionFilename;
>   print "dispositionFilename :: ",$dispositionFilename,"\n";
>   print "encoding ...........:: ",encode("utf8",$dispositionFilename),"\n";

Your source encoding is ISO-8859-15. In recent versions of Encode, that
may lead to a string which does not have the utf8 flag on (latin1). When
you output latin1 to a utf8 terminal, you get an � on some characters.

When you explicitly encode it into utf8, and then output it to a
utf8 terminal, it will work. This may help:

   use open OUT => ':utf8';

> So there is something with utf8 (adding "use utf8;" changes nothing),

"use utf8" means that the perl source file is written in utf8, not
latin1.
--
Regards,
MarkOv
Subject: Re: [rt.cpan.org #90342] Problem getting Content-Type and dispositionFilename.
Date: Thu, 26 Feb 2015 15:40:10 +0100
To: bug-Mail-Box [...] rt.cpan.org
From: Claus Jeppesen <claus [...] soonr.com>
Hi Mark,

Thanx for the hint - it's definitely a Perl meta-data issue!

The md5sums of the 2 variables are identical, but they print differently :)

Claus.
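The output-encoding effect discussed in this exchange can be reproduced without a terminal. The sketch below is an illustration only: the helper bytes_written and the shortened file name are hypothetical, and in-memory filehandles stand in for STDOUT. It compares the bytes Perl emits for the same character string with no I/O layer, with an ":encoding(UTF-8)" layer, and after an explicit encode(). As far as the documentation of the "open" pragma describes it, that pragma sets default layers for handles opened within its lexical scope, so for an already-open STDOUT the ":std" subpragma or a binmode() call is the usual route.

```perl
use strict;
use warnings;
use Encode qw(encode);

# A decoded character string such as a header parser might return.
my $filename = "R\x{FC}ckstellung.xls";    # \x{FC} is U+00FC, "ü"

# Hypothetical helper: capture the bytes Perl writes for $text when the
# handle is opened with the given I/O layer ('' means no layer at all).
sub bytes_written {
    my ($text, $layer) = @_;
    my $buffer = '';
    open my $fh, ">$layer", \$buffer
        or die "cannot open in-memory handle: $!";
    print {$fh} $text;
    close $fh or die "cannot close in-memory handle: $!";
    return $buffer;
}

my $raw     = bytes_written($filename, '');                   # Latin-1 byte FC
my $layered = bytes_written($filename, ':encoding(UTF-8)');   # bytes C3 BC
my $manual  = bytes_written(encode('UTF-8', $filename), '');  # same as $layered

# For an already-open STDOUT the layer can be set with:
#   binmode(STDOUT, ':encoding(UTF-8)');
```

A UTF-8 terminal shows the lone FC byte in $raw as �, which matches the first print line reported above. $layered and $manual hold the two-byte sequence the terminal expects.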