Bug #52600 for Mail-Box: Mail::Message::Field::Full->decode removes blanks between an encoded and an unencoded words

Wed Dec 09 04:59:05 2009 icestar [...] inbox.ru - Ticket created

Subject:	Mail::Message::Field::Full->decode removes blanks between an encoded and an unencoded words

Hello! As I understand from the documentation of the decode method of the Mail::Message::Field::Full module:

'Visible blanks have to be ignored between two encoded words in the text, but not when an encoded word follows or precedes an unencoded word. Phrases and comments are texts.'

the example below must keep the blank between the encoded part and the address, but it doesn't.

use Mail::Message::Field::Full;
my $encoded = '=?utf-8?B?0JLQtdGB0LXQu9Cw0Y8g0KDQsNCx0L7RgtCw?= <dima@adriver.ru>';
print Mail::Message::Field::Full->decode($encoded);

# Expected: 'Веселая Работа <dima@adriver.ru>'
# Got:      'Веселая Работа<dima@adriver.ru>'
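The blank-handling rule quoted from the documentation can be illustrated without Mail::Box at all. The following is only a minimal sketch under stated assumptions: `decode_words` is a hypothetical helper (not part of Mail::Box), it handles only base64 ("B") encoded words in UTF-8, and it relies solely on the core modules Encode and MIME::Base64.

```perl
use strict;
use warnings;
use Encode ();
use MIME::Base64 ();

# Hypothetical helper, not part of Mail::Box: decodes only base64 ("B")
# encoded words in UTF-8 and applies the blank-handling rule quoted above.
sub decode_words {
    my ($text) = @_;

    # The capturing group makes split return the encoded words too, so
    # odd-numbered elements are encoded words and even ones are plain text.
    my @parts = split /(=\?utf-8\?B\?[^?]*\?=)/i, $text;

    my $out = '';
    for my $i (0 .. $#parts) {
        if ($i % 2) {
            # An encoded word: extract the payload and decode it.
            my ($payload) = $parts[$i] =~ /\?B\?([^?]*)\?=$/i;
            $out .= Encode::decode('UTF-8',
                MIME::Base64::decode_base64($payload));
        }
        elsif ($parts[$i] =~ /\S/ || $i == 0 || $i == $#parts) {
            # Ordinary text next to an encoded word: keep it, blanks included.
            $out .= $parts[$i];
        }
        # Otherwise: nothing but blanks between two encoded words, drop them.
    }
    return $out;
}

binmode STDOUT, ':encoding(UTF-8)';
print decode_words(
    '=?utf-8?B?0JLQtdGB0LXQu9Cw0Y8g0KDQsNCx0L7RgtCw?= <dima@adriver.ru>'),
    "\n";
```

With the input from this report, the blank before `<dima@adriver.ru>` survives because that segment contains non-blank characters; only a blank-only segment sitting between two encoded words is discarded.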

Wed Dec 09 05:00:26 2009 icestar [...] inbox.ru - Broken in 2.092 added

Wed Dec 09 05:00:26 2009 icestar [...] inbox.ru - Fixed in 2.092 deleted

Wed Dec 09 05:07:18 2009 solutions [...] overmeer.net - Correspondence added

Subject:	Re: [rt.cpan.org #52600] Mail::Message::Field::Full->decode removes blanks between an encoded and an unencoded words
Date:	Wed, 9 Dec 2009 11:06:28 +0100
To:	Dmitry Bigunyak via RT <bug-Mail-Box [...] rt.cpan.org>
From:	Mark Overmeer <solutions [...] overmeer.net>

* Dmitry Bigunyak via RT (bug-Mail-Box@rt.cpan.org) [091209 09:59]:
> Wed Dec 09 04:59:05 2009: Request 52600 was acted upon.
> Transaction: Ticket created by Alien
> Queue: Mail-Box
> Subject: Mail::Message::Field::Full->decode removes blanks between an
> encoded and an unencoded words
>
> use Mail::Message::Field::Full;
> my $encoded = '=?utf-8?B?0JLQtdGB0LXQu9Cw0Y8g0KDQsNCx0L7RgtCw?=
> <dima@adriver.ru>';
> print Mail::Message::Field::Full->decode($encoded);
>
> # Expected: 'Веселая Работа <dima@adriver.ru>'
> # Got 'Веселая Работа<dima@adriver.ru>'

Your presumption is correct, is my first response. After a few horrible long working days, I will find some time today to address this issue and your bug-report of last week.

One remark: you have to call "decode" only on the strings (so on the comment and on the phrase) separately. You should not pass a whole "address" to it, as you do here. You should use parse() for that, which will decode the components for you.

-- 
Regards,
MarkOv

------------------------------------------------------------------------
Mark Overmeer MSc                        MARKOV Solutions
drs Mark A.C.J. Overmeer                 MARKOV Solutions
Mark@Overmeer.net                        solutions@overmeer.net
http://Mark.Overmeer.net                 http://solutions.overmeer.net
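The idea behind this remark (split the address into its components first, then decode only the textual part) can be sketched with core modules. This is an illustration under loud assumptions, not the Mail::Box parse() method: `split_and_decode` is a hypothetical helper, its regular expression only understands the plain `phrase <address>` form (no comments, groups or quoted phrases), and the phrase is decoded with the `MIME-Header` encoding that ships with the core Encode module.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper, not the Mail::Box parse() method: a deliberately
# naive split of "phrase <address>" that ignores comments, groups and
# quoted phrases.
sub split_and_decode {
    my ($field) = @_;
    my ($phrase, $address) = $field =~ /^\s*(.*?)\s*<([^<>]*)>\s*$/
        or return;

    # Only the phrase is text that may contain encoded words; the
    # address itself is returned untouched.
    return (Encode::decode('MIME-Header', $phrase), $address);
}

my ($phrase, $address) = split_and_decode(
    '=?utf-8?B?0JLQtdGB0LXQu9Cw0Y8g0KDQsNCx0L7RgtCw?= <dima@adriver.ru>');

binmode STDOUT, ':encoding(UTF-8)';
print "phrase:  $phrase\naddress: $address\n";
```

Because the components are separated before decoding, the blank between phrase and address never reaches the decoder, so the caller decides how to join the two parts again.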

Wed Dec 09 05:07:19 2009 The RT System itself - Status changed from 'new' to 'open'

Wed Dec 09 05:36:53 2009 icestar [...] inbox.ru - Correspondence added

> Your presumption is correct, is my first response. After a few horrible
> long working days, I will find some time today to address this issue and
> your bug-report of last week.
>
> One remark: you have to call "decode" only on the strings (so on the
> comment and on the phrase) separately. You should not pass a whole
> "address" to it, as you do here. You should use parse() for that,
> which will decode the components for you.

This is only an example; in my real project code I don't call "decode" directly. I get a message from an IMAP folder and try to get the "From" field like this:

my $from = $msg->study('from');
print "From: '$from'\n";

Debug output from the "decode" method of the Mail::Message::Field::Full module shows me that it is called for the string (=?utf-8?B?0JLQtdGB0LXQu9Cw0Y8g0KDQsNCx0L7RgtCw?= <dima@adriver.ru>). So I don't need to parse the address string; I only want to get it decoded.

Sun Dec 13 15:45:30 2009 solutions [...] overmeer.net - Correspondence added

CC:	undisclosed-recipients: ;
Subject:	Re: [rt.cpan.org #52600] Mail::Message::Field::Full->decode removes blanks between an encoded and an unencoded words
Date:	Sun, 13 Dec 2009 21:45:08 +0100
To:	Dmitry Bigunyak via RT <bug-Mail-Box [...] rt.cpan.org>
From:	Mark Overmeer <solutions [...] overmeer.net>

* Dmitry Bigunyak via RT (bug-Mail-Box@rt.cpan.org) [091209 09:59]:
> Wed Dec 09 04:59:05 2009: Request 52600 was acted upon.
> Subject: Mail::Message::Field::Full->decode removes blanks between an
> encoded and an unencoded words
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=52600 >
>
> use Mail::Message::Field::Full;
> my $encoded = '=?utf-8?B?0JLQtdGB0LXQu9Cw0Y8g0KDQsNCx0L7RgtCw?=
> <dima@adriver.ru>';
> print Mail::Message::Field::Full->decode($encoded);

I had a trick in this decoding implementation which has bitten me more than once. Now, I have replaced it with a more straightforward (and probably no slower) algorithm.

The changed part follows.

Regards,
MarkOv

------ 8< ---- Mail::Message::Field::Full

sub _decoder($$$)
{   my ($charset, $encoding, $encoded) = @_;
    $charset   =~ s/\*[^*]+$//;   # language component not used
    my $to_utf8 = Encode::find_encoding($charset || 'us-ascii');
    $to_utf8 or return $encoded;

    my $decoded;
    if($encoding !~ /\S/)
    {   $decoded = $encoded;
    }
    elsif(lc($encoding) eq 'q')
    {   # Quoted-printable encoded
        $encoded =~ s/_/ /g;   # specific to mime-fields
        $decoded = MIME::QuotedPrint::decode_qp($encoded);
    }
    elsif(lc($encoding) eq 'b')
    {   # Base64 encoded
        require MIME::Base64;
        $decoded = MIME::Base64::decode_base64($encoded);
    }
    else
    {   # unknown encodings ignored
        return $encoded;
    }

    $to_utf8->decode($decoded, Encode::FB_DEFAULT);  # error-chars -> '?'
}

sub decode($@)
{   my $self    = shift;
    my @encoded = split /(\=\?[^?]*\?[bqBQ]?\?[^?]*\?\=)/, shift;
    my %args    = @_;

    my $is_text = defined $args{is_text} ? $args{is_text} : 1;
    my @decoded = shift @encoded;

    while(@encoded)
    {   shift(@encoded) =~ /\=\?([^?\s]*)\?([^?\s]*)\?([^?\s]*)\?\=/;
        push @decoded, _decoder $1, $2, $3;

        @encoded or last;

        # in text, blanks between encoding must be removed, but otherwise kept
        if($is_text && $encoded[0] !~ m/\S/) { shift @encoded }
        else { push @decoded, shift @encoded }
    }

    join '', @decoded;
}

Mon Dec 14 05:11:20 2009 icestar [...] inbox.ru - Correspondence added

> I had a trick in this decoding implementation which has bitten me more
> than once. Now, I have replaced it by a more straight-forward (and
> probably even not slower) algorithm.
>
> The changed part follows.
>
> Regards,
> MarkOv

Will these changes appear in the next release?

Mon Dec 14 05:13:17 2009 solutions [...] overmeer.net - Correspondence added

Subject:	Re: [rt.cpan.org #52600] Mail::Message::Field::Full->decode removes blanks between an encoded and an unencoded words
Date:	Mon, 14 Dec 2009 11:12:49 +0100
To:	Dmitry Bigunyak via RT <bug-Mail-Box [...] rt.cpan.org>
From:	Mark Overmeer <solutions [...] overmeer.net>

* Dmitry Bigunyak via RT (bug-Mail-Box@rt.cpan.org) [091214 10:11]:
> Queue: Mail-Box
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=52600 >
>
> Will this changes appear in next release?

Of course. And if you wish, I will make a release for you. I prefer to collect a few changes into one release.

-- 
MarkOv

------------------------------------------------------------------------
Mark Overmeer MSc                        MARKOV Solutions
Mark@Overmeer.net                        solutions@overmeer.net
http://Mark.Overmeer.net                 http://solutions.overmeer.net

Mon Dec 14 06:02:32 2009 icestar [...] inbox.ru - Correspondence added

On Mon Dec 14 05:13:17 2009, solutions@overmeer.net wrote:
> * Dmitry Bigunyak via RT (bug-Mail-Box@rt.cpan.org) [091214 10:11]:
> > Queue: Mail-Box
> > Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=52600 >
> >
> > Will this changes appear in next release?
>
> Of course. And if you wish, I will make a release for you. I prefer to
> collect a few changes into one release.

Thanks a lot, but this isn't a critical modification and I can wait for the next release easily.

Thu Dec 24 04:29:46 2009 MARKOV [...] cpan.org - Correspondence added

Fixed in 2.093, which will be released soon.

Thu Dec 24 04:29:47 2009 MARKOV [...] cpan.org - Status changed from 'open' to 'resolved'