
This queue is for tickets about the Mail-IMAPClient CPAN distribution.

Report information
The Basics
Id: 124172
Status: open
Priority: 0/
Queue: Mail-IMAPClient

People
Owner: Nobody in particular
Requestors: MARKOV [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: 3.39
Fixed in: (no value)



Subject: Mailbox names are UTF7
Hi Phil,

(you may remember me as the previous maintainer of the module)

I am so glad you maintain Mail::IMAPClient so well ;-b  I hardly ever need IMAP; it is hard to fix things when you do not use it yourself. However... at the moment I have to rewrite code which is based on "our" powerful module, and I have an issue.

In the code which I have to rework, I read this:

    use Encode::IMAPUTF7;
    $mailbox = encode('IMAP-UTF-7', decode_entities $mailbox) if $mailbox;
    $archiveMailbox = encode('IMAP-UTF-7', decode_entities $archiveMailbox) if $archiveMailbox;

Reading http://www.fetchmail.info/Mailbox-Names-UTF7.html I think this code is correct: the Perl internal string must be converted explicitly to UTF7. I do not see this in the code.

This is *not* possible:

    encode('IMAP-UTF-7', encode('IMAP-UTF-7', $anything))

So: it is not too straightforward to solve. However, reading Encode::IMAPUTF7, it seems that "our" module will already break when a mailbox name matching /\&.*\-/ is used. This is a clear indicator of the name being IMAP-UTF-7 encoded. So, what about

     sub select {
         my ( $self, $target ) = @_;
         defined $target or return undef;
    +    my $mailbox = $target =~ /\&.*\-/ ? $target : encode('IMAP-UTF-7', $target);
    -    my $qqtarget = $self->Quote($target);
    +    my $qqtarget = $self->Quote($mailbox);

But there are more spots, like "list", which need to do the reverse.

Another (non-tricky, less DWIMmy) solution would be to add an option, for instance "UnicodeNames".

What do you think?
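For illustration only, the "encode unless it already looks encoded" idea above can be sketched with core modules. This is a minimal sketch, not the module's code: `imap_utf7_encode`, `_b64_run` and `maybe_encode_mailbox` are hypothetical helper names, and the encoder is a hand-rolled version of the modified UTF-7 rules in RFC 3501 section 5.1.3, written this way only so the example does not assume Encode::IMAPUTF7 is installed.

```perl
use strict;
use warnings;
use Encode qw(encode);
use MIME::Base64 qw(encode_base64);

# Hypothetical helper: turn one run of non-ASCII characters into "&<base64>-".
sub _b64_run {
    my ($chars) = @_;
    my $b64 = encode_base64(encode('UTF-16BE', $chars), '');
    $b64 =~ s/=+\z//;     # modified base64 carries no "=" padding
    $b64 =~ tr{/}{,};     # and uses "," where standard base64 uses "/"
    return "&$b64-";
}

# Hypothetical helper: Perl character string -> IMAP modified UTF-7.
sub imap_utf7_encode {
    my ($name) = @_;
    $name =~ s/&/&-/g;                            # a literal "&" is written "&-"
    $name =~ s/([^\x20-\x7e]+)/_b64_run($1)/ge;   # everything else non-printable
    return $name;
}

# The heuristic from the proposal: a name containing "&...-" is assumed to
# be encoded already and is passed through untouched.
sub maybe_encode_mailbox {
    my ($target) = @_;
    return $target =~ /&.*-/ ? $target : imap_utf7_encode($target);
}

print maybe_encode_mailbox("R\x{e9}pertoire"), "\n";   # R&AOk-pertoire
print maybe_encode_mailbox('R&AOk-pertoire'), "\n";    # R&AOk-pertoire (passed through)
print imap_utf7_encode('R&AOk-pertoire'), "\n";        # R&-AOk-pertoire (double encoding)
```

The last line shows why blind double encoding does not work: the shift character "&" of an already encoded name is itself escaped. Note that an unencoded name such as "R&D-team" also matches the pattern and would be passed through as-is, which is the case the ticket describes as already breaking.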
Subject: Per RFC3501 Mailbox names are 7-bit
(fixing the title... per spec 5.1 they are 7-bit, but as always the devil is in the details) Not to say that we couldn't do something, but for better or for worse, I'd say that historically we have left making a best guess as to "correct behavior" when dealing with a mailbox as an exercise for the developer - along with quite a few other details... Currently any UTF7 handling is not handled within Mail::IMAPClient at all. To only introduce it in one unique case (like select) would seem to be very backwards incompatible and horribly incomplete (looking at the big picture). While we could introduce a flag (but the behavior would probably be disabled by default for backwards compatibility), I'm not convinced it makes sense to do that just yet. Have a strong argument? I wonder how many servers are strict on this vs. not. It doesn't seem like we've had many asking for us to do this, but maybe others are watching and want to chime in.
On Mon Jan 22 17:29:41 2018, PLOBBES wrote:
> (fixing the title... per spec 5.1 they are 7-bit, but as always the
> devil is in the details)
They are 7-bit (physical). In application space, that is filled in with MIME-UTF7 encoded strings: section 5.1.3. So we are both right. I see that RFC 5738 describes full UTF-8 support for IMAP4... which would be nice to have.
> Not to say that we couldn't do something, but for better or for worse,
> I'd say that historically we have left making a best guess as to
> "correct behavior" when dealing with a mailbox as an exercise for the
> developer - along with quite a few other details...
This module has an interesting history. When I rewrote it, it was clear that at least 4 other people had worked on it. I would not have designed the interface like that, but it does work.
> Currently any UTF7 handling is not handled within Mail::IMAPClient at
> all. To only introduce it in one unique case (like select) would seem
> to be very backwards incompatible and horribly incomplete (looking at
> the big picture).
Well... it is the RFC. When I design interfaces for my own modules, I go to great lengths to hide charset issues from the users. I consider interfacing to the outside world as my problem; the strings on the Perl side should be "perl internal" (cp1252 or utf8-non-strict).
> While we could introduce a flag (but the behavior would probably be
> disabled by default for backwards compatibility), I'm not convinced it
> makes sense to do that just yet.
Conforming IMAP4 implementations would currently die on an accidental /&.*-/, which is used by the encoding. Therefore, we can detect whether some MIME-UTF7 has already mutilated the name... when it is already MIME-UTF7 and not mutilated, it does not matter that we do it again.

So: I think we can do without a new flag and still be backwards compatible. But I am not against a flag. Support for full UTF8 would be nice as well, but not needed when UTF7 is in place.
> Have a strong argument? I wonder how many servers are strict on this
> vs. not. It doesn't seem like we've had many asking for us to do this,
> but maybe others are watching and want to chime in.
I expect that there are some examples on the internet which demonstrate the work-arounds needed to encode the folder names and decode the headers correctly... the original programmer of the application code which I have to work with was not able to figure this out himself. Mail::IMAPTalk is hiding the encodings and decodings.

(are we in different time-zones? Mine is CET)
Hi Mark,
> (are we in different time-zones? Mine is CET)
Yes. I'm in EDT/EST. As far as this request goes, do you happen to have a patch or pull request that you'd like to propose along with patches? I'm not opposed to changing something here, but I'd love to see you propose what you feel best gets the job done for you and use that to further the discussion. Does that sound reasonable? If this is something you'd really like to see, and we move relatively quickly, we can see about getting this in the 3.40 release.
Subject: Re: [rt.cpan.org #124172] Mailbox names are UTF7
Date: Thu, 27 Sep 2018 09:12:55 +0200
To: "Phil Pearl (Lobbes) via RT" <bug-Mail-IMAPClient [...] rt.cpan.org>
From: Mark Overmeer <solutions [...] overmeer.net>
* Phil Pearl (Lobbes) via RT (bug-Mail-IMAPClient@rt.cpan.org) [180927 01:37]:
> Queue: Mail-IMAPClient
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=124172 >
>
> As far as this request goes, do you happen to have a patch or pull
> request that you'd like to propose along with patches?
Nothing ready, but it should not be too hard (although I lack possibilities to test it).

Reading the docs again (especially http://www.fetchmail.info/Mailbox-Names-UTF7.html), I think we should offer an option 'FilenameEncoding' which can be 'MIME-UTF-7', '8BIT', or 'UTF-8'.

The correct default would be 'MIME-UTF7', but the backwards compatible one is '8BIT' (perl's charset for single-byte data is Windows-1252).

There are only a few methods which need to be aware of this:
  - select
  - _imap_folder_command
    (covers create, subscribe, unsubscribe, delete, myrights)
  - status
  - folders
  - copy
  - subscribed

Do you see more? I could produce a patch this week.
-- 
Regards,
MarkOv

------------------------------------------------------------------------
Mark Overmeer MSc                         MARKOV Solutions
Mark@Overmeer.net                         solutions@overmeer.net
http://Mark.Overmeer.net                  http://solutions.overmeer.net
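As a rough sketch of how the three option values proposed above could be wired up (an assumption for discussion, not code from Mail::IMAPClient): `%MAILBOX_ENCODER` and `encode_mailbox_name` are hypothetical names, and the 'MIME-UTF-7' branch assumes the CPAN module Encode::IMAPUTF7 is installed and registers the 'IMAP-UTF-7' encoding, as the snippet earlier in this ticket suggests.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical dispatch table keyed by the option values from the proposal.
my %MAILBOX_ENCODER = (
    # Backwards compatible: hand the name to the server untouched.
    '8BIT' => sub { return $_[0] },

    # Plain UTF-8 octets.
    'UTF-8' => sub {
        my ($name) = @_;
        return encode('UTF-8', $name);
    },

    # Modified UTF-7, delegated to Encode::IMAPUTF7 (loaded only on demand).
    'MIME-UTF-7' => sub {
        my ($name) = @_;
        require Encode::IMAPUTF7;
        return encode('IMAP-UTF-7', $name);
    },
);

# Hypothetical entry point that methods such as select, status or copy
# could call before quoting the mailbox name.
sub encode_mailbox_name {
    my ($encoding, $name) = @_;
    my $encoder = $MAILBOX_ENCODER{$encoding}
        or die "Unknown mailbox name encoding '$encoding'\n";
    return $encoder->($name);
}

print encode_mailbox_name('8BIT', 'INBOX.Sent'), "\n";   # INBOX.Sent
```

A single lookup point like this would keep the listed methods unaware of which encoding is active; the reverse direction (decoding names returned by folders or subscribed) would need a matching table.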
On Thu Sep 27 03:13:19 2018, solutions@overmeer.net wrote: [snip]
> I think what we should offer an option 'FilenameEncoding' which can be
> 'MIME-UTF-7', '8BIT', or 'UTF-8'.
I think I'd prefer something more like MailboxNameEncoding or MboxNameEncoding.
> The correct default would be 'MIME-UTF7', but the backwards compatible
> is '8BIT' (perl's charset for single-byte data is Windows-1252)
>
> There are only a few methods which need to be aware of this. The
> methods which need to be aware of this:
> - select
> - _imap_folder_command
>   (covers create, subscribe, unsubscribe, delete, myrights)
> - status
> - folders
> - copy
> - subscribed
>
> Do you see more?
With a quick glance, that seems right.
> I could produce a patch this week.
Great. I look forward to seeing what you come up with!