Subject: encode('MIME-Header') does not find word boundaries correctly
Encoding a string which contains a colon (or, presumably, other header specials) can produce invalid output, because the resulting encoded-words are not bounded by whitespace. For example, this:
use Encode;
print encode('MIME-Header', "Hey foo\x{2024}bar:whee")."\n";
produces this:
=?UTF-8?B?SGV5IGZvb+KApGJhcg==?=:whee
which is invalid because there is no space between the encoded-word and the colon. RFC 2047 makes this fairly clear in section 5, where it describes the three places you can use an encoded-word; in each of the three, it says "an 'encoded-word' that appears in [that place] MUST be separated from any adjacent [stuff] by 'linear-white-space'".
For encoding a Subject: header or other "*text" field, I think there are only two valid places to have an encoded-word boundary: either between two successive encoded-words (in which case the separating whitespace is stripped by the decoder) or at a place where an encoded-word is separated from a non-encoded word by whitespace (in which case the whitespace is not stripped by the decoder).
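To make the rule concrete, here is a rough sketch of a check for it. The helper name encoded_words_are_bounded is hypothetical (it is not part of Encode), and the regex is only an approximation of the RFC 2047 encoded-word grammar. It assumes an unfolded "*text" value and does not check line lengths.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Encode: returns 1 if every encoded-word
# in an unstructured ("*text") header value is delimited by whitespace or by
# the start/end of the string, and 0 otherwise.
sub encoded_words_are_bounded {
    my ($value) = @_;

    # Approximate RFC 2047 encoded-word: =?charset?encoding?encoded-text?=
    my $encoded_word = qr/=\?[^?\s]+\?[BbQq]\?[^?\s]*\?=/;

    while ($value =~ /$encoded_word/g) {
        my ($start, $end) = ($-[0], $+[0]);

        # The character just before the encoded-word must be whitespace,
        # unless the encoded-word starts the string.
        return 0 if $start > 0
            && substr($value, $start - 1, 1) !~ /\s/;

        # The character just after the encoded-word must be whitespace,
        # unless the encoded-word ends the string.
        return 0 if $end < length($value)
            && substr($value, $end, 1) !~ /\s/;
    }
    return 1;
}
```

Called on the output shown above, this returns 0, because the colon follows the closing ?= directly. It returns 1 for the same encoded-word followed by a space and then "whee".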