Subject: | body is not decoded according to Content-Type= ... ; charset=... |
When the body method is called, it correctly decodes the content
according to the Content-Transfer-Encoding, which yields a binary octet
string.
In the case of Content-Type: text/plain; charset=UTF-8, however, this
method does not go on to decode the binary content into a proper Perl
character string, which would take a decode($charset, $bincontent).
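To make the two decoding steps concrete, here is a minimal sketch that uses only the core MIME::Base64 and Encode modules rather than Email::MIME itself. The sample value is illustrative (the first name from the From header, UTF-8 encoded and then base64 encoded); it is not taken from the message body.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);
use Encode qw(decode);

# Illustrative sample: "Jérôme" as UTF-8 octets, then base64 encoded.
my $b64 = 'SsOpcsO0bWU=';

# Step 1: undo the Content-Transfer-Encoding. The result is an octet string,
# which is the kind of value body() is described as returning.
my $octets = decode_base64($b64);

# Step 2: undo the charset encoding. The result is a Perl character string.
my $chars = decode('UTF-8', $octets);

printf "octets: %d, characters: %d\n", length($octets), length($chars);
# prints "octets: 8, characters: 6"
```

The two accented letters each take two octets in UTF-8, which is why the lengths differ. Printing `$octets` through a `:utf8` output layer encodes those octets a second time, and that is the mangled output the script is meant to show.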
I have attached a little script to illustrate this. You'll
need to have your terminal set to 'UTF-8' to see the effect.
Regards.
Jerome.
Subject: | parse1.pl |
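use strict;
use warnings;
use Email::MIME;
# Single-quoted heredoc tag: keeps Perl from interpolating the @... parts
# of the e-mail addresses below as arrays.
my $email = <<'ENDOFMAIL';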
Received: by 10.115.106.19 with HTTP; Thu, 24 Jan 2008 09:10:50 -0800 (PST)
Message-ID: <f516fcd80801240910j797b0e09h26cbafede672403f@mail.gmail.com>
Date: Thu, 24 Jan 2008 17:10:50 +0000
From: "=?UTF-8?Q?J=C3=A9r=C3=B4me_Et=C3=A9v=C3=A9?=" <jerome@eteve.net>
Sender: jerome.eteve@gmail.com
To: "Jerome Eteve" <jerome@careerjet.com>
Subject: =?UTF-8?B?2LbYsdix2Ykg2LfZiNmC2YjZhtmF?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: base64
Content-Disposition: inline
Delivered-To: jerome.eteve@gmail.com
X-Google-Sender-Auth: b14660e8948ad477

2LbYsdix2Ykg2LfZiNmC2YjZhtmFamlkamkgagoKZGppd29qZCB3b3dvd29vd293CgotLSAKSmVy
b21lIEV0ZXZlLgoKU3BlYWsgdG8gbWUgbGl2ZSBhdCBodHRwOi8vd3d3LmV0ZXZlLm5ldAoKamVy
b21lQGV0ZXZlLm5ldAo=
ENDOFMAIL
my $parsed = Email::MIME->new($email);
binmode STDOUT , ':utf8' ;
print 'From: '.$parsed->header('From')."\n" ;
print 'Subject:'.$parsed->header('Subject')."\n" ;
my @parts = $parsed->parts;
foreach my $part (@parts) {
    print "\n-------Part--------\n";
    print 'ContentType:' . $part->content_type() . "\n\n";
    # body() undoes the Content-Transfer-Encoding only, so this prints octets.
    print $part->body();
}
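A possible workaround until body handles this itself is to apply the charset decoding by hand. The sketch below assumes Email::MIME and Email::MIME::ContentType are installed. `decoded_body` is a hypothetical helper name, not part of Email::MIME, and returning the octets untouched for non-text parts or unknown charsets is a choice made for this example.

```perl
use strict;
use warnings;
use Email::MIME;
use Email::MIME::ContentType;    # exports parse_content_type
use Encode qw(decode find_encoding);

# Hypothetical helper: return the body of a text part as a Perl character
# string, using the charset declared in the part's Content-Type header.
sub decoded_body {
    my ($part) = @_;

    # body() has already undone the Content-Transfer-Encoding.
    my $octets = $part->body;

    # Only text/* parts with a declared, known charset are decoded.
    my $raw_ct = $part->content_type;
    return $octets unless defined $raw_ct && $raw_ct =~ m{^\s*text/}i;

    my $charset = parse_content_type($raw_ct)->{attributes}{charset};
    return $octets unless defined $charset && find_encoding($charset);

    return decode($charset, $octets);
}
```

In the loop of the script, `print decoded_body($part);` would then take the place of `print $part->body();`. Later Email::MIME releases also document a `body_str` method that performs both decoding steps; where it is available, it is probably preferable to a hand-rolled helper like this one.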