
This queue is for tickets about the Mail-Box CPAN distribution.

Report information
The Basics
Id: 52278
Status: resolved
Priority: 0/
Queue: Mail-Box

People
Owner: Nobody in particular
Requestors: icestar [...] inbox.ru
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 2.092
Fixed in: (no value)



Subject: Misunderstandings with utf-8 encoded header fields and body
Hello, again!

This is not a bug report, only a question. I'm trying to send a message with utf-8 header fields and body. Below are some steps of my coding.

To set the 'To' header field in my Perl script I write:

    my $to_name = 'Тестовый Ящик';
    Encode::_utf8_on($to_name);
    my $to_address = Mail::Message::Field::Address->new(
        address  => 'test@mailbox.ru',
        phrase   => $to_name,
        charset  => 'utf-8',
        encoding => 'B',
    );

Then I create a header from such fields. The second step is to create the body of the message. Code:

    my $text = "Сейчас я расскажу вам одну очень старую сказку.";
    Encode::_utf8_on($text);
    my $body = Mail::Message::Body->new(
        data      => $text,
        mime_type => 'text/plain',
    );

To create the message I write something like this:

    my $message = Mail::Message->build(
        $body,
        head => $head,
        file => "./test.xml",
        file => "./test.zip",
    );

And now my questions:

1) Do I have to write Encode::_utf8_on($to_name) and Encode::_utf8_on($text) or not? From the documentation I understand that it should work correctly with the PERL encoding.
2) You don't set mime_type to the default value 'text/plain' when creating a body with the 'data' option, as the documentation says, and maybe this is a bug.
Please send questions to the XML-Compile mailing-list (preferred) or IRC channel. RT is for bug-reports and feature requests.

On Mon Nov 30 09:56:58 2009, Alien wrote:
> [full text of the original message]
On Tue Dec 01 05:57:13 2009, MARKOV wrote:
> Please send questions to the XML-Compile mailing-list (preferred) or IRC
> channel. RT is for bug-reports and feature requests.
Maybe my second question is a bug, but ok. Why the XML-Compile mailing-list, or did you make a mistake? On your site (http://perl.overmeer.net/mailbox/) I found an address (mailbox-subscribe@perl.overmeer.net) to join the Mail-Box mailing-list; is it correct?
Subject: Re: [rt.cpan.org #52278] Misunderstandings with utf-8 encoded header fields and body
Date: Tue, 1 Dec 2009 13:18:10 +0100
To: Dmitry Bigunyak via RT <bug-Mail-Box [...] rt.cpan.org>
From: Mark Overmeer <solutions [...] overmeer.net>
* Dmitry Bigunyak via RT (bug-Mail-Box@rt.cpan.org) [091201 11:31]:
> Queue: Mail-Box
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=52278 >
>
> On Tue Dec 01 05:57:13 2009, MARKOV wrote:
> > Please send questions to the XML-Compile mailing-list (preferred) or IRC
> > channel. RT is for bug-reports and feature requests.
>
> Maybe my second question is a bug, but ok.
> Why XML-Compile mailing-list or you've make a mistake? I've found on
> your site (http://perl.overmeer.net/mailbox/) address
> (mailbox-subscribe@perl.overmeer.net) to join the Mail-Box mailing-list,
> is it correct?
Oh, I got confused... handling multiple RT queues at the same time. Too many things in my head. I'll take a look at it asap, which is not today. Sorry.
--
MarkOv

------------------------------------------------------------------------
Mark Overmeer MSc                          MARKOV Solutions
Mark@Overmeer.net                          solutions@overmeer.net
http://Mark.Overmeer.net                   http://solutions.overmeer.net
I had not uploaded the documentation of my modules to my website for nearly a year. That's why the reference to the mailing list was still there. It had already been removed from the manual pages, because the amount of unfiltered spam was much higher than the useful traffic.
> 1) Do I have to write Encode::_utf8_on($to_name) and Encode::_utf8_on
> ($text) or not? From documentation I understand that it should work
> correctly with PERL encoding.
That's not the right way. This way, you force Perl to treat the string as being a utf-8 encoded string. However, it either already is a utf-8 string or it is a latin-2(?) encoded string. (Can't see that from the bug-report)

When your file is utf-8, then your Perl program should start with

    use utf8;

and everything is ok. Or you should say

    use Encode;
    my $x = decode 'latin2', $bytes;

Internally, Perl uses "latin1" and "UTF8" (not real utf-8). If you want to do things right, you must encode/decode text on all entrance and exit points of your program. Either with "encode/decode" or like this:

    open IN, '<:encoding(latin2)', $fn or die;
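The decode-on-entry, encode-on-exit rule described above could be sketched roughly as follows. This is only an illustration using the core Encode module; the sample bytes and the variable names are assumptions, not something taken from the reporter's program, and it assumes the incoming bytes are UTF-8.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical input: bytes as they arrive from outside the program.
# These four bytes are the UTF-8 encoding of the two Cyrillic letters "Тв".
my $bytes = "\xD0\xA2\xD0\xB2";

# Entry point: convert bytes into Perl characters.
my $chars = decode('UTF-8', $bytes);

# Inside the program, length() now counts characters, not bytes.
my $char_count = length $chars;
my $byte_count = length $bytes;

# Exit point: convert characters back into bytes before writing them out.
my $out = encode('UTF-8', $chars);

print "characters: $char_count, bytes: $byte_count\n";
print "round trip ok\n" if $out eq $bytes;
```

No call to Encode::_utf8_on is needed in this pattern, because decode() produces a proper character string by itself.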
> 2) You didn't set mime_type to default value 'text/plain' when
> creating body with 'data' option as says in documentation
> and maybe this is a bug.
I do not read that from the docs. I see "data are lines", but that also holds for "text/html" and hundreds of other text/* types.
> > 1) Do I have to write Encode::_utf8_on($to_name) and Encode::_utf8_on
> > ($text) or not? From documentation I understand that it should work
> > correctly with PERL encoding.
>
> That's not the right was. This way, you force Perl to treat the string
> as being an utf-8 encoded string. However, it is either already is an
> utf-8 string or it is a latin-2(?) encoded string. (Can't see that from
> the bug-report)
>
> When your file is utf-8, then your Perl program should start with
>     use utf8;
> and everything is ok. Or you should say
>     use Encode;
>     my $x = decode 'latin2', $bytes;
>
> Internally, Perl uses "latin1" and "UTF8" (not real utf-8). If you
> want to do things right, you must encode/decode text on all entrance and
> exit points of your program. Either with "encode/decode" or like this:
>     open IN, '<:encoding(latin2)', $fn or die;
I'm writing code in a utf-8 console and don't use the utf8 pragma because I don't need UTF-8 source code, only a scalar value. So the scalar

    my $to_name = 'Тестовый Ящик';

is stored by perl in its internal UTF8 encoding. And when I write Encode::_utf8_on($to_name) I tell perl that this is a real, well-formed utf-8 string. To create the 'To' header field I write

    my $to_address = Mail::Message::Field::Address->new(
        address  => 'test@mailbox.ru',
        phrase   => $to_name,
        charset  => 'utf-8',
        encoding => 'B',
    );

I set charset to 'utf-8' to tell that the phrase is in utf-8 encoding. And here everything is right.

To create a utf-8 message body I've found another way:

    my $text = "тестовая строка"; # scalar in internal UTF8 perl encoding
    my $body = Mail::Message::Body->new(
        data      => $text,
        mime_type => 'text/plain',
        charset   => 'utf-8', # don't understand why not 'PERL'
    );
    $body = $body->encoded;

Why should I set the charset attribute to 'utf-8' instead of 'PERL'? The documentation says this option 'Defines the character-set which is used in the data', and in reality that is 'PERL'. Can you show me the right way of creating a utf-8 encoded body, please?
> > 2) You didn't set mime_type to default value 'text/plain' when
> > creating body with 'data' option as says in documentation
> > and maybe this is a bug.
>
> I do not read than from the docs. I see "data are lines", but that also
> holds for "text/html" and hundreds of other text/* types.
The documentation of the Mail::Message::Body module says that the mime_type option has the value 'text/plain' by default, but code that does not set it explicitly doesn't work:

    my $body = Mail::Message::Body->new(
        data    => $text,
        charset => 'utf-8',
    );

In your code it is set to the default value only when creating a message body from a file.
Subject: Re: [rt.cpan.org #52278] Misunderstandings with utf-8 encoded header fields and body
Date: Sun, 13 Dec 2009 22:55:01 +0100
To: Dmitry Bigunyak via RT <bug-Mail-Box [...] rt.cpan.org>
From: Mark Overmeer <mark [...] overmeer.net>
* Dmitry Bigunyak via RT (bug-Mail-Box@rt.cpan.org) [091203 09:16]:
> Queue: Mail-Box
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=52278 >
>
> I set charset to 'utf-8' to tell that phrase in utf-8 encoding. And
> here everything is right.
> To create utf-8 message body I've found another way
>
> my $text = "тестовая строка"; # scalar in internal UTF8 perl encoding
> my $body = Mail::Message::Body->new(
>     data => $text,
>     mime_type => 'text/plain',
>     charset => 'utf-8', # don't understand why not 'PERL'
> );
> $body = $body->encoded;
    my $text = "тестовая строка"; # scalar in internal UTF8 perl encoding
    Encode::_utf8_on($text);
    my $body = Mail::Message::Body->new(
        data      => $text,
        mime_type => 'text/plain',
        charset   => 'PERL'
    );
    my $msg = Mail::Message->buildFromBody($body);
    print $msg->string;

The "PERL" type disappears when the body is added to a message... A "message" conforms to the RFCs, whereas the "body" is internal to the program and can be in a non-RFC state.
> Why I should set charset attribute to 'utf-8' instead of 'PERL'?
>
> In documentation to Mail::Message::Body module says that mime_type
> option has 'text/plain' value by default, but code without setting it
> explicitly doesn't work.
>
> my $body = Mail::Message::Body->new(
>     data => $text,
>     charset => 'utf-8',
> );
Wow, impressive crash on Perl 5.10.0:

    Assertion ((svtype)((_svi)->sv_flags & 0xff)) >= SVt_PV failed: file "Encode.xs", line 248 at ../../lib/Mail/Message/Body/Encode.pm line 205.

Perl 5.11 doesn't like your _utf8_on either, it seems.
> In your code it sets to default value only when creating message body
> from file.
Yes... I think you are right. Change the docs or the implementation?
--
MarkOv
> Perl 5.11 doesn't like your _utf8_on either, it seems.
Yes, it isn't good practice to use private functions, sorry. I replaced all calls to _utf8_on with decode_utf8 to get the same result.
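For illustration, the replacement mentioned above might look something like the sketch below. It uses only the core Encode module; the sample bytes and variable names are assumptions, and the second half shows strict decoding with FB_CROAK, which is an extra not discussed in the ticket.

```perl
use strict;
use warnings;
use Encode qw(decode decode_utf8);

# Hypothetical input: the UTF-8 bytes of the Cyrillic word "тест".
my $bytes = "\xD1\x82\xD0\xB5\xD1\x81\xD1\x82";

# Public API: decode the bytes into a character string,
# instead of flipping the internal flag with Encode::_utf8_on.
my $text = decode_utf8($bytes);
my $len  = length $text;    # number of characters, not bytes

# Strict decoding: with FB_CROAK, malformed input raises an exception
# instead of being silently accepted.
my $bad = "\xFF\xFE";       # not valid UTF-8
my $ok  = eval { decode('UTF-8', $bad, Encode::FB_CROAK); 1 };

print "decoded length: $len\n";
print "malformed input rejected\n" unless $ok;
```

Unlike _utf8_on, which only sets a flag without checking the data, the decode functions actually inspect the bytes they are given.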
> > In your code it sets to default value only when creating message body
> > from file.
>
> Yes... I think you are right. Change the docs or the implementation?
Changing the implementation to make it a real default value is preferred, I think.
text/plain is now the default, as promised. Original bug report closed: not a MailBox problem.