This queue is for tickets about the libnet CPAN distribution.

Report information
The Basics
Id: 24835
Status: resolved
Priority: 0/
Queue: libnet

People
Owner: Nobody in particular
Requestors: SAPER [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 1.20
Fixed in: (no value)



Subject: Net::Cmd 2.27 (libnet 1.20) incorrectly upgrades everything to UTF-8
Hello Graham,

I'm afraid libnet 1.20 has introduced quite an important bug for people using characters outside ASCII: unconditionally calling utf8::encode() on any data passed to datasend() has the side effect of converting everything to UTF-8, even when it is not expected to be. As a result, all the accented characters appear as the usual Unicode junk: "é" becomes "Ã©", "ê" becomes "Ãª" and the like.

Therefore, perfectly valid programs that were sending correct mails will send rubbish as soon as libnet is upgraded to version 1.20.

As suggested in ticket #18589, there may be a need to pass an additional parameter to libnet modules in order to indicate the encoding (although I'm not sure if one can assume that the encoding stays the same through an entire mail or if it can change from one part to another).

In the meantime, I'd suggest removing the call to utf8::encode() from Net::Cmd:

--- lib/Net/Cmd.pm.orig 2006-10-27 13:08:07.000000000 +0200
+++ lib/Net/Cmd.pm 2007-02-07 19:27:54.328532000 +0100
@@ -21,8 +21,6 @@
 }
 }

-my $doUTF8 = eval { require utf8 };
-
 $VERSION = "2.27";
 @ISA = qw(Exporter);
 @EXPORT = qw(CMD_INFO CMD_OK CMD_MORE CMD_REJECT CMD_ERROR CMD_PENDING);
@@ -395,8 +393,6 @@
 my $arr = @_ == 1 && ref($_[0]) ? $_[0] : \@_;
 my $line = join("" ,@$arr);

-utf8::encode($line) if $doUTF8;
-
 return 0 unless defined(fileno($cmd));

 my $last_ch = ${*$cmd}{'net_cmd_last_ch'};

Best Regards,

-- 
Close the world, txEn eht nepO.
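The mangling described in this report can be reproduced without a mail server. The following is a minimal sketch, assuming the message body is held as ISO-8859-1 octets; the helper name hex_bytes and the sample string are illustrative only and do not come from libnet.

```perl
use strict;
use warnings;

# Illustrative helper: render a string as space-separated hex bytes,
# so the result does not depend on the terminal's encoding.
sub hex_bytes {
    my ($str) = @_;
    return join " ", map { sprintf "%02X", ord($_) } split //, $str;
}

# "café" with "é" stored as the single ISO-8859-1 byte 0xE9.
my $latin1 = "caf\xE9";

# Apply the same unconditional call that Net::Cmd 2.27 made on each line.
my $sent = $latin1;
utf8::encode($sent);

print "before: ", hex_bytes($latin1), "\n";   # before: 63 61 66 E9
print "after:  ", hex_bytes($sent),   "\n";   # after:  63 61 66 C3 A9
```

The byte 0xE9 turns into the two bytes 0xC3 0xA9, which a reader expecting ISO-8859-1 displays as "Ã©".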
From: ben [...] cpanel.net
Additionally, utf8::encode is not available with Perl 5.6.2, though the utf8 require will succeed.

[root@localhost root]# perl -Mutf8 -le 'print $]; utf8::encode("hello");'
5.006002
Undefined subroutine utf8::encode called at -e line 1
From: RGARCIA [...] cpan.org
On Wed Feb 07 13:30:48 2007, SAPER wrote:
> As suggested in ticket#18589, there may be a need to pass additional
> parameter to libnet modules in order to indicate the encoding
> (although I'm not sure if one can assume that encoding stays the
> same through an entire mail or if it can change from one part to
> another).

In this case you could probably encode it upstream.

> In the mean time, I'd suggest to remove the call to utf8::encode()
> from Net::Cmd:

I've applied this patch to bleadperl as change #30576.
From: oskari.ojala [...] frantic.com
Hello,

Confirming the bug; it is affecting us. I also e-mailed Graham about this.

On Wed Feb 07 13:30:48 2007, SAPER wrote:
> As suggested in ticket#18589, there may be a need to pass additional
> parameter to libnet modules in order to indicate the encoding
> (although I'm not sure if one can assume that encoding stays the
> same through an entire mail or if it can change from one part to
> another).

This is not a good idea: a multi-part MIME message can have each part in a different character set. People shouldn't be passing strings to Net::Cmd anyway; they should be passing octets (variables with the internal utf8 flag off). And as there is the internal perl flag for utf8ness, I don't see any reason to pass the encoding as a parameter. Net::Cmd could just die if the flag is on, if you want to be strict. If you don't, then the line

    utf8::encode($line) if $doUTF8;

should/could be replaced with:

    if ($doUTF8) {
        # encode to individual utf8 bytes if
        # $line is a string (in internal UTF-8)
        utf8::encode($line) if utf8::is_utf8($line);
    }

to fix the bug with latin-1 and to do what people probably expect if they feed UTF-8 lines to it.

For reference:
http://perldoc.perl.org/utf8.html
http://www.perlmonks.org/?displaytype=print;node_id=551676
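To make the effect of the guarded call concrete, here is a small sketch of how it treats the two kinds of input. The helper names encode_if_flagged and hex_bytes and the sample strings are hypothetical; only utf8::encode and utf8::is_utf8 are real functions.

```perl
use strict;
use warnings;

# Illustrative helper: render a string as space-separated hex bytes.
sub hex_bytes {
    my ($str) = @_;
    return join " ", map { sprintf "%02X", ord($_) } split //, $str;
}

# Hypothetical wrapper around the guarded call suggested above:
# encode only when the internal UTF-8 flag is on. Works on a copy.
sub encode_if_flagged {
    my ($line) = @_;
    utf8::encode($line) if utf8::is_utf8($line);
    return $line;
}

# Latin-1 octets: no character above 0xFF, so the flag is off.
my $octets = "caf\xE9";

# Character string: the wide character U+263A forces the flag on.
my $chars = "caf\x{E9} \x{263A}";

print hex_bytes(encode_if_flagged($octets)), "\n";  # 63 61 66 E9
print hex_bytes(encode_if_flagged($chars)),  "\n";  # 63 61 66 C3 A9 20 E2 98 BA
```

Octets pass through untouched, while flagged strings are written as UTF-8. Note that the flag describes Perl's internal storage rather than the encoding the caller intends, so a flagged string holding only Latin-1 characters would still go out as UTF-8 under this scheme.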
From: SAPER [...] cpan.org
Hello,

Attached is a script that illustrates the bug. Use it like this:

    $ perl cpan-rt-24835.pl <smtp> <address>

where <smtp> is the name of your SMTP server and <address> your email address.

When line 25 (the one that pushes the "Data from XML" text) is commented out, the characters stay as they are and are correctly transmitted as ISO-Latin-1. Uncomment it and the characters (which, if I understand correctly, have already been internally upgraded to utf8 because of the string coming from the XML document) are now sent as raw bytes with libnet versions 1.20 and 1.21, while they were correctly sent (probably after a smart/magic downgrade) as Latin-1 with previous versions.

A solution is to use Encode::encode() to manually downgrade the string coming from XML to Latin-1. But the fact remains that perfectly valid code which was working until libnet-1.20 came out must now do additional work to send correct data. I consider this a bug, or at the very least an incompatible change that should be documented.

Best Regards

-- 
Close the world, txEn eht nepO.
#!/usr/bin/perl
use strict;
use Net::SMTP;
use XML::LibXML;

my $server = shift;
my $from   = shift;
my $rcpt   = $from;
my @text   = ();

push @text, "Using ", join ", ", map {"$_ ".$_->VERSION} qw(Net::Cmd Net::SMTP);
push @text, $/;

push @text, q{
Data from Perl string:
aacute(á) agrave(à) acirc(â) auml(ä) eacute(é) egrave(è) ecirc(ê) euml(ë)
ccedil(ç) ntilda(ñ) eth(ð) thorn(þ) aelig(æ) szlig(ß) oslash(ø)
pound(£) yen(¥) laquo(«) raquo(») iexcl(¡) iquest(¿)
}, $/;

my $xml = XML::LibXML->new->parse_string(
    q{<?xml version="1.0"?><root><node>Lorem Ipsum</node></root>}
);
# comment/uncomment the following line to see the bug
push @text, "Data from XML:\n", $xml->findvalue("/root/node");

my $message = Net::SMTP->new($server) or die "can't connect to $server\n";
$message->mail($from)      or die "can't set 'from' address\n";
$message->to($rcpt)        or die "can't set recipient\n";
$message->data()           or die "can't initiate data send\n";
$message->datasend(\@text) or die "can't send data\n";
$message->dataend()        or die "can't end data send\n";
$message->quit()           or die "can't finalise message\n";
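The Encode::encode() workaround mentioned in this message could look roughly like the sketch below. It assumes the mail is meant to be ISO-8859-1; the variable names and the sample string are illustrative, and utf8::upgrade() merely stands in for text coming back from an XML parser with the UTF-8 flag on.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Stand-in for parser output: a character string with the UTF-8 flag on.
my $from_xml = "r\x{E9}sum\x{E9}";
utf8::upgrade($from_xml);

# Convert explicitly to ISO-8859-1 octets before handing the data on.
my $octets = encode("ISO-8859-1", $from_xml);

printf "flag on input:  %s\n", utf8::is_utf8($from_xml) ? "on" : "off";  # on
printf "flag on output: %s\n", utf8::is_utf8($octets)   ? "on" : "off";  # off
printf "bytes: %s\n", join " ", map { sprintf "%02X", ord($_) } split //, $octets;
# bytes: 72 E9 73 75 6D E9
```

The resulting octets can then be pushed onto the list passed to datasend() in place of the raw parser output, so that no flagged string takes part in the join.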
Subject: Re: [rt.cpan.org #24835] Net::Cmd 2.27 (libnet 1.20) incorrectly upgrades everything to UTF-8
Date: Wed, 30 May 2007 13:26:27 -0500 (CDT)
To: bug-libnet [...] rt.cpan.org
From: "Graham Barr" <gbarr [...] pobox.com>
Fixed in libnet 1.21

Graham.
From: SAPER [...] cpan.org
Hello Graham,

I just tested libnet 1.22 and this version seems to work correctly (i.e. it does not mangle ISO-8859-1 characters). I'm installing it on a production server in order to check that the mails are correctly created.

-- 
Close the world, txEn eht nepO.