Subject: | Input data should be utf8::downgraded |
===
use strict;
use warnings;
use Net::SSLeay;
use Data::Dumper;
use Digest::MD5 qw/md5_hex/;
use Digest::SHA qw/sha256_hex/;
use MIME::Base64;
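# Binary payload: byte 0x80 followed by ASCII text.
my $msg = "\x80Hello";
# With a true first argument, switch $msg to the upgraded (UTF-8 flagged)
# internal form. The upgrade is repeated before each call below so that
# every function sees the upgraded form.
utf8::upgrade($msg) if $ARGV[0];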
print md5_hex($msg), "\n";
utf8::upgrade($msg) if $ARGV[0];
print sha256_hex($msg), "\n";
utf8::upgrade($msg) if $ARGV[0];
print encode_base64($msg), "\n";
utf8::upgrade($msg) if $ARGV[0];
my ($page, $response, %reply_headers)
    = Net::SSLeay::post_https(
        'posttestserver.com',    # see http://www.posttestserver.com/
        443,
        '/post.php',
        Net::SSLeay::make_headers(
            'Content-Type' => 'text/xml; charset=utf-8',
        ),
        $msg
    );
print Dumper $page;
===
===
perl test.pl
dea59e28356a94aeca613cc57dee3f68
d26e7eab001349185c929fc5a009e2ae7a5177aef4984ab83d357d419e305360
gEhlbGxv
$VAR1 = 'Successfully dumped 0 post variables.
View it at http://www.posttestserver.com/data/2014/08/27/14.32.221601594009
Post body was 6 chars long.';
===
===
perl test.pl 1
dea59e28356a94aeca613cc57dee3f68
d26e7eab001349185c929fc5a009e2ae7a5177aef4984ab83d357d419e305360
gEhlbGxv
$VAR1 = 'Successfully dumped 0 post variables.
View it at http://www.posttestserver.com/data/2014/08/27/14.32.4362975749
Post body was 7 chars long.';
===
In this example, binary data that has been utf8::upgraded (UTF-8 flag on) behaves differently from the same data in downgraded form: the server reports a post body of 7 chars instead of 6.
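The one-byte difference is consistent with the upgraded string's internal buffer being written out as-is. Here is a minimal sketch, using only core Perl (the `bytes` pragma is loaded purely as a debugging aid), that shows the character length staying at 6 while the internal byte length grows to 7. The variable names are illustrative only.

```perl
use strict;
use warnings;
use bytes ();    # debugging aid only; provides bytes::length

my $down = "\x80Hello";    # 6 characters, UTF-8 flag off
my $up   = $down;
utf8::upgrade($up);        # same 6 characters, UTF-8 flag on

# Character length is what Perl-level code sees.
my $chars_down = length $down;
my $chars_up   = length $up;

# Internal byte length is what code reading the raw buffer sees.
my $bytes_down = bytes::length($down);
my $bytes_up   = bytes::length($up);

print "characters: $chars_down (downgraded), $chars_up (upgraded)\n";
print "internal bytes: $bytes_down (downgraded), $bytes_up (upgraded)\n";
```

In the upgraded form, character 0x80 is stored internally as the two bytes 0xC2 0x80, which would account for the extra byte in the second run.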
There was a similar bug report here: https://rt.cpan.org/Ticket/Display.html?id=81668
You closed that bug and documented the issue, but my point is that, according to the Perl documentation, downgraded and upgraded data should behave the same way, so this is something that should be fixed.
You can see in both example runs above that the MD5, SHA-256, and Base64 values of the scalar are always the same, whether it is upgraded or not.
Perl functions in general treat a string the same way regardless of its UTF-8 flag.
The upgraded and downgraded forms of a string are the same string in Perl (they compare equal with "eq").
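To illustrate the equality claim, here is a small sketch using only core Perl. It compares the two forms with "eq" and inspects the flag with utf8::is_utf8; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $down = "\x80Hello";    # UTF-8 flag off
my $up   = $down;
utf8::upgrade($up);        # UTF-8 flag on, same characters

# The flag differs between the two scalars ...
my $flag_down = utf8::is_utf8($down) ? 1 : 0;
my $flag_up   = utf8::is_utf8($up)   ? 1 : 0;

# ... but as Perl strings they are identical.
my $equal = ($down eq $up) ? 1 : 0;

print "flag (downgraded): $flag_down\n";
print "flag (upgraded):   $flag_up\n";
print "eq:                $equal\n";
```

Only the internal representation differs, and that is what utf8::is_utf8 reports.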
A good article about the UTF-8 flag: http://blogs.perl.org/users/aristotle/2011/08/utf8-flag.html
So I propose that the module utf8::downgrade any binary data passed to it (in post_https, ssl_write_all, write_partial and other functions).
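A rough sketch of what such a downgrade step might look like. The helper name `to_wire_bytes` is hypothetical and is not part of Net::SSLeay. It works on a copy so the caller's scalar is left untouched, and it dies if the data contains characters above 0xFF, which cannot be represented as single bytes.

```perl
use strict;
use warnings;

# Hypothetical helper (not a Net::SSLeay function): returns a downgraded
# copy of the payload, or dies if the payload is not byte data.
sub to_wire_bytes {
    my ($data) = @_;    # copy, so the caller's scalar is not modified
    # Second argument true (FAIL_OK): return false instead of dying.
    utf8::downgrade($data, 1)
        or die "Wide character in payload; encode it before sending\n";
    return $data;
}

my $msg = "\x80Hello";
utf8::upgrade($msg);                 # simulate an upgraded binary string
my $wire = to_wire_bytes($msg);

my $wire_flag = utf8::is_utf8($wire) ? 1 : 0;
my $msg_flag  = utf8::is_utf8($msg)  ? 1 : 0;
print "wire copy flag: $wire_flag, original flag: $msg_flag\n";

# A string with a character above 0xFF cannot be downgraded.
my $wide_ok = eval { to_wire_bytes("\x{263A}"); 1 } ? 1 : 0;
print "wide string accepted: $wide_ok\n";
```

Until something like this happens inside the module, a caller could apply the same step to the body before passing it to post_https.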