
This queue is for tickets about the libwww-perl CPAN distribution.

Report information
The Basics
Id: 42396
Status: resolved
Priority: 0/
Queue: libwww-perl

People
Owner: Nobody in particular
Requestors: m-uchino [...] yetipapa.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Posted binary-data is broken
Date: Wed, 14 Jan 2009 17:45:46 +0900
To: <bug-libwww-perl [...] rt.cpan.org>
From: "uchino" <m-uchino [...] yetipapa.com>
(Sorry, my English is poor.)

Binary data posted by LWP::UserAgent over SSL is corrupted.
This problem occurs when SSL is used and the UTF-8 flag of an added header is set to ON.

[EXAMPLE]
--------------------------------------------------------------------------
#!/usr/local/bin/perl

use strict;
use utf8;     # ******** (1)

use LWP::UserAgent;
use HTTP::Request::Common 'POST';

my $ua = LWP::UserAgent->new();

my $http_res = $ua->request(POST 'https://myhost/post.cgi',     # ******** (2)
    Content_Type => 'form-data',
    Content => [
        bin_data => ['./image.gif'],
    ],
    Head1 => 'A',     # ******** (3)
);
$http_res->is_success or die $http_res->message;

print "OK\n";

exit;
--------------------------------------------------------------------------

[post.cgi CHECK SCRIPT]
--------------------------------------------------------------------------
#!/usr/local/bin/perl

use strict;

my $buffer;

binmode STDIN, ':raw';
read(STDIN, $buffer, $ENV{CONTENT_LENGTH});

open my $fh, '>:raw', './request_body.dat' or die($!);
print $fh $buffer;
close $fh;

print "Content-type: text/plain\n\nOK";

exit;
--------------------------------------------------------------------------

The header name is 'Head1' (3).
The UTF-8 flag of this string is set to ON by 'use utf8' (1).
The request is posted to an SSL site (2).

Check the file (request_body.dat) with a binary editor.
The data and the boundary are corrupted.

'Head1' consists only of ASCII characters, but it is not quoted, so its UTF-8 flag is set to ON.
If it is quoted, this problem does not occur.

    'Head1' => 'A',     # ******** (3)

Is this a problem in Crypt::SSLeay?

Perl v5.8.8
libwww v5.823
Crypt::SSLeay v0.57
Thanks for your report. Have you verified that this problem goes away if you don't use 'https' (and Crypt-SSLeay)?

LWP should probably be more careful if UTF8 encoded strings make it into the header values. If you set the content of a request to a non-downgradable UTF8 string it will croak, but it does not guard the headers.

The last mystery here is why perl marks non-quoted strings as UTF8. The following sample code demonstrates:

    use utf8;
    use Devel::Peek; Dump([foo => 'bar']);     # 'foo' becomes a UTF8 string
    use Devel::Peek; Dump(['foo' => 'bar']);
From: m-uchino [...] yetipapa.com
Thank you for your reply.

> Have you verified that this problem goes away if you don't use 'https'
> (and Crypt-SSLeay)?

Yes, I tried posting to a non-SSL site (http), and this problem did not occur.

I also renamed the file 'Net/SSL.pm' to 'Net/xxxSSL.pm' and ran the following script.
-----------------------------------
#!/usr/local/bin/perl

use strict;
use utf8;

use LWP::UserAgent;
use HTTP::Request::Common 'POST';

my $ua = LWP::UserAgent->new();

my $http_res = $ua->request(POST 'https://myhost/post.cgi',
    Content_Type => 'form-data',
    Content => [
        bin_data => ['./image.gif'],
    ],
    Head1 => 'A',
);
$http_res->is_success or die $http_res->message;

print $Net::HTTPS::SSL_SOCKET_CLASS . "\n";     # ********* for check SSL module
print "OK\n";

exit;
-----------------------------------
I verified that it printed 'IO::Socket::SSL', and the result was the same.

I understand that I should not include UTF-8 in a request.
One of the cases where the problem is hard to find is as follows.
-----------------------------------
#!/usr/local/bin/perl

use strict;
use utf8;

use HTML::Form;

my $alpha = "\x{3b1}";     # ************ UTF-8
my $html = <<"EOM";
<form method="post" action="./post.cgi" enctype="multipart/form-data">
<input type="text" name="field_1" value="" />
........ $alpha .......
    :
    :
</form>
EOM

my ($form) = HTML::Form->parse($html, 'https://myhost/') or die 'parse';
my $request = $form->make_request;

print 'UTF-8 Flag: ' . (utf8::is_utf8($request->header('Content_Type'))
    ? 'ON' : 'OFF') . "\n";

exit;
-----------------------------------
I spent several days before I found that the cause of the problem was UTF-8...
Is the best way to avoid this problem to downgrade all request headers just before a post?
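
On the question of downgrading request headers before a post, one possible workaround can be sketched with core Perl only. The sketch assumes the headers are available as a flat list of names and values. `downgrade_header_list` is a hypothetical helper, not part of LWP, and wiring it into an HTTP::Request object is left out.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LWP): takes a flat list of header
# names and values and returns byte-string copies of them.
sub downgrade_header_list {
    my @fields = @_;    # copy, so the caller's data is left untouched
    for my $field (@fields) {
        next unless defined $field;
        # Second argument (FAIL_OK) true: return false instead of dying
        # when the string contains a character above 0xFF.
        utf8::downgrade($field, 1)
            or die "Header field contains a wide character\n";
    }
    return @fields;
}

my @headers = ('Head1', 'A');
utf8::upgrade($_) for @headers;    # simulate the UTF-8 flag being ON

my @clean = downgrade_header_list(@headers);
print utf8::is_utf8($_) ? "ON\n" : "OFF\n" for @clean;
```

The helper dies on a wide character rather than sending something else silently, which mirrors the "croak if it cannot be downgraded" behaviour described for request content earlier in this ticket.
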
On Wed Jan 14 13:01:18 2009, m-uchino@yetipapa.com wrote:
> Thank you for your reply.
>
> > Have you verified that this problem goes away if you don't use 'https'
> > (and Crypt-SSLeay)?
>
> Yes, I tried post to non-SSL-site(http), then this problem didn't occur.

The data is sent by calling the syswrite() method on the IO::Socket object. Normally this would downgrade the strings to bytes, but apparently this does not happen in the syswrite() implementation of Crypt-SSLeay. I think it's a good idea to make LWP force this before it calls syswrite. The attached patch should address this.

> I understand that I should not include UTF-8 in a request.
> The one of the problems that are hard to be found is as follows.
> -----------------------------------
> #!/usr/local/bin/perl
>
> use strict;
> use utf8;
>
> use HTML::Form;
>
> my $alpha = "\x{3b1}";     # ************ UTF-8
> my $html = <<"EOM";
> <form method="post" action="./post.cgi" enctype="multipart/form-data">
> <input type="text" name="field_1" value="" />
> ........ $alpha .......
>     :
>     :
> </form>
> EOM
>
> my ($form) = HTML::Form->parse($html, 'https://myhost/') or die 'parse';
> my $request = $form->make_request;
>
> print 'UTF-8 Flag: ' . (utf8::is_utf8($request->header('Content_Type'))
>     ? 'ON' : 'OFF') . "\n";
>
> exit;
> -----------------------------------
> I spent several days till I find that the cause of the problem is UTF-8...
> Is the best method to avoid this problem to downgrade all
> request-headers just before a post?

Again, I think that the patch will address the issue in this situation, but we still have issues if the form fields themselves contain wide UTF-8.

From 787516a62fc34caec5950b34f3925950844f34d9 Mon Sep 17 00:00:00 2001
From: Gisle Aas <gisle@aas.no>
Date: Wed, 14 Jan 2009 22:09:45 +0100
Subject: [PATCH] Make format_request() ensure that it returns bytes [RT#42396]

The method now croaks if passed characters that can't be downgraded to bytes.
---
 lib/Net/HTTP/Methods.pm |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/lib/Net/HTTP/Methods.pm b/lib/Net/HTTP/Methods.pm
index 9704c6c..bb5fad7 100644
--- a/lib/Net/HTTP/Methods.pm
+++ b/lib/Net/HTTP/Methods.pm
@@ -173,7 +173,13 @@ sub format_request {
         push(@h2, "Host: $h") if $h;
     }
 
-    return join($CRLF, "$method $uri HTTP/$ver", @h2, @h, "", $content);
+    my $req = join($CRLF, "$method $uri HTTP/$ver", @h2, @h, "", $content);
+    return $req unless defined &utf8::downgrade;
+    unless (utf8::downgrade($req, 1)) {
+        require Carp;
+        Carp::croak("Wide character in HTTP request (bytes required)");
+    }
+    return $req;
 }
 
-- 
1.6.1.28.gc32f76
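
The effect the patch guards against can be reproduced with core Perl alone. The following is a minimal sketch, not code from LWP or Crypt-SSLeay: the header and body values are made-up stand-ins, and it assumes that a writer which ignores the UTF-8 flag sends the string's internal buffer unchanged.

```perl
use strict;
use warnings;

# Stand-in for binary file content; the bytes above 0x7F are what matter.
my $body = "GIF89a\x80\xFF";

# Stand-in for a header line that is pure ASCII but has the UTF-8 flag ON.
my $header = "Head1: A";
utf8::upgrade($header);

# Joining a flagged string with a byte string upgrades the whole result.
my $request = join("\r\n", $header, "", $body);
print "UTF-8 flag on joined request: ", (utf8::is_utf8($request) ? "ON" : "OFF"), "\n";

# Assumption: a writer that ignores the flag sends the internal buffer as-is.
# For an upgraded string, utf8::encode on a copy yields those same bytes.
my $on_wire_unpatched = $request;
utf8::encode($on_wire_unpatched);

# What the patch does: downgrade first, fail on wide characters.
my $on_wire_patched = $request;
utf8::downgrade($on_wire_patched, 1)
    or die "Wide character in HTTP request (bytes required)\n";

printf "unpatched: %d bytes, patched: %d bytes\n",
    length($on_wire_unpatched), length($on_wire_patched);
# prints "unpatched: 22 bytes, patched: 20 bytes"
```

The two extra bytes come from \x80 and \xFF each being stored as a two-byte UTF-8 sequence once the string is upgraded, which is consistent with the corrupted data and boundary seen in request_body.dat.
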
From: m-uchino [...] yetipapa.com
I patched the file and ran the script. The problem did not happen. The problem is gone! Thank you very much!

> Again, I think that the patch will address the issue in this
> situation, but we still have issues if
> the form fields themselves contain wide UTF-8.

I think that programmers are careful with form fields or headers that they set themselves, but not with items that are set automatically (such as 'Content-Type' via HTML::Form). Therefore, many cases will be relieved by your patch.

Thank you for your help, your patch, and your splendid software.