
This queue is for tickets about the HTTP-Message CPAN distribution.

Report information
The Basics
Id: 77403
Status: resolved
Worked: 1 hour (60 min)
Priority: 0/
Queue: HTTP-Message

People
Owner: Nobody in particular
Requestors: gortan [...] cpan.org
henrik.pauli [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: Critical
Broken in: 6.03
Fixed in: (no value)



Subject: Message affected by 'use utf8', breaks binary POSTs
It appeared to us that POSTing binary data with LWP corrupted the data when (and only when) we had ‘use utf8’ enabled in the script using LWP. This bug was present in LWP 5.833 as well as the newest HTTP::Message 6.03.

‘use utf8’ doesn't do anything but turn the strings in the source code into strings of characters, rather than octets -- it seems that HTTP::Request::Common is completely encoding (and u-string) agnostic, which is VERY dangerous in a place where you manipulate octet streams.

The source of the problem is that you have strings in the source code (e.g. where you add the Content-Disposition header[1]), and *also* read bytes from the file into the same buffer later on[2]. One is easily a character string, the other is definitely an octet stream.

Not sure what the right solution is, but the module should safeguard itself against these kinds of things.

[1] https://metacpan.org/source/GAAS/HTTP-Message-6.03/lib/HTTP/Request/Common.pm#L135
[2] https://metacpan.org/source/GAAS/HTTP-Message-6.03/lib/HTTP/Request/Common.pm#L243

P.S. This might be a similar issue: we also recently noticed that combining https with ‘use utf8’ breaks an HTTP request, while leaving out either or both of them does not.

PPS. Perl 5.10.1, Linux 3.1 x86.
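The mixing of character strings and octets described above can be sketched in core Perl, without HTTP::Request::Common. This is only an illustration of the general mechanism, not the module's actual code path: the payload bytes and the header text are made up, and the explicit encode() call stands in for whatever eventually turns the buffer into bytes on the wire.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical binary payload, standing in for bytes read from a file.
my $octets = "\xFF\xD8\xFF\xE0";

# Hypothetical header text holding a character above U+00FF, so Perl
# has to treat it as a character string.
my $header = "filename=\x{263A}.jpg\r\n";

# Appending the payload to the character string and encoding the result
# re-encodes the payload's high bytes along with the text.
my $mixed = encode('UTF-8', $header . $octets);

# Encoding only the text part first leaves the payload untouched.
my $safe = encode('UTF-8', $header) . $octets;

printf "mixed: %d bytes, safe: %d bytes\n", length $mixed, length $safe;
```

In this sketch the "safe" buffer still ends with the original four payload bytes, while the "mixed" buffer is longer because each payload byte of 0x80 or above has become a two-byte UTF-8 sequence.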
It would be helpful if you can provide a small test script that demonstrates the problem.
On Sun May 27 07:48:49 2012, GAAS wrote:
> It would be helpful if you can provide a small test script that demonstrates the problem.
I think I just ran into the same issue, and tried to come up with two minimal scripts. Both have a constant value 'öööö' in their source code, which they pass on to HTTP::Request::Common::POST and print as application/x-www-form-urlencoded. One of the scripts is saved as latin-1, the other is saved as utf-8 and has "use utf8" set.

I would expect the output of both scripts to be identical. However, while the latin1 script produces the expected:

text=%F6%F6%F6%F6%F6%F6%F6%F6%F6%F6%F6

the utf8 script (imho incorrectly) produces:

text=%C3%B6%C3%B6%C3%B6%C3%B6%C3%B6%C3%B6%C3%B6%C3%B6%C3%B6%C3%B6%C3%B6

$HTTP::Request::Common::VERSION is 6.04, perl v5.20.2 built for x86_64-linux.
Subject: test-latin1.pl
#! /usr/bin/env perl

use Encode qw( encode );
use HTTP::Request::Common qw( POST );

use warnings;
use strict;

my $s = 'ööööööööööö';

my $req = POST(
    'http://localhost:8080/',
    {
        text => $s,
    }
);
print( $req->as_string() );
Subject: test-utf8.pl
#! /usr/bin/env perl

use utf8;

use Encode qw( encode );
use HTTP::Request::Common qw( POST );

use warnings;
use strict;

my $s = 'ööööööööööö';

my $req = POST(
    'http://localhost:8080/',
    {
        text => $s,
    }
);
print( $req->as_string() );
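One possible workaround, offered as a sketch and not as a fix endorsed in this ticket: encode the value to octets in the desired charset before handing it to POST, so that the form encoder only ever sees bytes. The choice of ISO-8859-1 here is an assumption made to match the latin1 script's output; a server expecting UTF-8 would need encode('UTF-8', ...) instead. The string is shortened to three characters for brevity.

```perl
#! /usr/bin/env perl

use utf8;
use strict;
use warnings;

use Encode qw( encode );
use HTTP::Request::Common qw( POST );

# Character string, because of 'use utf8' (file saved as UTF-8).
my $s = 'ööö';

# Convert to octets explicitly; ISO-8859-1 is an assumed target charset.
my $octets = encode( 'ISO-8859-1', $s );

my $req = POST(
    'http://localhost:8080/',
    {
        text => $octets,
    }
);
print( $req->content(), "\n" );
```

With the value passed as Latin-1 octets, the urlencoded body should escape each 'ö' as a single %F6, the same form the latin1 script produces.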