
This queue is for tickets about the libwww-perl CPAN distribution.

Report information
The Basics
Id: 34841
Status: resolved
Priority: 0/
Queue: libwww-perl

People
Owner: Nobody in particular
Requestors: jjn1056 [...] yahoo.com
Cc:
AdminCc:

Bug Information
Severity: Critical
Broken in: 5.810
Fixed in: (no value)



Subject: HTTP::Message gives utf8 error where it didn't complain in previous version
After upgrading to the latest version of LWP we are getting the following error all over the place: "HTTP::Message content not bytes". Reviewing the source, there appears to be some new code that tests content against an is_utf8 function. It now reports utf8 content that was previously accepted, and with which our code functioned. In one case the error is generated via SOAP::Lite and in another case via LWP::UserAgent. We'd like to help figure this out, but there isn't much explanation of this change or of the problem it was trying to fix. I have to imagine this would cause trouble for other SOAP::Lite users.
john.napiorkowski@takkle.com
The reason for the change was to try to enforce an invariant that has been documented for a long time. I want the lower level protocol modules to be able to rely on the content being bytes, and I really hate all the 'use bytes; length' hacks people are suggesting. An alternative could be to try utf8::downgrade() on the strings passed in and only complain if this fails. Yet another alternative is to not complain about pure 7-bit ASCII that happens to arrive with the UTF8 flag set.
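The two alternatives mentioned above can be illustrated with a small standalone sketch. This is not the code in HTTP::Message; the helper names ensure_bytes and is_flagged_ascii are hypothetical and exist only for this illustration. It assumes a Perl where the built-in utf8::downgrade, utf8::upgrade and utf8::is_utf8 functions are available.

```perl
use strict;
use warnings;
use Carp ();

# Hypothetical helper: try to convert the caller's string to bytes in
# place, and croak only if that is not possible.
sub ensure_bytes {
    # $_[0] is an alias to the caller's variable, so the downgrade
    # affects the original string rather than a copy.
    utf8::downgrade($_[0], 1)
        or Carp::croak("content not bytes");
    return;
}

# Hypothetical helper for the other alternative: true for a string that
# carries the UTF8 flag but holds only 7-bit ASCII characters.
sub is_flagged_ascii {
    my ($str) = @_;
    return utf8::is_utf8($str) && $str !~ /[^\x00-\x7F]/ ? 1 : 0;
}

my $latin1 = "caf\x{E9}";       # one character in the 0x80-0xFF range
utf8::upgrade($latin1);         # same characters, UTF8 flag now set
ensure_bytes($latin1);          # succeeds; the flag is cleared in place

my $wide = "snowman \x{2603}";  # U+2603 does not fit in a single byte
my $ok = eval { ensure_bytes($wide); 1 };  # false, ensure_bytes croaked

my $ascii = "hello";
utf8::upgrade($ascii);          # ASCII only, but UTF8 flag set
```

With the downgrade approach, only strings containing characters above 0xFF are rejected; strings that merely carry the UTF8 flag pass through unchanged in meaning.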
Suggested patch that uses utf8::downgrade instead.
commit 0af0d83db926e1370b0c2f51b4002bf9162b113c
Author: Gisle Aas <gisle@aas.no>
Date:   Sat Apr 12 10:50:15 2008 +0200

    Allow content that can be downgraded to bytes.

    Refusing all utf8 flagged strings as HTTP::Message content is probably too strict.

diff --git a/lib/HTTP/Message.pm b/lib/HTTP/Message.pm
index ea28e29..47ab6a4 100644
--- a/lib/HTTP/Message.pm
+++ b/lib/HTTP/Message.pm
@@ -11,7 +11,14 @@ my $CRLF = "\015\012";   # "\r\n" is not portable
 $HTTP::URI_CLASS ||= $ENV{PERL_HTTP_URI_CLASS} || "URI";
 eval "require $HTTP::URI_CLASS"; die $@ if $@;
 
-*_is_utf8 = defined &utf8::is_utf8 ? \&utf8::is_utf8 : sub { 0 };
+*_utf8_downgrade = defined(&utf8::downgrade) ?
+    sub {
+        utf8::downgrade($_[0], 1) or
+            Carp::croak("HTTP::Message content not bytes")
+    }
+    :
+    sub {
+    };
 
 sub new
 {
@@ -29,9 +36,7 @@ sub new
         $header = HTTP::Headers->new;
     }
     if (defined $content) {
-        if (_is_utf8($content)) {
-            Carp::croak("HTTP::Message content not bytes");
-        }
+        _utf8_downgrade($content);
     }
     else {
         $content = '';
@@ -110,9 +115,7 @@ sub content {
 
 sub _set_content {
     my $self = $_[0];
-    if (_is_utf8($_[1])) {
-        Carp::croak("HTTP::Message content not bytes")
-    }
+    _utf8_downgrade($_[1]);
     if (!ref($_[1]) && ref($self->{_content}) eq "SCALAR") {
         ${$self->{_content}} = $_[1];
     }
@@ -132,9 +135,7 @@ sub add_content
     my $chunkref = \$_[0];
     $chunkref = $$chunkref if ref($$chunkref);  # legacy
 
-    if (_is_utf8($$chunkref)) {
-        Carp::croak("HTTP::Message added content not bytes");
-    }
+    _utf8_downgrade($$chunkref);
 
     my $ref = ref($self->{_content});
     if (!$ref) {
I've now applied this patch and uploaded LWP-5.811 to CPAN.