This queue is for tickets about the HTTP-Message CPAN distribution.

Report information
The Basics
Id: 67432
Status: rejected
Priority: 0/
Queue: HTTP-Message

People
Owner: Nobody in particular
Requestors: rod.taylor [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 6.02
Fixed in: (no value)



Subject: decoded_content
This website reports Content-Type: text/html in the HTTP headers, but in the HTML it declares '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />'. Firefox and other browsers take its encoding as UTF-8, but decoded_content seems to use Latin-1.

decoded_content throws the following error in the short script below, in both 6.02 and 5.837:

    utf8 "\xE5" does not map to Unicode at /usr/local/lib/perl/5.10.1/Encode.pm line 174.
    ...propagated at /usr/local/share/perl/5.10.1/HTTP/Message.pm line 398.
    at /tmp/fail.pl line 12

    #!/usr/bin/env perl
    use HTTP::Request;
    use LWP::UserAgent;

    my $request = HTTP::Request->new( 'GET' => 'http://www.condoadvisory.com/item.php?item_id=9113' );
    my $ua = LWP::UserAgent->new;
    $response = $ua->request($request);

    my $content = $response->decoded_content(
        raise_error    => 1,
        charset_strict => 1,
    );

Thanks!
I'm not able to reproduce any error with the given example script. From your description of the error you saw, it looks like the content is being recognized as UTF-8, but the document actually contains Latin-1 bytes.
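
For illustration, here is a minimal offline sketch of the failure mode described above: a lone "\xE5" byte is valid Latin-1 but not valid UTF-8, so strict UTF-8 decoding dies on it. It uses only the core Encode module. The helper name decode_with_fallback and the sample bytes are assumptions made up for this example; this is not how decoded_content is implemented, and it does not contact the site from the report.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of HTTP::Message): try strict UTF-8 first,
# and fall back to ISO-8859-1 if the octets are not well-formed UTF-8.
sub decode_with_fallback {
    my ($octets) = @_;

    # FB_CROAK makes decode() die on malformed input; LEAVE_SRC keeps
    # decode() from modifying $octets in place.
    my $text = eval {
        Encode::decode( 'UTF-8', $octets,
            Encode::FB_CROAK | Encode::LEAVE_SRC );
    };
    return ( $text, 'UTF-8' ) if defined $text;

    # Every byte sequence is valid ISO-8859-1, so this decode cannot fail.
    return ( Encode::decode( 'ISO-8859-1', $octets ), 'ISO-8859-1' );
}

# Made-up sample: 0xE5 followed by "s" is not a valid UTF-8 sequence.
my $body = "pr\xE5s";
my ( $text, $charset ) = decode_with_fallback($body);
printf "decoded as %s, %d characters\n", $charset, length $text;
```

With a real response, a similar effect might be had by overriding the detected charset, for example $response->decoded_content( charset => 'ISO-8859-1' ), assuming the body really is Latin-1.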