Subject: Scrambled special characters from ISO-8859-1 pages in LWP 6.03 while 5.837 was ok
Date: Fri, 11 Nov 2011 17:49:32 +0100
To: bug-libwww-perl [...] rt.cpan.org
From: André Lang <sierra [...] webrausch.de>
Issue: Special characters are garbled when loading ISO-8859-1 encoded
pages. I encountered this problem using the decoded_content method of
WWW::Mechanize and traced it back to LWP.
When running on LWP 5.837, the sample code below returns the proper
French page containing all the extended characters, such as "é"
(accented e) and "à" (accented a).
On LWP 6.03 these characters are lost.
--------------
#!/usr/bin/perl -w
use strict;
use utf8;
use encoding "utf8";
use LWP::UserAgent;
# set proper terminal encoding to display special characters the proper way
my $TERMINAL_CHARSET = ':encoding(utf8)';
if ($^O eq 'MSWin32') {
    $TERMINAL_CHARSET = ':encoding(cp850)';
}
binmode(STDOUT => $TERMINAL_CHARSET);
# get and dump a French page
my $ua = LWP::UserAgent->new;
my $req = HTTP::Request->new(GET =>
    'http://www.ciao.fr/Au_congelateur_60_2');
my $res = $ua->request($req);
print $res->content;
--------------
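
To help narrow down where the characters get lost, it may be useful to compare the raw bytes with the decoded text on a response that does not depend on the network. The following is only a minimal diagnostic sketch, not a statement about the cause of the bug: it builds an HTTP::Response by hand with a made-up ISO-8859-1 body ("café à") and a made-up Content-Type header, then dumps the bytes as hex so the terminal encoding cannot interfere. The helper name hex_bytes and the sample body are assumptions for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);
use HTTP::Response;

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_bytes { join ' ', map { sprintf '%02X', ord } split //, shift }

# Made-up stand-in for the fetched page: "café à" as ISO-8859-1 bytes.
my $body = "caf\xE9 \xE0";
my $res  = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/html; charset=ISO-8859-1' ],
    $body,
);

my $raw     = $res->content;            # bytes exactly as received
my $charset = $res->content_charset;    # charset LWP derives for the body
my $decoded = $res->decoded_content;    # character string (undef on failure)
my $utf8    = encode('UTF-8', $decoded);

print "charset : $charset\n";
print "raw     : ", hex_bytes($raw),  "\n";
print "as UTF-8: ", hex_bytes($utf8), "\n";
```

Running the same three calls (content, content_charset, decoded_content) against the real response under both LWP versions should show whether the raw bytes already differ or whether only the decoding step does.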