
This queue is for tickets about the libwww-perl CPAN distribution.

Report information
The Basics
Id: 35912
Status: resolved
Priority: 0/
Queue: libwww-perl

People
Owner: Nobody in particular
Requestors: jjperss [...] hotmail.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: error with decoded_content
Date: Thu, 15 May 2008 21:22:19 +0200
To: <bug-libwww-perl [...] rt.cpan.org>
From: Jesper Jørgen Persson <jjperss [...] hotmail.com>
This page is decoded incorrectly:

http://www.mariagerfjord.dk/mfk/politik/raad_naevn/aeldreraad/dokumenter/07_01_31/index.html

The page contains a meta tag with charset utf-8, but decoded_content chooses the fallback encoding, ISO-8859-1. As far as I can tell from debugging, the problem lies within HTML::HeadParser.

Best Regards
Jesper Persson
It looks like it's the UTF-8 BOM that confuses the parser. decoded_content() should probably also look for the BOM itself, so that it manages to decode the page correctly even when the Content-Type header has no charset parameter.
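
To illustrate the BOM check suggested above, here is a minimal, language-neutral sketch in Python. It is not the libwww-perl or HTML::Parser code: the helper name decode_with_bom and the BOMS table are hypothetical, and the ISO-8859-1 fallback simply mirrors the behaviour described in this ticket. UTF-32 BOMs are deliberately left out to keep the sketch short.

```python
import codecs

# Hypothetical lookup table: byte order marks and the charset each implies.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def decode_with_bom(raw: bytes, fallback: str = "iso-8859-1") -> str:
    """Decode raw bytes, preferring a leading BOM over the fallback charset."""
    for bom, charset in BOMS:
        if raw.startswith(bom):
            # Strip the BOM so it does not show up as a stray character.
            return raw[len(bom):].decode(charset)
    # No BOM found: use the fallback, as the ticket describes for pages
    # served without a charset parameter.
    return raw.decode(fallback)


# A UTF-8 page that starts with a BOM and carries no charset information.
page = codecs.BOM_UTF8 + "Ældreråd".encode("utf-8")
print(decode_with_bom(page))
```

A real implementation would still consult the Content-Type header and any meta tag; the BOM check only covers the case where neither yields a usable charset.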
HTML-Parser-3.58 has now been uploaded to CPAN. I think it should fix this issue.