
This queue is for tickets about the Test-WWW-Mechanize-Catalyst CPAN distribution.

Report information
The Basics
Id: 23833
Status: resolved
Priority: 0/
Queue: Test-WWW-Mechanize-Catalyst

People
Owner: Nobody in particular
Requestors: chris+rt [...] chrisdolan.net
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 0.37
Fixed in: (no value)



Subject: UTF8 content not decoded
I'm not sure whether T::W::M::Catalyst is at fault for this. My Catalyst app emits UTF-8 HTML with a BOM. Way upstream of T::W::M::C I get this error:

  Parsing of undecoded UTF-8 will give garbage when decoding entities at /Users/chris/perl/lib/perl5/site_perl/5.8.6/darwin-thread-multi-2level/HTML/PullParser.pm line 83.

I have attached a quick-and-dirty patch that resolves the issue for me. It checks for a "charset=..." in the Content-Type response header and, if one is present, decodes $response->content with that encoding. I've only tested this patch on

  Content-Type: text/html; charset=utf-8

Maybe the check should also look at a meta http-equiv Content-Type tag if the Content-Type header lacks a charset? Or look for an XML declaration?

I suspect this patch might break if this header is set:

  Content-Encoding: gzip

but that shouldn't happen under this mocked Catalyst, right?
Subject: twmc.patch
--- /Users/chris/perl/lib/perl5/site_perl/Test/WWW/Mechanize/Catalyst.pm 2006-06-06 01:40:30.000000000 -0500
+++ lib/Test/WWW/Mechanize/Catalyst.pm 2006-12-06 16:05:31.000000000 -0600
@@ -2,6 +2,7 @@
 use strict;
 use warnings;
 use Test::WWW::Mechanize;
+use Encode qw();
 use base qw(Test::WWW::Mechanize);
 our $VERSION = "0.37";
@@ -51,6 +52,12 @@
         $end_of_chain->previous($old_response); # ...and add us to it
     }
+    if ($response->header('Content-Type') &&
+        $response->header('Content-Type') =~ m/charset=(\S+)/xms) {
+        my $encoding = $1;
+        $response->content(Encode::decode($encoding, $response->content()));
+    }
+
     return $response;
 }
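For readers who want to try the header check outside the module, here is a minimal standalone sketch of the same idea. The helper name decode_body is hypothetical and not part of Test::WWW::Mechanize::Catalyst. It takes a plain header string and a byte string instead of a response object. It also goes slightly beyond the patch, as an assumption about what a more defensive version might do: it tolerates quoted charsets and trailing parameters, skips charsets that Encode does not recognise, and drops a leading BOM after decoding.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# Hypothetical helper, for illustration only.
# Returns decoded text when the Content-Type names a charset Encode knows;
# otherwise returns the bytes unchanged.
sub decode_body {
    my ($content_type, $bytes) = @_;
    return $bytes unless defined $content_type;

    # Accept charset=utf-8, charset="utf-8", and a following ";".
    my ($charset) = $content_type =~ /charset\s*=\s*"?([^\s;"]+)/i
        or return $bytes;

    # find_encoding returns undef for an unknown charset name.
    my $enc = find_encoding($charset) or return $bytes;

    my $text = $enc->decode($bytes);
    $text =~ s/\A\x{FEFF}//;    # a decoded UTF-8 BOM is U+FEFF; strip it
    return $text;
}

# "cafe" with an acute accent, encoded as UTF-8 and prefixed with a BOM.
my $text = decode_body('text/html; charset=utf-8', "\xEF\xBB\xBFcaf\xC3\xA9");
print length($text), "\n";
```

This sketch does not look at a meta tag or an XML declaration, and it assumes the body is not compressed.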
From: CDOLAN [...] cpan.org
Attached is an updated patch. The previous one was broken for redirects to HTML content: the content was double-decoded.

-- Chris
--- /Users/chris/perl/lib/perl5/site_perl/Test/WWW/Mechanize/Catalyst.pm 2006-06-06 01:40:30.000000000 -0500
+++ lib/Test/WWW/Mechanize/Catalyst.pm 2007-01-17 10:19:40.000000000 -0600
@@ -2,6 +2,7 @@
 use strict;
 use warnings;
 use Test::WWW::Mechanize;
+use Encode qw();
 use base qw(Test::WWW::Mechanize);
 our $VERSION = "0.37";
@@ -50,6 +51,14 @@
         } # of the chain...
         $end_of_chain->previous($old_response); # ...and add us to it
     }
+    else {
+        $response->{_raw_content} = $response->content;
+        if ($response->header('Content-Type') &&
+            $response->header('Content-Type') =~ m/charset=(\S+)/xms) {
+            my $encoding = $1;
+            $response->content(Encode::decode($encoding, $response->content));
+        }
+    }
     return $response;
 }
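The updated patch avoids the double decode by decoding only in the else branch of the redirect-chain handling, and it keeps the original bytes in _raw_content. A different way to get the same "decode at most once" guarantee is sketched below; it is an illustration, not what the patch does. It uses a plain hash as a stand-in for the response object, and the helper name decode_once and the charset key are hypothetical. The presence of _raw_content serves as the "already decoded" marker.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical stand-in for a response: { content => BYTES, charset => NAME }.
sub decode_once {
    my ($res) = @_;

    # A stashed raw copy means this response was already processed.
    return $res if exists $res->{_raw_content};

    $res->{_raw_content} = $res->{content};    # keep the original bytes
    $res->{content} = decode($res->{charset}, $res->{content})
        if defined $res->{charset};
    return $res;
}

my $res = { charset => 'UTF-8', content => "caf\xC3\xA9" };
decode_once($res);
decode_once($res);    # second call is a no-op, e.g. when a chain is walked again

print length($res->{content}), " characters, ",
      length($res->{_raw_content}), " raw bytes\n";
```

Keeping the raw bytes alongside the decoded text also leaves a way back for callers that need the undecoded body.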
Thanks, this'll be in the next release.