
This queue is for tickets about the Catalyst-Runtime CPAN distribution.

Report information
The Basics
Id: 61033
Status: rejected
Priority: 0/
Queue: Catalyst-Runtime

People
Owner: bobtfish [...] bobtfish.net
Requestors: rg [...] progtech.net
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 5.80027
Fixed in: (no value)



Subject: [PATCH] content-length is reported in characters but should be bytes
When the response body value is a utf8 string that contains multi-byte characters (like umlauts), the content-length is set incorrectly, because the regular length() function returns characters, not bytes. As a result, pages appear short or at least missing some close tags. The attached patch replaces it with the one from the bytes pragma, which fixed the problem for me.
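The mismatch described above can be reproduced outside Catalyst. This is a minimal sketch, not code from the ticket: the sample string and variable names are assumptions chosen for illustration, and only the core Encode module is used. It compares the character count of a string containing umlauts with the number of bytes it occupies once encoded as UTF-8.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Illustrative sample (an assumption, not from the ticket):
# "Gruesse" spelled with u-umlaut (U+00FC) and sharp s (U+00DF).
my $body = "Gr\x{fc}\x{df}e";

# length() on a character string counts characters.
my $chars = length($body);

# After encoding, length() counts the octets that would go on the wire.
my $bytes = length( encode( 'UTF-8', $body ) );

print "characters: $chars, bytes: $bytes\n";   # characters: 5, bytes: 7
```

A Content-Length of 5 for a 7-byte body would make a client stop reading two bytes early, which matches the truncated pages described in the report.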
Subject: catalyst.patch
--- Catalyst.pm 2010-09-03 18:59:34.000000000 +0200
+++ site_perl/5.10.1/Catalyst.pm 2010-09-03 18:48:59.000000000 +0200
@@ -29,6 +29,7 @@
 use Tree::Simple::Visitor::FindByUID;
 use Class::C3::Adopt::NEXT;
 use List::MoreUtils qw/uniq/;
 use attributes;
+use bytes;
 use utf8;
 use Carp qw/croak carp shortmess/;
@@ -1833,7 +1834,7 @@ sub finalize_headers {
         }
         else {
             # everything should be bytes at this point, but just in case
-            $response->content_length( length( $response->body ) );
+            $response->content_length( bytes::length( $response->body ) );
         }
     }
From: rg [...] progtech.net
On Fri Sep 03 13:11:41 2010, rg@progtech.net wrote: Actually, I think I misunderstood the pragma. I guess it should be only within the block, so it doesn't affect anything else and then I don't have to touch the call to length(). Corrected patch is attached (unfortunately I can't delete the other one).
Subject: catalyst.patch
--- Catalyst.pm 2010-09-03 18:59:34.000000000 +0200
+++ /usr/dist/local/lib/perl5/site_perl/5.10.1/Catalyst.pm 2010-09-03 19:22:13.000000000 +0200
@@ -1833,6 +1833,7 @@ sub finalize_headers {
         }
         else {
             # everything should be bytes at this point, but just in case
+            use bytes;
             $response->content_length( length( $response->body ) );
         }
     }
On Fri Sep 03 13:11:41 2010, rg@progtech.net wrote:
> When the response body value is a utf8 string that contains multi-byte
> characters (like umlauts), the content-length is set incorrectly,
> because the regular length() function returns characters, not bytes. As
> a result, pages appear short or at least missing some close tags.
You're not using an encoding plugin (like Catalyst::Plugin::Unicode::Encoding), and therefore your output is broken as it's not encoded. If the content was correctly encoded for output, this wouldn't be a problem. Forcing the mode to bytes is never the correct solution, sorry.
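The principle behind this reply can be sketched in plain Perl. This is an illustration under assumptions, not what Catalyst::Plugin::Unicode::Encoding does internally: the helper name encoded_body_and_length is hypothetical, and the sample text is invented. The point is that once the body has been encoded to octets, the ordinary length() already returns the byte count, so no bytes pragma is needed.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper (not part of Catalyst): turn a character string
# into UTF-8 octets and derive the Content-Length from those octets.
sub encoded_body_and_length {
    my ($text) = @_;
    my $octets = encode( 'UTF-8', $text );
    # $octets is a byte string, so plain length() counts bytes here.
    return ( $octets, length $octets );
}

# Illustrative input (an assumption): a-umlaut, o-umlaut, u-umlaut.
my ( $octets, $content_length ) = encoded_body_and_length("\x{e4}\x{f6}\x{fc}");

print "Content-Length: $content_length\n";   # Content-Length: 6
```

Encoding at the output boundary keeps character strings inside the application and byte strings on the wire, so the header and the transmitted body stay consistent.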