Subject: | Error determining encoding of ASP.NET web pages |
The decoded_content subroutine calls the header method in scalar context,
so if a site sends two Content-Type headers, decoded_content sees only the first.
This is not a bug, but Micro$oft ASP.NET software sends two headers:
Content-Type: text/html
Content-Type: text/html; charset=cp1251
so LWP cannot determine the encoding.
I've attached a diff file that I made to resolve this problem. It
calls header in list context.
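
For illustration, here is a minimal sketch of the idea behind the attached diff: read every Content-Type value in list context and merge the parameters, so a charset given only on the second header is still found. It assumes the HTTP::Headers and HTTP::Headers::Util modules are installed; content_type_params is a hypothetical helper written for this example, not part of LWP.

```perl
use strict;
use warnings;
use HTTP::Headers;
use HTTP::Headers::Util qw(split_header_words);

# Simulate the duplicate headers described in the report.
my $h = HTTP::Headers->new;
$h->push_header('Content-Type' => 'text/html');
$h->push_header('Content-Type' => 'text/html; charset=cp1251');

# Hypothetical helper (not part of LWP): merge the parameters of every
# Content-Type value, with later values overriding earlier ones.
sub content_type_params {
    my ($headers) = @_;
    my ($ct, %ct_param);
    # In list context, header() returns one element per header line.
    for my $value ($headers->header('Content-Type')) {
        my @parsed = split_header_words($value) or next;
        my ($type, undef, %param) = @{ $parsed[-1] };
        $ct = lc $type;
        %ct_param = (%ct_param, %param);
    }
    return ($ct, \%ct_param);
}

my ($ct, $param) = content_type_params($h);
print "type=$ct charset=", ($param->{charset} // 'unknown'), "\n";
```

Merging the hashes instead of keeping only the last one means a parameter set on an earlier header survives when a later header omits it.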
Also (maybe this is not ideal, or an extra option is needed), I've removed
FB_CROAK() from Encode::encode's parameters, because some sites contain
invalid characters.
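
To show what dropping FB_CROAK() changes, here is a small sketch using only the core Encode module. It assumes the call in question is Encode::decode (the diff shows only the flag arguments), and it uses UTF-8 with a made-up malformed byte string rather than cp1251, purely to make the failure easy to reproduce.

```perl
use strict;
use warnings;
use Encode ();

# Assumed sample input: "\xFF" can never appear in well-formed UTF-8.
my $bytes = "abc\xFFdef";

# Strict: FB_CROAK makes decode() die on the first malformed byte.
my $strict = eval {
    Encode::decode('UTF-8', $bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC());
};
my $strict_failed = defined $strict ? 0 : 1;

# Lenient: without FB_CROAK the malformed byte is replaced by U+FFFD.
my $lenient = Encode::decode('UTF-8', $bytes, Encode::LEAVE_SRC());

print "strict decode failed: ", ($strict_failed ? "yes" : "no"), "\n";
printf "lenient decode has %d characters\n", length $lenient;
```

The lenient form never aborts, but it silently swaps bad bytes for the replacement character, which is why an option to choose between the two behaviours may be the safer design.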
Andrey Kostenko, software developer of Siteheart Inc. (http://kostenko.name)
Subject: | 1.diff |
170,177c170,174
< foreach ($self->header("Content-Type")){
< if (my @ct = HTTP::Headers::Util::split_header_words($_)) {
< my %sct_param;
< ($ct, undef, %sct_param) = @{$ct[-1]};
< $ct = lc($ct);
< %ct_param=(%ct_param,%sct_param);
< die "Can't decode multipart content" if $ct =~ m,^multipart/,;
< }
---
> if (my @ct = HTTP::Headers::Util::split_header_words($self->header("Content-Type"))) {
> ($ct, undef, %ct_param) = @{$ct[-1]};
> $ct = lc($ct);
>
> die "Can't decode multipart content" if $ct =~ m,^multipart/,;
178a176
>
180a179
>
257c256
< }
---
> }
273c272
< Encode::LEAVE_SRC());
---
> Encode::FB_CROAK() | Encode::LEAVE_SRC());