This queue is for tickets about the MIME-Charset CPAN distribution.

Report information
The Basics
Id: 48826
Status: resolved
Priority: 0
Queue: MIME-Charset

People
Owner: Nobody in particular
Requestors: hanne.moa [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)

Subject: Invalid charsets used in the wild crashes MIME-Charset
The charsets "238" (outdated alias for "windows-1250") and "iso-8859-8-i" (I assume it is a corrupt form of "iso-8859-8") have been discovered in the wild, in emails. MIME::Charset cannot currently handle them in any sensible way.

The error-message is as follows:
Can't call method "encode" on an undefined value at /usr/share/perl5/MIME/Charset.pm line 629.

I've checked, they're not in http://www.iana.org/assignments/character-sets , so definitely invalid, evil and annoying.

You could add aliases for these two of course but there are no doubt more such weirdo charsets out there...
Hello Moa.

I added support for iso-8859-8-[ei]. Would you please check new release, 1.008?

On Wed Aug 19 05:05:11 2009, moa wrote:
> The charsets "238" (outdated alias for "windows-1250") and
> "iso-8859-8-i" (I assume it is a corrupt form of "iso-8859-8") have been
> discovered in the wild, in emails. MIME::Charset cannot currently handle
> them in any sensible way.
>
> The error-message is as follows:
> Can't call method "encode" on an undefined value at
> /usr/share/perl5/MIME/Charset.pm line 629.
>
> I've checked, they're not in
> http://www.iana.org/assignments/character-sets , so definitely invalid,
> evil and annoying.
>
> You could add aliases for these two of course but there are no doubt
> more such weirdo charsets out there...
Subject: Re: [rt.cpan.org #48826] Invalid charsets used in the wild crashes MIME-Charset
Date: Tue, 20 Oct 2009 08:26:23 +0200
To: bug-MIME-Charset [...] rt.cpan.org
From: Hanne Moa <hanne.moa [...] gmail.com>
On Tue, Oct 20, 2009 at 01:08, Hatuka*nezumi - IKEDA Soji via RT <bug-MIME-Charset@rt.cpan.org> wrote:
> <URL: https://rt.cpan.org/Ticket/Display.html?id=48826 >
>
> Hello Moa.
>
> I added support for iso-8859-8-[ei].  Would you please check
> new release, 1.008?
Is it an alias or have you changed the error-handling? Never mind, I'll check the code :)

Since the sympa-ticket was first filed we've had three more incidents, but unfortunately the offending messages were not saved (didn't happen on my watch), so I don't know which... "charsets"... were involved.

HM
Subject: Re: [rt.cpan.org #48826] Invalid charsets used in the wild crashes MIME-Charset
Date: Wed, 21 Oct 2009 16:06:11 +0200
To: bug-MIME-Charset [...] rt.cpan.org
From: Hanne Moa <hanne.moa [...] gmail.com>
On Tue, Oct 20, 2009 at 01:08, Hatuka*nezumi - IKEDA Soji via RT <bug-MIME-Charset@rt.cpan.org> wrote:
> <URL: https://rt.cpan.org/Ticket/Display.html?id=48826 >
>
> Hello Moa.
>
> I added support for iso-8859-8-[ei].  Would you please check
> new release, 1.008?
Lucky, today a new iso-8859-8-i dumped in. After upgrading MIME::Charset it passes. *yippee*

You might want to add all the aliases in http://www.iana.org/assignments/character-sets while you're at it.

Do you have plans for how to handle weirdos like "238" (*not* legal!) in MIME::Charset?

HM
On Wed Oct 21 10:06:55 2009, moa wrote:
> On Tue, Oct 20, 2009 at 01:08, Hatuka*nezumi - IKEDA Soji via RT
> <bug-MIME-Charset@rt.cpan.org> wrote:
> > <URL: https://rt.cpan.org/Ticket/Display.html?id=48826 >
> >
> > Hello Moa.
> >
> > I added support for iso-8859-8-[ei].  Would you please check
> > new release, 1.008?
>
> Lucky, today a new iso-8859-8-i dumped in. After upgrading
> MIME::Charset it passes. *yippee*
>
> You might want to add all the aliases in
> http://www.iana.org/assignments/character-sets while you're at it.
>
> Do you have plans for how to handle weirdos like "238" (*not* legal!)
> in MIME::Charset?
Whether conversion for a particular charset is supported essentially depends on whether the Encode module (or one of its submodules) supports it. MIME::Charset can only supply some charset name aliases that Encode lacks, e.g. iso-8859-8-i. So not all IANA-registered character sets will be supported. Also, some mappings such as the Windoze font encoding "238" won't be supported.

Handling of unsupported charsets is up to each application. Suitable solutions vary: some may force a fallback charset; some may bypass encoding/decoding; some may simply abort processing.

# For example, for the Sympa mailing list server, I sent a patch
# implementing the second solution above.

Regards,
--- nezumi
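The three application-level strategies listed above (force a fallback charset, bypass decoding, abort) can be sketched roughly as follows. This is a Python analogue for illustration only, not MIME::Charset, Encode, or Sympa code. The names `EXTRA_ALIASES`, `resolve_charset`, `decode_body`, and the `policy` parameter are hypothetical, and the UTF-8 fallback is an assumption.

```python
import codecs

# Hypothetical alias table for names seen in mail headers that the codec
# registry may not recognise on its own (assumption: treat the -i/-e
# variants as plain iso-8859-8 for decoding purposes).
EXTRA_ALIASES = {
    "iso-8859-8-i": "iso-8859-8",
    "iso-8859-8-e": "iso-8859-8",
}


def resolve_charset(name):
    """Return the registry's codec name for `name`, or None if unsupported."""
    key = name.strip().lower()
    key = EXTRA_ALIASES.get(key, key)
    try:
        return codecs.lookup(key).name
    except LookupError:
        # Unknown charset: report it instead of handing back an unusable object.
        return None


def decode_body(raw, charset, policy="fallback", fallback="utf-8"):
    """Decode `raw` bytes; `policy` selects what happens for unknown charsets."""
    codec = resolve_charset(charset)
    if codec is not None:
        return raw.decode(codec, errors="replace")
    if policy == "fallback":
        # Strategy 1: force a fallback charset.
        return raw.decode(fallback, errors="replace")
    if policy == "bypass":
        # Strategy 2: skip decoding and pass the bytes through untouched.
        return raw
    # Strategy 3: abort processing of this message.
    raise LookupError(f"unsupported charset: {charset!r}")
```

With this sketch, `decode_body(body, "238", policy="bypass")` hands the raw bytes back unchanged, while `policy="abort"` raises `LookupError` so the caller can reject or log the message. The point is that the charset lookup result is checked before any encode/decode call is made on it.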
Two more specimens found in the wild, the newest now also with bullsh**t in the From: address.
Download admin-gruppe@uninett.no.1257944569.2643
application/octet-stream 284.4k

Download admin-gruppe@uninett.no.1253658492.26943
application/octet-stream 113.1k