Subject: Weird problem (and patch)
I was trying to understand why filer
(http://blog.perldude.de/archives/category/programming/filer/) was
having problems, and it turned out to be how File::MimeInfo was parsing
my home directory under utf8. While reading files in my home directory,
the utf8 encoding was breaking, and the result was the infamous
"Malformed UTF-8 character (fatal) at
/usr/lib64/perl5/vendor_perl/5.8.8/File/MimeInfo.pm line 120." error.
I am very much not an expert on utf8; the enclosed patch is based on a
sample from URI::Escape, but it does appear to work. With the patch
applied, File::MimeInfo is happy, and therefore filer is happy, and
therefore I am happy :) Hope this makes sense. Let me know if you need
anything else,
~mcummings
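
For anyone reading along, here is a minimal standalone sketch of the idea behind the patch. It is not the actual File::MimeInfo code: looks_like_text is a hypothetical helper name, and keeping the whitespace stripping from the pre-patch code is an assumption about the intended behaviour. The point is only that utf8::encode turns a character string into UTF-8 bytes before the control-character check runs, so the regex sees bytes rather than wide characters.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of File::MimeInfo): decide whether a
# chunk of file content looks like plain text.
sub looks_like_text {
    my ($line) = @_;

    # Convert the character string to its UTF-8 byte representation,
    # in place. utf8::encode is built into Perl 5.8 and later.
    utf8::encode($line);

    # Assumption: whitespace such as \n and \t should not count as
    # control characters, as in the pre-patch code.
    $line =~ s/\s//g;

    # Same byte class as the check in the patch.
    return $line !~ /[\x00-\x1F\xF7]/;
}

for my $sample ("hello\n", "caf\x{e9} \x{263A}\n", "\x00\x01binary") {
    print looks_like_text($sample)
        ? "text/plain\n"
        : "application/octet-stream\n";
}
```

The three samples are plain ASCII, a string containing characters above U+007F, and a string starting with NUL bytes. Only the last one should fall through to application/octet-stream.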
Subject: mimeinfo.patch
--- /usr/lib64/perl5/vendor_perl/5.8.8/File/MimeInfo.pm 2006-07-09 10:57:47.000000000 -0400
+++ /home/mcummings/mimeinfo.pm 2006-07-09 10:59:12.000000000 -0400
@@ -116,8 +116,14 @@ sub default {
 {
     no warnings; # warnings can be thrown when input is neither ascii or utf8
-    $line =~ s/\s//g; # \n and \t are also control chars
-    return 'text/plain' unless $line =~ /[\x00-\x1F\xF7]/;
+    if ($] < 5.008) {
+        $line =~ s/([^\0-\x7F])/do {my $o = ord($1); sprintf("%c%c", 0xc0 | ($o >> 6), 0x80 | ($o & 0x3f)) }/ge;
+    }
+    else
+    {
+        utf8::encode($line)
+    }
+    return 'text/plain' unless $line =~ /[\x00-\x1F\xF7]/;
 }
 print STDERR "> First 10 bytes of the file contain control chars\n" if $DEBUG;
 return 'application/octet-stream';