Subject: | Warnings when filehandle with utf8 layer is used |
If look() is used on a filehandle with a utf8 layer, and the file actually contains code points >= 128, then warnings of the following form are likely:
# utf8 "\xBC" does not map to Unicode at /usr/share/perl/5.10/Search/Dict.pm line 76, <$fh> line 2.
See the attached test file for an example.
The cause: seek() works on byte offsets, so after the seek() the file pointer can end up in the middle of a multi-byte UTF-8 sequence. Reading from that position then triggers the (mandatory?) warning.
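To illustrate the mechanism outside of Search::Dict, here is a minimal sketch. It assumes a tiny file whose second character is U+00FC (UTF-8 bytes 0xC3 0xBC). The helper names lands_mid_sequence and warnings_after_seek are hypothetical and are not part of Search::Dict. The first helper checks whether a byte offset falls on a continuation byte. The second collects whatever warnings the utf8 layer emits when reading from that offset.

```perl
use strict;
use warnings;
use File::Temp 'tempfile';

# Write one short line containing U+00FC, which UTF-8 encodes as the
# two bytes 0xC3 0xBC. File layout: 'a' (0), 0xC3 (1), 0xBC (2), "\n" (3).
my ($tmpfh, $tmpfile) = tempfile(UNLINK => 1);
binmode $tmpfh, ':encoding(utf8)';
print $tmpfh "a\x{fc}\n";
close $tmpfh or die $!;

# Hypothetical helper (not part of Search::Dict): true if the byte at
# $offset is a UTF-8 continuation byte (bit pattern 10xxxxxx), i.e. the
# offset does not point at the start of a character.
sub lands_mid_sequence {
    my ($file, $offset) = @_;
    open my $raw, '<:raw', $file or die $!;
    seek $raw, $offset, 0 or die $!;
    my $n = read $raw, my $byte, 1;
    close $raw;
    return $n && (ord($byte) & 0xC0) == 0x80;
}

# Hypothetical helper: seek a utf8-layered handle to a byte offset, read
# one line, and return any warnings emitted while decoding.
sub warnings_after_seek {
    my ($file, $offset) = @_;
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    open my $fh, '<:encoding(utf8)', $file or die $!;
    seek $fh, $offset, 0 or die $!;
    my $line = <$fh>;
    close $fh;
    return @warnings;
}

for my $offset (0 .. 3) {
    printf "offset %d: %s\n", $offset,
        lands_mid_sequence($tmpfile, $offset) ? 'mid-sequence' : 'character start';
}
print "warning: $_" for warnings_after_seek($tmpfile, 2);
```

Offset 2 is the interesting case here: it points at the 0xBC continuation byte, the same byte named in the warning above.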
Regards,
Slaven
Subject: | search-dict-utf8.t |
#!/usr/bin/perl
use strict;
use File::Temp 'tempfile';
use Search::Dict;
use Test::More 'no_plan';
my @warnings;
$SIG{__WARN__} = sub { push @warnings, @_ };
my $encoding = 'utf8';
#my $encoding = 'iso-8859-1';
my($tmpfh,$tmpfile) = tempfile(UNLINK => 1);
binmode $tmpfh, ":encoding($encoding)";
for (qw(abc def ghi jkl mno pqr stu vwx yz)) {
print $tmpfh $_ . ("\x{fc}"x4096) . "\n";
}
close $tmpfh or die $!;
open my $fh, "<:encoding($encoding)", $tmpfile or die $!;
look $fh, 'vwx';
like scalar(<$fh>), qr{^vwx}, "look positioned the handle at the vwx line";
is join("\n",@warnings), "", "no warnings";
__END__