
This queue is for tickets about the Search-Dict CPAN distribution.

Report information
The Basics
Id: 97188
Status: open
Priority: 0
Queue: Search-Dict

People
Owner: Nobody in particular
Requestors: SREZIC [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: 1.07
Fixed in: (no value)



Subject: Warnings when filehandle with utf8 layer is used
If look() is used with a filehandle that has a utf8 layer, and the file actually contains code points >= 128, then it is likely that warnings of the form

    # utf8 "\xBC" does not map to Unicode at /usr/share/perl/5.10/Search/Dict.pm line 76, <$fh> line 2.

are generated. See the attached test file for an example.

The reason for this problem: when look() calls seek(), the file pointer can end up in the middle of a UTF-8 sequence, which causes the (mandatory?) warning.

Regards,
Slaven
Subject: search-dict-utf8.t
#!/usr/bin/perl

use strict;
use File::Temp 'tempfile';
use Search::Dict;
use Test::More 'no_plan';

my @warnings;
$SIG{__WARN__} = sub { push @warnings, @_ };

my $encoding = 'utf8';
#my $encoding = 'iso-8859-1';

my($tmpfh,$tmpfile) = tempfile(UNLINK => 1);
binmode $tmpfh, ":encoding($encoding)";
for (qw(abc def ghi jkl mno pqr stu vwx yz)) {
    print $tmpfh $_ . ("\x{fc}"x4096) . "\n";
}
close $tmpfh or die $!;

open my $fh, "<:encoding($encoding)", $tmpfile or die $!;
look $fh, 'vwx';
like scalar(<$fh>), qr{^vwx};
is_deeply join("\n",@warnings), "", "no warnings";

__END__
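The cause named in the report can also be observed without Search::Dict. The following is a minimal sketch, not part of the ticket: it assumes that each U+00FC is stored as the two bytes C3 BC, seeks to byte offset 1 (the continuation byte), and reads a line through the decoding layer. The variable names are illustrative only, and the exact wording of the warning may differ between Perl versions.

```perl
use strict;
use warnings;
use File::Temp 'tempfile';

# Collect warnings instead of printing them to STDERR.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, @_ };

# Write one line of U+00FC characters; in UTF-8 each one is C3 BC.
my ($tmpfh, $tmpfile) = tempfile(UNLINK => 1);
binmode $tmpfh, ':encoding(utf8)';
print $tmpfh "\x{fc}" x 8, "\n";
close $tmpfh or die $!;

# seek() works on byte offsets, so offset 1 lands on the
# continuation byte BC, in the middle of the first character.
open my $fh, '<:encoding(utf8)', $tmpfile or die $!;
seek $fh, 1, 0 or die $!;
my $line = <$fh>;
close $fh;

# Show whatever the decoding layer complained about.
print "warning: $_" for @warnings;
```

look() performs the same kind of byte-offset seek while bisecting the file, which is why it can trigger the warning on files with multi-byte characters.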
On 2014-07-13 09:17:31, SREZIC wrote:
> If look() is used with a filehandle with a utf8 layer, and the file
> has actually codepoints >= 128, then it's likely that warnings in the
> form of
>
> # utf8 "\xBC" does not map to Unicode at
> /usr/share/perl/5.10/Search/Dict.pm line 76, <$fh> line 2.
>
> are generated. See the attached test file for an example.
>
> The reason for this problem: when doing the seek() it can happen that
> the file pointer ends up in the middle of the UTF-8 sequence, causing
> the (mandatory?) warning.
>
> Regards,
> Slaven
Currently my workaround is to suppress these warnings before calling look():

    local $SIG{__WARN__} = sub { push @warnings, grep { !/utf8 .* does not map to Unicode/ } @_ };
    Search::Dict::look(...)
Great bug! I wonder if temporarily setting the filehandle to "raw" would be a good solution. I hate messing with layers, though. Narrowing the scope of disabling warnings into the look-up might be a less invasive solution.
On 2015-09-06 06:36:28, SREZIC wrote:
> On 2014-07-13 09:17:31, SREZIC wrote:
> > If look() is used with a filehandle with a utf8 layer, and the file
> > has actually codepoints >= 128, then it's likely that warnings in the
> > form of
> >
> > # utf8 "\xBC" does not map to Unicode at
> > /usr/share/perl/5.10/Search/Dict.pm line 76, <$fh> line 2.
> >
> > are generated. See the attached test file for an example.
> >
> > The reason for this problem: when doing the seek() it can happen that
> > the file pointer ends up in the middle of the UTF-8 sequence, causing
> > the (mandatory?) warning.
> >
> > Regards,
> > Slaven
>
> Currently my workaround is to cease these warnings before calling
> look():
>
> local $SIG{__WARN__} = sub { push @warnings, grep { !/utf8 .* does not
> map to Unicode/ } @_ };
> Search::Dict::look(...)
Just for the record: since perl 5.28 the warning message is slightly different ("UTF-8" instead of "utf8"), so the workaround now looks like this:

    local $SIG{__WARN__} = sub { push @warnings, grep { !/(?:utf8|UTF-8) .* does not map to Unicode/ } @_ };
    Search::Dict::look(...)
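For callers who want the suppression limited to the look-up itself, the workaround can be wrapped in a small helper. This is only a sketch under stated assumptions: quiet_look is a hypothetical name, not part of Search::Dict, and it assumes the warning text matches one of the two spellings mentioned in this ticket. Unlike the snippets above, it passes unrelated warnings on to the previously installed handler (or to warn) instead of collecting them in @warnings.

```perl
use strict;
use warnings;
use Search::Dict ();

# Hypothetical helper (not part of Search::Dict): call look() while
# dropping only the "does not map to Unicode" warning.
sub quiet_look {
    my ($fh, @args) = @_;

    # Remember the handler that was active before, if any.
    my $outer = $SIG{__WARN__};

    # local restores the previous handler when quiet_look returns.
    local $SIG{__WARN__} = sub {
        my $msg = join '', @_;
        return if $msg =~ /(?:utf8|UTF-8) .* does not map to Unicode/;

        # Pass every other warning on unchanged.
        if (ref $outer eq 'CODE') { $outer->(@_) }
        else                      { warn @_ }
    };

    return Search::Dict::look($fh, @args);
}
```

A call such as quiet_look($fh, 'vwx') then takes the place of look($fh, 'vwx'). The filter hides the symptom only; look() still seeks into the middle of multi-byte sequences.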