Subject: tell() returns wrong values when used with ":encoding(UTF-8)" on files with unix newlines (0x0a).
Date: Fri, 2 Oct 2020 15:40:28 +0000
To: "bug-Perl-Dist-Strawberry [...] rt.cpan.org" <bug-Perl-Dist-Strawberry [...] rt.cpan.org>
From: Sascha Baer <sbaer [...] magix.net>
When I open a file that contains unix newlines (0x0a) with ":encoding(UTF-8)", tell() returns wrong values.
test.txt in the attached zip contains 6 unix newlines:
$ hexdump.exe test.txt
0000000 0a0a 0a0a 0a0a
0000006
Calling tell() after each line in a readline loop (see test.pl in the zip) returns -2, -1, 0, 2, 4, 6.
When the file is opened with ":utf8" instead, tell() returns the correct values 1 through 6.
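
For readers without the attachment, the comparison can be sketched roughly as follows. This is not the attached test.pl, only a minimal approximation of it: the file name tell_test.txt and the helper tell_offsets() are made up for illustration. It writes six 0x0a bytes in raw mode, then reads the file back once per layer and records tell() after every line.

```perl
use strict;
use warnings;

# Hypothetical file name, chosen so the attached test.txt is not overwritten.
my $file = 'tell_test.txt';

# Write six LF bytes in raw mode so no CRLF translation happens on Windows.
open(my $out, '>:raw', $file) or die "Cannot write $file: $!";
print {$out} "\n" x 6;
close($out) or die "Cannot close $file: $!";

# Hypothetical helper: read the file line by line with the given layer
# and return the value of tell() after each line.
sub tell_offsets {
    my ($layer, $path) = @_;
    open(my $in, "<$layer", $path) or die "Cannot open $path: $!";
    my @offsets;
    while (defined(my $line = <$in>)) {
        push @offsets, tell($in);
    }
    close($in);
    return @offsets;
}

for my $layer (':encoding(UTF-8)', ':utf8') {
    my @offsets = tell_offsets($layer, $file);
    print "$layer: @offsets\n";
}
```

On an affected build, the ":encoding(UTF-8)" line should show the offsets listed above, while the ":utf8" line should show 1 through 6.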
":encoding(UTF-8)" does some additional checking that ":utf8" doesn't do:
> To mark FILEHANDLE as UTF-8, use :utf8 or :encoding(UTF-8). :utf8 just marks the data as
> UTF-8 without further checking, while :encoding(UTF-8) checks the data for actually being
> valid UTF-8. More details can be found in PerlIO::encoding.
(https://perldoc.perl.org/perlfunc#binmode)
It seems to me that this checking breaks things.
The MSYS2 perl (https://www.msys2.org/) also returns the correct values.
My Strawberry Perl Version is v5.32, but I think older versions are also affected.