
This queue is for tickets about the Text-CSV_XS CPAN distribution.

Report information
The Basics
Id: 66474
Status: resolved
Priority: 0/
Queue: Text-CSV_XS

People
Owner: Nobody in particular
Requestors: hull [...] snap.com
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 0.36
Fixed in: (no value)



Subject: getline does not work with *ARGV & PerlIO
It seems that when using 'use open ENCODING' and then 'my $r = $csv->getline(*ARGV);', the PerlIO layer is not set up correctly. In particular, my input file is in UTF-16 format and has "\r\n" line endings. I have attached a test script to this bug.

Running "perlio-argv.pl 0" shows that the PerlIO processing is done correctly if I don't use Text::CSV_XS. "perlio-argv.pl 1" demonstrates the bug: Text::CSV_XS generates an internal error about the CR, which I believe is due to the PerlIO translations not being done. "perlio-argv.pl 2" shows that the error does not occur if I split the getline into parse/fields, and "perlio-argv.pl 3" shows that the error does not occur if I first read from "<>" to force the ARGV processing to be done and then call getline.
Subject: hello2.csv
ÿþhello,world
hello,two
hello,three
Subject: perlio-argv.pl
#! /usr/bin/perl
# echo hello,world >hello.csv; echo hello,two >>hello.csv; echo hello,three >>hello.csv
# unix2dos hello.csv
# iconv -f UTF-8 -t UTF-16 <hello.csv >hello2.csv

use strict;
use warnings;
use Text::CSV_XS;

my $t = $ARGV[0] || '0';

use open (IN => ':encoding(UTF-16) :crlf');

@ARGV = 'hello2.csv';

if ($t eq '0') {
    print "no Text::CSV_XS\n";
    while (<>) {
        print $_;
    }
} else {
    my $csv = Text::CSV_XS->new({ binary => 1, eol => $/ })
        or die Text::CSV_XS->error_diag;
    if ($t eq '1') {
        print "Text::CSV_XS getline\n";
        while (my $r = $csv->getline(*ARGV)) {
            print join(',', @$r)."\n";
        }
        $csv->eof or $csv->error_diag();
    } elsif ($t eq '2') {
        print "Text::CSV_XS parse\n";
        while (my $l = <>) {
            $csv->parse($l) or $csv->error_diag();
            my @r = $csv->fields();
            print join(',', @r)."\n";
        }
    } else {
        print "Text::CSV_XS getline, read first line\n";
        my $s = <>;
        while (my $r = $csv->getline(*ARGV)) {
            print join(',', @$r)."\n";
        }
        $csv->eof or $csv->error_diag();
    }
}
I'm sorry to have to set the state to "stalled". It should have been "rejected", but I will try to look into alternatives to circumvent the location of the real bug: IO::Handle.

The "use open ..." is a pragma that is not "seen" by IO::Handle at the moment it is needed. I think it boils down to something like this:

$ dump rt66474.csv
[DUMP 0.6.01]
00000000  FF FE 68 00 65 00 6C 00  6C 00 6F 00 2C 00 77 00  ..h.e.l.l.o.,.w.
00000010  6F 00 72 00 6C 00 64 00  0D 00 0A 00 68 00 65 00  o.r.l.d.....h.e.
00000020  6C 00 6C 00 6F 00 2C 00  74 00 77 00 6F 00 0D 00  l.l.o.,.t.w.o...
00000030  0A 00 68 00 65 00 6C 00  6C 00 6F 00 2C 00 74 00  ..h.e.l.l.o.,.t.
00000040  68 00 72 00 65 00 65 00  0D 00 0A 00              h.r.e.e.....
$ cat perlio.pl
#!/pro/bin/perl

use strict;
use warnings;

my $t = shift || 0;

use open IN => ":encoding(UTF-16) :crlf";

@ARGV = "rt66474.csv";

if ($t == 0) {
    print <>;
    exit 0;
}
if ($t == 1) {
    require IO::Handle;
    while (my $r = *ARGV->getline) {
        print $r;
    }
    exit 0;
}
$ perl perlio.pl 0
hello,world
hello,two
hello,three
$ perl perlio.pl 1
UTF-16:Unrecognised BOM 7061 at /pro/lib/perl5/5.12.2/i686-linux-64int-ld/IO/Handle.pm line 1.
Compilation failed in require at perlio.pl line 17.
Exit 255
$

To use the words of another perl porter:

I don't think this is a bug really. use open would be lexically scoped, and ARGV isn't being opened in the scope of the use open pragma, it is being opened in the scope of the getline () sub, or in other words it is being called in the scope of IO::Handle.

I think we document that "<>" is magic anyway, and that its behaviour is a bit special.

So

while (my $r = *ARGV->getline) {
    print $r;
}

is not at all the same thing as

while (<>) {
    print $_;
}

Rather you want:

while (@ARGV) {
    $ARGV = shift @ARGV;
    open ARGV or die "Failed to open $ARGV: $!";
    while (my $r = *ARGV->getline) {
        push @lines, $r;
    }
}
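
A possible way to sidestep the scoping question altogether is to open each file yourself and name the layers in the open call, so that nothing depends on where ARGV happens to be opened. The following is only a minimal sketch under that assumption, using core modules; write_sample and read_lines are hypothetical helper names that do not appear in the ticket, and the ":raw" in front of the layers is an addition intended to keep any platform default layer out of the way.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);
use File::Temp qw(tempfile);
use IO::Handle;

# Hypothetical helper (not from the ticket): write a small UTF-16LE file
# with a BOM and CRLF line endings, similar to the attached hello2.csv.
sub write_sample {
    my ($fh, $path) = tempfile(SUFFIX => ".csv", UNLINK => 1);
    binmode $fh;    # write raw bytes, no translation layers
    print {$fh} encode("UTF-16LE",
        "\x{FEFF}hello,world\r\nhello,two\r\nhello,three\r\n");
    close $fh or die "Failed to close $path: $!";
    return $path;
}

# Hypothetical helper: open every file with explicit layers instead of
# relying on "use open", which is lexically scoped and therefore not in
# effect when the open happens inside another module.
sub read_lines {
    my @files = @_;
    my @lines;
    for my $file (@files) {
        open my $in, "<:raw:encoding(UTF-16):crlf", $file
            or die "Failed to open $file: $!";
        while (defined(my $line = $in->getline)) {
            push @lines, $line;
        }
        close $in or die "Failed to close $file: $!";
    }
    return @lines;
}

my @lines = read_lines(write_sample());
print @lines;
```

With Text::CSV_XS, a handle opened this way could presumably be passed to $csv->getline($in) in place of *ARGV, at the cost of giving up the "<>" magic.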
On Sun Mar 13 06:54:11 2011, HMBRAND wrote:
> I'm sorry to have to set the state to "stalled". It should have been
> "rejected", but I will try to look into alternatives to circumvent the
> location of the real bug: IO::Handle.
See the IO patch at <https://rt.perl.org/rt3/Ticket/Display.html?id=92728#txn-965894>. It still needs a little work, but it works in 5.10+.
This is now fixed in bleadperl by commit 986a805. I don’t know when the next separate IO release will be, though.