
This queue is for tickets about the Text-CSV_XS CPAN distribution.

Report information
The Basics
Id: 131696
Status: resolved
Priority: 0/
Queue: Text-CSV_XS

People
Owner: Nobody in particular
Requestors: violapiratejunky [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: Wishlist
Broken in: (no value)
Fixed in: (no value)



Subject: Ability to ignore trailing commas in header?
I often get CSVs that have trailing commas in the header (empty header fields). LibreOffice etc. open these fine, but they cause this in Text::CSV_XS:

    # CSV_XS ERROR: 1012 - INI - the header contains an empty field @ rec 1 pos 0

Is it possible that we could have an option to ignore trailing commas in the header (and rows, if that applies)? Right now I just open the file and remove any trailing commas from the first line before parsing it with Text::CSV_XS.
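The workaround described above (removing trailing commas from the first line before parsing) might look roughly like the following. This is only a sketch in core Perl: the helper name strip_trailing_commas and the sample data are made up for illustration, and it touches nothing but the header line.

```perl
use strict;
use warnings;

# Hypothetical helper: remove trailing commas (empty trailing header fields)
# from a single line, keeping the line ending intact.
sub strip_trailing_commas {
    my ($line) = @_;
    $line =~ s/,+(?=\r?\n?\z)//;
    return $line;
}

# Made-up sample: header and data row both end in two empty fields.
my $csv_text = "id,name,,\n1,alice,,\n";

open my $fh, "<", \$csv_text or die "Cannot open in-memory file: $!";
my $header  = strip_trailing_commas (scalar <$fh>);
my $cleaned = join "", $header, <$fh>;    # data rows are left untouched
close $fh;

print $cleaned;
```

The cleaned text can then be handed to the parser instead of the original file. Note that the data rows still carry their trailing fields, so they now have more columns than the header names.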
As there are (plenty) of options already to do so, I'd need to know how you open the file and what initialization you use to parse the header and the file.

A (big) danger in stripping empty header fields is that hashes cannot be filled, as those "trailing" header-less columns quite often *do* contain data that cannot be stored in hash-entries with no label/key.

See https://github.com/Tux/Text-CSV_XS/blob/master/doc/CSV_XS.md#getline_hr

See https://github.com/Tux/Text-CSV_XS/blob/master/doc/CSV_XS.md#callbacks-1 for a bunch of possible solutions.
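The danger mentioned above can be illustrated with a plain Perl hash slice. This is only a sketch of what happens to any hash keyed by column name, not a description of Text::CSV_XS internals; the header and field values are made up.

```perl
use strict;
use warnings;

# Made-up header with two unnamed trailing columns, and a row that
# does carry data in those columns.
my @header = ("id", "name", "", "");
my @fields = ("1", "alice", "note A", "note B");

# Build a row hash keyed by column name, as any hash-based reader would.
my %row;
@row{@header} = @fields;

# Both unnamed columns map to the same key "", so the later value wins
# and "note A" is silently lost.
printf "%d keys for %d fields\n", scalar (keys %row), scalar @fields;
print  "value under the empty key: $row{''}\n";
```

Four fields end up in three hash entries, which is the kind of silent data loss that error 1012 guards against.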
Thanks for pointing me to those solutions! I think using the error callback and ignoring the error code is probably the correct way to handle that. I guess there could be an option to do the callback I would write, but maybe that's too specific to warrant its own option, especially when you point out that often it can be a mistake since the columns really do contain data.
Actually, I'm not sure that this works? I have this:

    my $rows = csv(
        in => $fh,
        headers => 'auto',
        encoding => ':encoding(utf-8)',
        auto_diag => 1,
        callbacks => {
            error => sub {
                my ($err) = @_;
                say "I'm IN HERE $err";
                if ($err == 1012) {
                    say "Setting diag to 0";
                    Text::CSV_XS->SetDiag(0);
                }

                return;
            },
        },
    );

I can see that the code does get in there, but I still get this error and my script stops:

    INI - the header contains an empty field at /path/to/script.pl
Subject: Re: [rt.cpan.org #131696] Ability to ignore trailing commas in header?
Date: Sat, 23 May 2020 13:43:26 +0200
To: bug-Text-CSV_XS [...] rt.cpan.org
From: "H.Merijn Brand" <h.m.brand [...] xs4all.nl>
This is because the "header" method calls "croak": on these errors, subsequent behavior is undefined, and stopping is safer than trying to continue. Ignoring this error is most likely to cause failures later on.

What you want for the header is a munger, as described in the manual. For "db", that would be something like:

--8<---
use Data::Peek;
use Text::CSV_XS qw( csv );

my $no_col = "nc000";
my $rows = csv (
    in        => *DATA,
    encoding  => ":encoding(utf-8)",
    bom       => 1,
    munge     => sub { lc (s/\W+/_/gr =~ s/^_+//r) || $no_col++ },
    auto_diag => 1,
    callbacks => {
        error => sub {
            my ($err) = @_;
            say "I'm IN HERE $err";
            if ($err == 1012) {
                say "Setting diag to 0";
                Text::CSV_XS->SetDiag (0);
                }
            return;
            },
        },
    );
DDumper $rows;
-->8---

=>

I'm IN HERE 2012
[
    {   a     => '1',
        b     => '2',
        c     => '3',
        nc000 => ''
        }
    ]

--
H.Merijn Brand http://tux.nl Perl Monger http://amsterdam.pm.org/
using perl5.00307 .. 5.31 porting perl5 on HP-UX, AIX, and Linux
https://useplaintext.email https://tux.nl http://www.test-smoke.org
http://qa.perl.org http://www.goldmark.org/jeff/stupid-disclaimers/
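To see what the munge callback in the reply above does to individual header fields, it can be applied by hand in core Perl. This is only a sketch: the header names are made up, and calling the sub once per field with the field in $_ is an assumption based on how the callback in the reply is written.

```perl
use strict;
use warnings;

# Same munge logic as in the reply above, stored so it can be called by hand.
# Needs perl 5.14 or later for the /r (non-destructive substitution) flag.
my $no_col = "nc000";
my $munge  = sub { lc (s/\W+/_/gr =~ s/^_+//r) || $no_col++ };

# Made-up header fields, including two empty trailing ones.
my @header = ("Name", "E-mail addr", "", "");

# map aliases $_ to each field, which is what the callback reads.
my @munged = map { $munge->($_) } @header;

print join (",", @munged), "\n";    # name,e_mail_addr,nc000,nc001
```

Named fields are lowercased with runs of non-word characters turned into underscores, while each empty field gets the next placeholder name (nc000, nc001, ...) through Perl's string increment, so no header field stays empty and no two unnamed columns collide.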