
This queue is for tickets about the DBD-CSV CPAN distribution.

Report information
The Basics
Id: 44583
Status: resolved
Priority: 0/
Queue: DBD-CSV

People
Owner: HMBRAND [...] cpan.org
Requestors: reinpost [...] win.tue.nl
Cc: cristiano.passerini [...] gmail.com
AdminCc:

Bug Information
Severity: Important
Broken in: 0.22
Fixed in: 0.25



CC: cristiano.passerini [...] gmail.com
Subject: DBD::CSV cannot read CSV files with dots on the first line
DBD::CSV interprets the first line of a CSV file it reads as the list of column names. This interpretation fails when the column names contain dots.

Code example:

    my $csvh = DBI->connect('dbi:CSV:')
        or die "the CSV parser doesn't work (not installed?)\n";
    my @colnames;
    $csvh->{'csv_tables'}->{'input'} = { file => $csvfile, sep_char => ',' };
    my $sth = $csvh->prepare('SELECT * FROM input');
    $sth->execute;

The execute statement will execute SQL::Statement->SELECT, which invokes

    $self->verify_columns( $data, $eval, $all_cols );

to convert the column names into SQL::Statement::Column objects. It is when creating these objects that names with dots are interpreted as standing for $tablename.$columnname combinations, wrongly in this use case. The resulting shortened $columnname can no longer be found in a hash, with the result that fetchrow_hashref and related methods return column names with values of undef.

It is not 100% clear to me where exactly this problem should be rectified, but probably in DBD::CSV.

The problem is worsened by the fact that the documentation is unclear on how to work around the bug: it does mention a facility to avoid interpreting the first row as the column names, but I can't figure out how to use it. This limitation should at least be mentioned.

This bug has been reported before, but to a different RT: http://rt.perl.org/rt3//Public/Bug/Display.html?id=39466 and was rejected there for that reason.
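The failure mode described above can be sketched in a few lines. This is an illustrative model, not SQL::Statement's actual code: a dotted header name is split into a table qualifier and a shortened column name, and the shortened name then misses the row hash entirely.

```python
# Illustrative sketch (not SQL::Statement's real implementation) of how a
# dotted column name gets misparsed as a $tablename.$columnname qualifier.

def parse_column(name):
    """Mimic qualifier splitting: 'tbl.col' -> ('tbl', 'col')."""
    if "." in name:
        table, column = name.split(".", 1)
        return table, column
    return None, name

# Header taken from the CSV file's first line; 'win.tue.nl' contains dots.
row = {"id": "1", "win.tue.nl": "reinpost"}

table, column = parse_column("win.tue.nl")
# The shortened name 'tue.nl' is not a key in the row hash, so the
# lookup yields None (the analogue of Perl's undef).
value = row.get(column)
print(table, column, value)  # -> win tue.nl None
```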
Would following mdb-tools' approach work for you?

    -S  Sanitize names (replace spaces etc. with underscore)

I could add an option to enable otherwise bad header rows by replacing all dubious characters with an underscore, like

    s{[\x00-\x20'";,/\\]}{_}g for @$row;
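The effect of the proposed substitution can be mirrored in Python for a quick check; the character class below is a direct translation of the Perl regex. Note that, as written, it replaces control characters, spaces, quotes, semicolons, commas, and slashes, but does not touch dots.

```python
import re

# Python translation of the proposed Perl substitution
# s{[\x00-\x20'";,/\\]}{_}g, applied to each header field.
DUBIOUS = re.compile(r"[\x00-\x20'\";,/\\]")

def sanitize(name):
    """Replace each dubious character with an underscore."""
    return DUBIOUS.sub("_", name)

headers = ["unit price", 'say "hi"', "a,b", "win.tue.nl"]
print([sanitize(h) for h in headers])
# -> ['unit_price', 'say__hi_', 'a_b', 'win.tue.nl']
```

As the last element shows, dots survive this sanitization unchanged, so the option would clean up many bad header rows without by itself resolving the dotted-name case reported here.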