
This queue is for tickets about the DBD-CSV CPAN distribution.

Report information
The Basics
Id: 87686
Status: resolved
Priority: 0/
Queue: DBD-CSV

People
Owner: Nobody in particular
Requestors: ALEENA [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: (no value)
Fixed in: (no value)



Subject: No HOW TO section in documentation
The documentation, as written now, makes the assumption the reader is savvy leaving the unsavvy readers behind. A HOW TO section would help bridge the gap.

I have a .csv file with the headings on the first line of the file. Also several fields within the file are spread across multiple lines.

Under HOW TO, it would be nice to see:

    use strict;
    use warnings;

    use DBD::CSV;
    use Data::Dumper;

    ... # what goes here?

    print Dumper($array_of_hashes_ref) # or $hash_of_hashes_ref

Subject: Re: [rt.cpan.org #87686] No HOW TO section in documentation
Date: Thu, 8 Aug 2013 09:24:49 +0200
To: bug-DBD-CSV [...] rt.cpan.org
From: "H.Merijn Brand" <h.m.brand [...] xs4all.nl>
On Wed, 7 Aug 2013 17:26:20 -0400, "Lady Aleena via RT" <bug-DBD-CSV@rt.cpan.org> wrote:

> Wed Aug 07 17:26:20 2013: Request 87686 was acted upon.
> Transaction: Ticket created by ALEENA
>        Queue: DBD-CSV
>      Subject: No HOW TO section in documentation
>    Broken in: (no value)
>     Severity: Normal
>        Owner: Nobody
>   Requestors: ALEENA@cpan.org
>       Status: new
>  Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=87686 >
>
> The documentation, as written now, makes the assumption the reader is savvy
> leaving the unsavvy readers behind. A HOW TO section would help bridge
> the gap.

Not really. Allow me to disagree on a new section.

I admit that some of the CSV handling implies a rather steep learning curve. DBD::CSV is there to take away that curve and offer a DBI interface that adds a SQL interface to CSV files.

One of the issues with CSV files is that the format is so simple by definition that many producers create CSV that does not comply with the basic rules. For example, just adding "'s to all fields and passing all fields joined by ',' is what many producers interpret as valid CSV:

    say join "," => map { qq{"$_"} } @fields;

This is bound to break every possible rule about CSV, as fields might contain "'s, ,'s or newlines.

Using a correct CSV producer will ease the life of a CSV parser a lot. As many still don't, the two major CSV parsers on CPAN (Text::CSV_XS and Text::CSV, which follows Text::CSV_XS) have to offer options (attributes) that work around bad producers, so bad records like

    1,"ok","not"ok",2,"not,ok"

can be parsed as the end user expects.

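To make the difference concrete, here is a minimal sketch in core Perl that puts the naive join next to a producer that doubles embedded quotes. The helper `csv_field` and the sample fields are made up for this illustration and are not part of any module; real code would more likely leave this to Text::CSV_XS.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: wrap a field in double
# quotes and double any embedded double quote, as well-formed CSV expects.
sub csv_field {
    my ($value) = @_;
    (my $quoted = $value) =~ s/"/""/g;
    return qq{"$quoted"};
}

# Sample data (an assumption): one field has quotes, one has a comma.
my @fields = (1, 'say "hi"', "a,b");

# The naive producer: embedded quotes end the field too early.
my $naive = join "," => map { qq{"$_"} } @fields;

# The careful producer: embedded quotes are escaped by doubling.
my $safe = join "," => map { csv_field ($_) } @fields;

print "naive: $naive\n";
print "safe:  $safe\n";
```

The naive line contains `"say "hi""`, which a strict parser cannot split reliably. The safe line contains `"say ""hi"""`, which it can.
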
> I have a .csv file with the headings on the first line of the file.
Having (correct) headers is an advantage. When the header is a single line, DBD::CSV will automatically pick that up as column names.

> Also several fields within the file are spread across multiple lines.
If you mean here that fields in the data may contain newlines (please do not allow that in the header), DBD::CSV will know how to deal with that, because Text::CSV_XS knows how to deal with that by default.

> Under HOW TO, it would be nice to see:
>
> use strict;
> use warnings;
>
> use DBD::CSV;
> use Data::Dumper;
>
> ... # what goes here?
>
> print Dumper($array_of_hashes_ref) # or $hash_of_hashes_ref

You don't want that from DBD::CSV, though it *is* possible, but a lot slower than when using Text::CSV_XS directly.

Assuming you want to parse/read di.csv in the current folder:

    # using DBD::CSV
    use DBI;
    use Data::Peek;

    my $dbh = DBI->connect ("dbi:CSV:", undef, undef, {
        f_ext            => ".csv/r",
        csv_null         => 1,
        RaiseError       => 1,
        PrintError       => 1,
        FetchHashKeyName => "NAME_lc",
        }) or die $DBI::errstr;

    my $sth = $dbh->prepare ("select * from di");
    $sth->execute;
    my $aoh;
    while (my $ref = $sth->fetchrow_hashref) {
        push @$aoh, $ref;
        }
    DDumper ($aoh);

    # using Text::CSV_XS
    use Text::CSV_XS;
    use Data::Peek;

    my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
    open my $fh, "<", "di.csv" or die "di.csv: $!";
    $csv->column_names ($csv->getline ($fh));
    my $aoh = $csv->getline_hr_all ($fh);
    DDumper ($aoh);

These should result in the same $aoh (content-wise), but the latter is factors faster:

            Rate   dbi csvxs
    dbi    141/s    --  -98%
    csvxs 6250/s 4347%    --

-- 
H.Merijn Brand  http://tux.nl   Perl Monger  http://amsterdam.pm.org/
using perl5.00307 .. 5.19   porting perl5 on HP-UX, AIX, and openSUSE
http://mirrors.develooper.com/hpux/        http://www.test-smoke.org/
http://qa.perl.org   http://www.goldmark.org/jeff/stupid-disclaimers/
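
The original request also mentioned a hash of hashes. A possible sketch with Text::CSV_XS follows. It assumes the file has a column with unique values to use as the key; the file name `di_sample.csv` and the columns `id`, `name` and `note` are invented for this example, and the sample file is written first so the script is self-contained. One field contains a newline to show that a record spread across lines is read as a single row.

```perl
use strict;
use warnings;
use Text::CSV_XS;
use Data::Dumper;

# Write a small sample file (made-up content) with a header line
# and one quoted field that spans two lines.
my $file = "di_sample.csv";
open my $out, ">", $file or die "$file: $!";
print $out qq{id,name,note\n};
print $out qq{1,foo,"first line\nsecond line"\n};
print $out qq{2,bar,plain\n};
close $out or die "$file: $!";

# binary => 1 allows embedded newlines inside quoted fields.
my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
open my $fh, "<", $file or die "$file: $!";

# Use the first line as the column names for getline_hr.
$csv->column_names ($csv->getline ($fh));

# Build the hash of hashes, keyed on the (assumed unique) id column.
my %hoh;
while (my $row = $csv->getline_hr ($fh)) {
    $hoh{$row->{id}} = $row;
}
close $fh;
unlink $file;

print Dumper (\%hoh);
```

If the key column is not unique, later rows silently replace earlier ones, so the array of hashes shown above is the safer default.
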