Bug #87686 for DBD-CSV: No HOW TO section in documentation

Wed Aug 07 17:26:20 2013 ALEENA [...] cpan.org - Ticket created

Subject: No HOW TO section in documentation

The documentation, as written now, makes the assumption the reader is savvy, leaving the unsavvy readers behind. A HOW TO section would help bridge the gap.

I have a .csv file with the headings on the first line of the file. Also several fields within the file are spread across multiple lines.

Under HOW TO, it would be nice to see:

    use strict;
    use warnings;

    use DBD::CSV;
    use Data::Dumper;

    ... # what goes here?

    print Dumper($array_of_hashes_ref) # or $hash_of_hashes_ref

Thu Aug 08 03:25:05 2013 h.m.brand [...] xs4all.nl - Correspondence added

Subject:	Re: [rt.cpan.org #87686] No HOW TO section in documentation
Date:	Thu, 8 Aug 2013 09:24:49 +0200
To:	bug-DBD-CSV [...] rt.cpan.org
From:	"H.Merijn Brand" <h.m.brand [...] xs4all.nl>

On Wed, 7 Aug 2013 17:26:20 -0400, "Lady Aleena via RT" <bug-DBD-CSV@rt.cpan.org> wrote:

> Wed Aug 07 17:26:20 2013: Request 87686 was acted upon.
> Transaction: Ticket created by ALEENA
> Queue: DBD-CSV
> Subject: No HOW TO section in documentation
> Broken in: (no value)
> Severity: Normal
> Owner: Nobody
> Requestors: ALEENA@cpan.org
> Status: new
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=87686 >
>
> The documentation, as written now, makes the assumption the reader is savvy
> leaving the unsavvy readers behind. A HOW TO section would help bridge
> the gap.

Not really. Allow me to disagree on a new section.

I admit that some of the CSV handling implies a rather steep learning curve. DBD::CSV is there to take away that curve and offer a DBI interface that adds a SQL interface to CSV files.

One of the issues with CSV files is that the format is so simple in its definition that many producers create CSV that does not comply with the basic rules. E.g. just adding "'s to all fields and passing all fields joined by ',' is what many producers interpret as valid CSV. Say

    join "," => map { qq{"$_"} } @fields;

is bound to break every possible rule about CSV, as fields might contain "'s, ,'s or newlines.

Using a correct CSV producer will ease the life of a CSV parser a lot. As many still don't, the two major CSV parsers on CPAN (Text::CSV_XS and Text::CSV, which follows Text::CSV_XS) have to offer options (attributes) that work around bad producers, so bad records like

    1,"ok","not"ok",2,"not,ok"

can be parsed as the end user expects.
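
To make the producer side concrete, here is a minimal sketch that contrasts the naive join with a producer that follows the quoting rules. It assumes Text::CSV_XS is installed; the values in @fields are made up for illustration and do not come from the ticket.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Hypothetical sample data: one field has embedded quotes, one has a comma.
my @fields = (1, 'say "hi"', 'a,b');

# Naive producer: wraps every field in quotes without escaping anything.
my $naive = join "," => map { qq{"$_"} } @fields;

# Correct producer: Text::CSV_XS doubles embedded quotes and only quotes
# the fields that need it.
my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
$csv->combine (@fields) or die "combine failed";
my $proper = $csv->string;

print "naive:  $naive\n";   # "1","say "hi"","a,b"   (ambiguous for a parser)
print "proper: $proper\n";  # 1,"say ""hi""","a,b"

# Round trip: parsing the proper line gives back the original fields.
$csv->parse ($proper) or die "parse failed";
my @back = $csv->fields;
print "round trip ok\n" if "@back" eq "@fields";
```

The naive line cannot be read back reliably because a parser cannot tell which quotes end a field. The line built with combine parses back to the same three fields.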

> I have a .csv file with the headings on the first line of the file.

Having (correct) headers is an advantage. When the header is a single line, DBD::CSV will automatically pick that up as column names.

> Also several fields within the file are spread across multiple lines.

If you mean here that fields in the data may contain newlines (please do not allow that in the header), DBD::CSV will know how to deal with that, as Text::CSV_XS knows how to deal with that by default.
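
As a small illustration of a field that spans several lines, the sketch below reads a record with an embedded newline from an in-memory string. It assumes Text::CSV_XS is used directly with binary => 1; the column names id and note and the data itself are hypothetical.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Hypothetical data: a single-line header, then one record whose second
# field contains a newline inside the quotes.
my $data = qq{id,note\n1,"line one\nline two"\n};

my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
open my $fh, "<", \$data or die "in-memory open: $!";

# The first line supplies the hash keys for the records that follow.
$csv->column_names ($csv->getline ($fh));

# One logical record, even though it covers two physical lines.
my $row = $csv->getline_hr ($fh);

print "id:   $row->{id}\n";
print "note: $row->{note}\n";
```

The note field comes back as one value with the newline preserved, which is why the header should stay on a single line while the data does not have to.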

> Under HOW TO, it would be nice to see:
>
>     use strict;
>     use warnings;
>
>     use DBD::CSV;
>     use Data::Dumper;
>
>     ... # what goes here?
>
>     print Dumper($array_of_hashes_ref) # or $hash_of_hashes_ref

You don't want that from DBD::CSV: it *is* possible, but a lot slower than using Text::CSV_XS directly.

Assuming you want to parse/read di.csv in the current folder:

    # using DBD::CSV
    use DBI;
    use Data::Peek;

    my $dbh = DBI->connect ("dbi:CSV:", undef, undef, {
        f_ext            => ".csv/r",
        csv_null         => 1,
        RaiseError       => 1,
        PrintError       => 1,
        FetchHashKeyName => "NAME_lc",
        }) or die $DBI::errstr;
    my $sth = $dbh->prepare ("select * from di");
    $sth->execute;
    my $aoh;
    while (my $ref = $sth->fetchrow_hashref) {
        push @$aoh, $ref;
        }
    DDumper ($aoh);

    # using Text::CSV_XS
    use Text::CSV_XS;
    use Data::Peek;

    my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
    open my $fh, "<", "di.csv" or die "di.csv: $!";
    $csv->column_names ($csv->getline ($fh));
    my $aoh = $csv->getline_hr_all ($fh);
    DDumper ($aoh);

These should result in the same $aoh (content-wise), but the latter is many times faster:

            Rate   dbi csvxs
    dbi    141/s    --  -98%
    csvxs 6250/s 4347%    --

-- 
H.Merijn Brand  http://tux.nl   Perl Monger  http://amsterdam.pm.org/
using perl5.00307 .. 5.19   porting perl5 on HP-UX, AIX, and openSUSE
http://mirrors.develooper.com/hpux/        http://www.test-smoke.org/
http://qa.perl.org   http://www.goldmark.org/jeff/stupid-disclaimers/

Thu Aug 08 03:25:06 2013 The RT System itself - Status changed from 'new' to 'open'

Thu Aug 29 03:08:30 2013 HMBRAND [...] cpan.org - Status changed from 'open' to 'resolved'