
This queue is for tickets about the Text-CSV_XS CPAN distribution.

Report information
The Basics
Id: 31491
Status: rejected
Priority: 0/
Queue: Text-CSV_XS

People
Owner: HMBRAND [...] cpan.org
Requestors: QIANGLI [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 0.32
Fixed in: (no value)



Subject: broken with field that started with quoted char that followed by some other chars
hi,

it seems that Text::CSV_XS is broken when handling a field that starts with a quoted char followed by some other chars. for example:

(tilde delimited lines)

    "a"a~bla # doesn't parse
    "a" a~bla # doesn't parse
    a"a"a~bla # parsed
    a "a" a~bla # parsed

i have the following set:

    sep_char => '~',
    allow_whitespace => 1,
    always_quote => 1,
    escape_char => '"',
    allow_loose_quotes => 1,
    binary => 1,
Subject: Re: [rt.cpan.org #31491] broken with field that started with quoted char that followed by some other chars
Date: Thu, 13 Dec 2007 18:07:26 +0100
To: bug-Text-CSV_XS [...] rt.cpan.org
From: "H.Merijn Brand" <h.m.brand [...] xs4all.nl>
On Thu, 13 Dec 2007 11:45:12 -0500, "Qiang Li via RT" <bug-Text-CSV_XS@rt.cpan.org> wrote:

> it seems that Text::CSV_XS is broken when handling field that started
> with quoted char that followed by some other chars. for example:
>
> (tilda delimited lines)
>
> "a"a~bla # doesn't parse
> "a" a~bla # doesn't parse
> a"a"a~bla # parsed
> a "a" a~bla # parsed
>
> i have the following set

and where exactly did you alter the default quote character?

> sep_char => '~',
> allow_whitespace => 1,
> always_quote => 1,
> escape_char => '"',
> allow_loose_quotes => 1,
> binary => 1,

    quote_char => undef,
    quote_char => "'",
    quote_char => "\0",

-- 
H.Merijn Brand Amsterdam Perl Mongers (http://amsterdam.pm.org/)
using & porting perl 5.6.2, 5.8.x, 5.10.x on HP-UX 10.20, 11.00, 11.11,
& 11.23, SuSE 10.1 & 10.2, AIX 5.2, and Cygwin. http://qa.perl.org
http://mirrors.develooper.com/hpux/ http://www.test-smoke.org
http://www.goldmark.org/jeff/stupid-disclaimers/

/home/merijn 109 > perl xx.pl
<1>, <2>, <3>, <4>
<1>, <"s>, <3"s>, <"a"b>
nb09:/home/merijn 110 > cat xx.pl
#!/pro/bin/perl

use strict;
use warnings;

use IO::Handle;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({
    escape_char => "\\",
    quote_char => undef,
    binary => 1,
    eol => ">\n",
    });

local $" = ">, <";
while (my $f = $csv->getline (*DATA)) {
    print "<@{$f}>\n";
    }
__END__
1,2,3,4
1,"s,3"s,"a"b
/home/merijn 111 > perl xx.pl
<1>, <2>, <3>, <4>
<1>, <"s>, <3"s>, <"a"b>
/home/merijn 112 >
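
For readers who want to try the quote_char suggestion against the tilde-delimited lines from the original report, here is a minimal sketch. It is not from the thread: the helper name parse_fields is made up for illustration, and setting escape_char to undef as well is an assumption that relies on a reasonably recent Text::CSV_XS release. It prints whatever the installed version returns, so no particular output is claimed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV_XS;

# Sketch only: '~' as separator, and no quote or escape character at all,
# so a literal " in the data is just an ordinary character.
# (escape_char => undef is an assumption, see the note above.)
my $csv = Text::CSV_XS->new ({
    sep_char    => '~',
    quote_char  => undef,
    escape_char => undef,
    binary      => 1,
    }) or die "cannot create parser: " . Text::CSV_XS->error_diag;

# Hypothetical helper: returns an array ref of fields, or undef on failure.
sub parse_fields {
    my ($line) = @_;
    return $csv->parse ($line) ? [ $csv->fields ] : undef;
}

# The four sample lines from the report.
for my $line ('"a"a~bla', '"a" a~bla', 'a"a"a~bla', 'a "a" a~bla') {
    if (my $fields = parse_fields ($line)) {
        print join (" | ", map { "<$_>" } @$fields), "\n";
    }
    else {
        print "failed: $line (" . $csv->error_diag . ")\n";
    }
}
```

Swapping in quote_char => "'" or quote_char => "\0" (the other two options listed in the reply) only requires changing that one attribute.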
Subject: Re: [rt.cpan.org #31491] broken with field that started with quoted char that followed by some other chars
Date: Fri, 14 Dec 2007 10:48:58 -0500
To: bug-Text-CSV_XS [...] rt.cpan.org
From: "Qiang (James) Li" <shijialee [...] gmail.com>
thanks for the prompt reply. see my reply below..

On Dec 13, 2007 12:08 PM, h.m.brand@xs4all.nl via RT <bug-Text-CSV_XS@rt.cpan.org> wrote:

> > i have the following set
>
> and where exactly did you alter the default quote character?
>
> > sep_char => '~',
> > allow_whitespace => 1,
> > always_quote => 1,
> > escape_char => '"',
> > allow_loose_quotes => 1,
> > binary => 1,
>
> quote_char => undef,
> quote_char => "'",
> quote_char => "\0",

the purpose of my script is to convert a file to an excel friendly format. since there may be commas inside a field, i need to double quote the field. as to your question:

do i need to specify the quote_char?

the doc says

    The char used for quoting fields containing blanks, by default the
    double quote character (").

so the quote char is double quote if i don't specify it.. no ?

also, is the doc saying "fields containing ONLY blanks"? it isn't very clear to me.

and i don't understand why you are using escape_char => "\\" ? i am using escape_char => '"' and hoping to escape the double quote inside a field.

and your example script doesn't handle 1,"s,3"s,"a"b correctly. i expect "s,3"s to be one field, but your script splits on the inside comma.

i will find some time to play with Text::CSV_XS. thanks for the help.

Qiang
Subject: Re: [rt.cpan.org #31491] broken with field that started with quoted char that followed by some other chars
Date: Fri, 14 Dec 2007 16:58:56 +0100
To: bug-Text-CSV_XS [...] rt.cpan.org
From: "H.Merijn Brand" <h.m.brand [...] xs4all.nl>
On Fri, 14 Dec 2007 10:49:50 -0500, "Qiang (James) Li via RT" <bug-Text-CSV_XS@rt.cpan.org> wrote:

> the purpose of my script is to convert a file to a excel friendly
> format file.

Look in examples to see my stab at that: csv2xls
That creates the excel file for you at once.

> since there may have comma inside the field, i need to
> double quote the field. as to your question:
>
> do i need to specify the quote_char?

only if you want it to be something different than the default double-quote, which happens to be the CSV standard quoting character.

> the doc says
>
> The char used for quoting fields containing blanks, by default the
> double quote character (").
>
> so the quote char is double quote if i don't specify it.. no ?

Yes

> also is the doc saying "fields containing ONLY blanks"? it isn't very
> clear to me,

That is still open to international discussion. In Text::CSV_XS there is currently no way to see the difference between

    1,,2

and

    1,"",2

For your question, the line

    1, ,2

is illegal, as fields that contain any special character should be quoted, so

    1," ",2

> and i don't understand why you are using escape_char => "\\" ? i am
> using escape_char => '"' and hoping to escape the double quote inside
> a field.

It is not a wise idea to leave the escape character set to " and undef the quote character, because it will lead to confusion. If you leave it,

    1,123"2,3

will cause an error, because the 2 after the " will be seen as an escaped character, and as it is not special, it should not be escaped.

> and your example script doesn't handle 1,"s,3"s,"a"b correctly. as i
> expect "s,3"s is one field but your script split on the inside comma.
>
> i will find some time to play with Text::CSV_XS. thanks for the help.
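
The escape_char point in the last reply can be checked with a small sketch like the one below. It is not part of the thread: try_parse is a hypothetical helper, and the exact diagnostic (or whether the first case fails at all) may depend on the installed Text::CSV_XS version, so the script only reports what it sees.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV_XS;

# Hypothetical helper: parse one line with the given escape character and
# no quote character, and return a short report string instead of dying.
sub try_parse {
    my ($escape_char, $line) = @_;
    my $csv = Text::CSV_XS->new ({
        quote_char  => undef,
        escape_char => $escape_char,
        binary      => 1,
        }) or die "cannot create parser: " . Text::CSV_XS->error_diag;
    return $csv->parse ($line)
        ? "ok: "     . join (" | ", map { "<$_>" } $csv->fields)
        : "failed: " . $csv->error_diag;
}

# The line from the reply, tried with " and with \ as the escape character.
my $line = '1,123"2,3';
for my $esc ('"', "\\") {
    printf "escape_char %s => %s\n", $esc, try_parse ($esc, $line);
}
```

With the backslash as escape character the double quote in the data is an ordinary character, which matches the settings used in the xx.pl example earlier in the ticket.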