
Preferred bug tracker

Please visit the preferred bug tracker to report your issue.

This queue is for tickets about the Spreadsheet-ParseExcel CPAN distribution.

Maintainer(s)' notes

If you are reporting a bug in Spreadsheet::ParseExcel, here are some pointers:

1) State the issue as clearly and as concisely as possible. A simple program or Excel test file (see below) will often explain the issue better than a lot of text.

2) Provide information on your system, version of perl and module versions. The following program will generate everything that is required. Put this information in your bug report.

    #!/usr/bin/perl -w

    print "\n    Perl version   : $]";
    print "\n    OS name        : $^O";
    print "\n    Module versions: (not all are required)\n";

    my @modules = qw(
                      Spreadsheet::ParseExcel
                      Scalar::Util
                      Unicode::Map
                      Spreadsheet::WriteExcel
                      Parse::RecDescent
                      File::Temp
                      OLE::Storage_Lite
                      IO::Stringy
                    );

    for my $module (@modules) {
        my $version;
        eval "require $module";

        if (not $@) {
            $version = $module->VERSION;
            $version = '(unknown)' if not defined $version;
        }
        else {
            $version = '(not installed)';
        }

        printf "%21s%-24s\t%s\n", "", $module, $version;
    }

    __END__

3) Upgrade to the latest version of Spreadsheet::ParseExcel (or at least test on a system with an upgraded version). The issue you are reporting may already have been fixed.

4) Create a small example program that demonstrates your problem. The program should be as small as possible. A few lines of code are worth tens of lines of text when trying to describe a bug.

5) Supply an Excel file that demonstrates the problem. This is very important. If the file is big, or contains confidential information, try to reduce it down to the smallest Excel file that represents the issue. If you don't wish to post a file here then send it to me directly: jmcnamara@cpan.org

6) Say if the test file was created by Excel, OpenOffice, Gnumeric or something else. Say which version of that application you used.

7) If you are submitting a patch you should check with the maintainer whether the issue has already been patched or if a fix is in the works. Patches should be accompanied by test cases.

Asking a question

If you would like to ask a more general question there is the Spreadsheet::ParseExcel Google Group.

Report information
The Basics
Id: 41337
Status: resolved
Worked: 2 hours (120 min)
Priority: 0/
Queue: Spreadsheet-ParseExcel

People
Owner: Nobody in particular
Requestors: blueboy.geo [...] yahoo.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Spreadsheet::ParseExcel::Utility::xls2csv
Date: Mon, 1 Dec 2008 01:09:39 -0800 (PST)
To: bug-Spreadsheet-ParseExcel [...] rt.cpan.org
From: Fredrik Linde <blueboy.geo [...] yahoo.com>
Hi!

I'm using Spreadsheet::ParseExcel::Utility::xls2csv together with the Text::CSV_XS getline function. I have discovered that the xls2csv implementation only dumps the content of each cell between two commas. It would be more appropriate to follow a CSV grammar so that the extracted data can be used with other modules like Text::CSV_XS.

I have quoted the two sources I used for my implementation. I have not looked into the "header = name *(COMMA name)" rule, but I think it will work in a sufficient way.

Best Regards
/Fredrik Linde

"CSV is a delimited data format that has fields/columns separated by the comma character and records/rows separated by newlines. Fields that contain a special character (comma, newline, or double quote), must be enclosed in double quotes. However, if a line contains a single entry which is the empty string, it may be enclosed in double quotes. If a field's value contains a double quote character it is escaped by placing another double quote character next to it. The CSV file format does not require a specific character encoding, byte order, or line terminator format.

* Each record is one line terminated by a line feed (ASCII/LF=0x0A) or a carriage return and line feed pair (ASCII/CRLF=0x0D 0x0A), however, line-breaks can be embedded.
* Fields are separated by commas.
* Allowable characters within a CSV field include 0x09 (tab) and the inclusive range of 0x20 (space) through 0x7E (tilde). In binary mode all characters are accepted, at least in quoted fields.
* A field within CSV must be surrounded by double-quotes to contain the separator character (comma)."
-http://search.cpan.org/~hmbrand/Text-CSV_XS-0.58/CSV_XS.pm

"2. Definition of the CSV Format

While there are various specifications and implementations for the CSV format (for ex. [4], [5], [6] and [7]), there is no formal specification in existence, which allows for a wide variety of interpretations of CSV files. This section documents the format that seems to be followed by most implementations:

1. Each record is located on a separate line, delimited by a line break (CRLF). For example:

    aaa,bbb,ccc CRLF
    zzz,yyy,xxx CRLF

2. The last record in the file may or may not have an ending line break. For example:

    aaa,bbb,ccc CRLF
    zzz,yyy,xxx

3. There maybe an optional header line appearing as the first line of the file with the same format as normal record lines. This header will contain names corresponding to the fields in the file and should contain the same number of fields as the records in the rest of the file (the presence or absence of the header line should be indicated via the optional "header" parameter of this MIME type). For example:

    field_name,field_name,field_name CRLF
    aaa,bbb,ccc CRLF
    zzz,yyy,xxx CRLF

4. Within the header and each record, there may be one or more fields, separated by commas. Each line should contain the same number of fields throughout the file. Spaces are considered part of a field and should not be ignored. The last field in the record must not be followed by a comma. For example:

    aaa,bbb,ccc

5. Each field may or may not be enclosed in double quotes (however some programs, such as Microsoft Excel, do not use double quotes at all). If fields are not enclosed with double quotes, then double quotes may not appear inside the fields. For example:

    "aaa","bbb","ccc" CRLF
    zzz,yyy,xxx

6. Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes. For example:

    "aaa","b CRLF
    bb","ccc" CRLF
    zzz,yyy,xxx

7. If double-quotes are used to enclose fields, then a double-quote appearing inside a field must be escaped by preceding it with another double quote. For example:

    "aaa","b""bb","ccc"
"
-http://tools.ietf.org/html/rfc4180#section-2
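
For readers following along, the quoting rules cited above (RFC 4180 section 2, rules 5 to 7) can be sketched in plain Perl roughly as follows. This is only an illustration under stated assumptions: it is not the implementation attached to this ticket, nor what Utility::xls2csv() actually does, and the names csv_quote_field and csv_join_record are hypothetical helpers made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of Spreadsheet::ParseExcel): quote a
# single field following rules 5-7 of RFC 4180, section 2.
sub csv_quote_field {
    my ($field) = @_;
    $field = '' if not defined $field;

    # Only fields holding a comma, double quote, CR or LF need quoting.
    if ( $field =~ /[",\r\n]/ ) {
        $field =~ s/"/""/g;       # Escape embedded quotes by doubling them.
        $field = qq{"$field"};    # Enclose the whole field in double quotes.
    }

    return $field;
}

# Hypothetical helper: join the quoted fields of one row into a record.
# The caller is expected to add the line terminator.
sub csv_join_record {
    my (@fields) = @_;
    return join ',', map { csv_quote_field($_) } @fields;
}

print csv_join_record( 'aaa', 'b"bb', 'c,cc' ), "\n";
```

Quoting only the fields that need it keeps simple cells readable while leaving the output parseable by a grammar-aware reader. In practice an established module such as Text::CSV_XS is probably a safer choice than hand-rolled quoting.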

Message body is not shown because sender requested not to inline it.

On Mon Dec 01 04:10:08 2008, blueboy.geo@yahoo.com wrote:
> I'm using Spreadsheet::ParseExcel::Utility::xls2csv together with
> Text::CSV_XS getline function.

Hi,

For converting from xls to csv I'd recommend using the xls2csv or xlscat programs in Spreadsheet::Read:

    http://search.cpan.org/src/HMBRAND/Spreadsheet-Read-0.29/examples/xls2csv
    http://search.cpan.org/src/HMBRAND/Spreadsheet-Read-0.29/examples/xlscat

or the following xls2csv:

    http://search.cpan.org/~ken/xls2csv-1.06/script/xls2csv

I'll probably deprecate the Utility function in favour of the above.

John.
--

Hi,

I've added Text::CSV_XS CSV handling to Utility::xls2csv() in Spreadsheet::ParseExcel version 0.49.

However, you should still consider using one of the other xls2csv programs listed in the docs.

Thanks for highlighting this issue,

John.
--