This queue is for tickets about the Spreadsheet-ParseExcel CPAN distribution.
Maintainer(s)' notes
If you are reporting a bug in Spreadsheet::ParseExcel, here are some pointers:
1) State the issue as clearly and as concisely as possible. A simple program or Excel test file (see below) will often explain the issue better than a lot of text.
2) Provide information on your system, your version of Perl and your module versions. The following program will generate everything that is required. Put this information in your bug report.
#!/usr/bin/perl -w

print "\n Perl version : $]";
print "\n OS name : $^O";
print "\n Module versions: (not all are required)\n";

my @modules = qw(
    Spreadsheet::ParseExcel
    Scalar::Util
    Unicode::Map
    Spreadsheet::WriteExcel
    Parse::RecDescent
    File::Temp
    OLE::Storage_Lite
    IO::Stringy
);

for my $module (@modules) {
    my $version;
    eval "require $module";

    if (not $@) {
        $version = $module->VERSION;
        $version = '(unknown)' if not defined $version;
    }
    else {
        $version = '(not installed)';
    }

    printf "%21s%-24s\t%s\n", "", $module, $version;
}

__END__
3) Upgrade to the latest version of Spreadsheet::ParseExcel (or at least test on a system with an upgraded version). The issue you are reporting may already have been fixed.
4) Create a small example program that demonstrates your problem. The program should be as small as possible. A few lines of code are worth tens of lines of text when trying to describe a bug.
5) Supply an Excel file that demonstrates the problem. This is very important. If the file is big, or contains confidential information, try to reduce it down to the smallest Excel file that represents the issue. If you don't wish to post a file here then send it to me directly: jmcnamara@cpan.org
6) Say if the test file was created by Excel, OpenOffice, Gnumeric or something else. Say which version of that application you used.
7) If you are submitting a patch you should check with the maintainer whether the issue has already been patched or if a fix is in the works. Patches should be accompanied by test cases.
Asking a question
If you would like to ask a more general question there is the Spreadsheet::ParseExcel Google Group.
Owner: Nobody in particular
Requestors: josh.ritter [...] gmail.com
Cc:
AdminCc:
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)
Attachments: 4.0.xls, parse01.pl, test.pl
Mon Dec 21 15:14:32 2009
josh.ritter [...] gmail.com - Ticket created
Running the latest version of ParseExcel.
The attached file has some columns formatted to '0.00', and while warns
in _NewCell confirm that the FmtIdx for those cells is 2, the
$oBook->{FormatStr} does not contain that entry so the cell is created
empty.
I added a warn on line 542 of ParseExcel that looks like:
warn $oBook->{Format}[$iF]->{FmtIdx};
to confirm that my number fields do have the correct formatting, and
then ran the attached test.pl file. So it seems like whatever setup is
filling the FormatStr is not catching that format.
Taking the hash that exists in ParseExcel::FmtDefault called
%hFmtDefault and dumping that into FormatStr did make things work, but
that is obviously a poor solution.
use strict;
use warnings;
use Spreadsheet::ParseExcel;
my $parser = Spreadsheet::ParseExcel->new();
my $book = $parser->parse('/home/jdr99/Dropbox/4.0.xls');
use Data::Dumper;
warn Dumper $book->{FormatStr};
1;
Mon Dec 21 16:06:26 2009
jmcnamara [...] cpan.org - Correspondence added
On Mon Dec 21 15:14:32 2009, josh.ritter wrote:
> Running the latest version of ParseExcel.
>
> The attached file has some columns formatted to '0.00', and while warns
> in _NewCell confirm that the FmtIdx for those cells is 2, the
> $oBook->{FormatStr} does not contain that entry so the cell is created
> empty.
Hi Josh,
Thanks for that. I'll look into it and let you know.
John.
--
Mon Dec 21 16:06:27 2009
The RT System itself - Status changed from 'new' to 'open'
Tue Dec 22 08:02:16 2009
jmcnamara [...] cpan.org - Correspondence added
On Mon Dec 21 15:14:32 2009, josh.ritter wrote:
> Running the latest version of ParseExcel.
>
> The attached file has some columns formatted to '0.00', and while warns
> in _NewCell confirm that the FmtIdx for those cells is 2, the
> $oBook->{FormatStr} does not contain that entry so the cell is created
> empty.
Hi,
Are you referring to the data in column L of the sheet "Scores"? I can
see that it is formatted to '0.00' so I am guessing that it is the
column in question.
When I run the following program (with the latest version of ParseExcel):
#!/usr/bin/perl -w
use strict;
use Spreadsheet::ParseExcel;
my $parser = Spreadsheet::ParseExcel->new();
my $workbook = $parser->parse('4.0.xls');
if ( !defined $workbook ) {
die $parser->error(), ".\n";
}
for my $worksheet ( $workbook->worksheets() ) {
print "Worksheet name: ", $worksheet->get_name(), "\n\n";
my ( $row_min, $row_max ) = $worksheet->row_range();
my ( $col_min, $col_max ) = $worksheet->col_range();
for my $row ( $row_min .. $row_max ) {
for my $col ( $col_min .. $col_max ) {
my $cell = $worksheet->get_cell( $row, $col );
next unless $cell;
print " Row, Col = ($row, $col)\n";
print " Value = ", $cell->value(), "\n";
print " Unformatted = ", $cell->unformatted(), "\n";
print "\n";
}
}
}
__END__
I get the following (shortened) output for column L:
...
Row, Col = (2, 11)
Value = Quizzes3
Unformatted = Quizzes3
...
Row, Col = (5, 11)
Value = 98.12
Unformatted = 98.1234
...
Row, Col = (6, 11)
Value = 98.24
Unformatted = 98.2445
So I'm not seeing the issue. If that isn't what you meant let me know.
John.
--
Tue Dec 22 08:03:55 2009
jmcnamara [...] cpan.org - Correspondence added
I've attached the program because the formatting was lost above.
John.
--
#!/usr/bin/perl -w

use strict;
use Spreadsheet::ParseExcel;

my $parser   = Spreadsheet::ParseExcel->new();
my $workbook = $parser->parse('4.0.xls');

if ( !defined $workbook ) {
    die $parser->error(), ".\n";
}

for my $worksheet ( $workbook->worksheets() ) {

    print "Worksheet name: ", $worksheet->get_name(), "\n\n";

    my ( $row_min, $row_max ) = $worksheet->row_range();
    my ( $col_min, $col_max ) = $worksheet->col_range();

    for my $row ( $row_min .. $row_max ) {
        for my $col ( $col_min .. $col_max ) {

            my $cell = $worksheet->get_cell( $row, $col );
            next unless $cell;

            print " Row, Col = ($row, $col)\n";
            print " Value = ", $cell->value(), "\n";
            print " Unformatted = ", $cell->unformatted(), "\n";
            print "\n";
        }
    }
}
Tue Dec 22 17:40:53 2009
josh.ritter [...] gmail.com - Correspondence added
That is the data I am referring to. I agree that your script does the
right thing. The more I look at this, the more I am unsure where the
bug actually lies. I am using Spreadsheet::Read (which uses
Spreadsheet::ParseExcel).
The problem is occurring because the workbook returned does not
contain all the numeric formats used in the file. Looks like there is
a hash called 'FormatStr' that maps a code to a format.
Spreadsheet::Read is trying to fetch that format after the file has
been parsed by accessing the FmtIdx in the
Spreadsheet::ParseExcel::Format. Looking at the cells you pasted from
your script, the FmtIdx is 2, but when I try to fetch that format
string from the workbook it returns undef, instead of '0.00'.
Spreadsheet::Read may be trying to fetch that improperly, but it looks
like the workbook returned from parse does not contain the definition
for that numeric format (although each cell does contain the correct
values and the correct FmtIdx within the format object).
If you feel like it is not a ParseExcel problem, I will be happy to
open a bug with Spreadsheet::Read. :)
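The lookup described above can be sketched without either module. The following is only an illustration, not the code of Spreadsheet::ParseExcel or Spreadsheet::Read: `format_string_for` and `%builtin_format` are hypothetical names, and `%format_str` is a hand-built stand-in for `$workbook->{FormatStr}`. It assumes the usual Excel convention that low format indices (such as 2 for '0.00') are built-in formats, which a file typically does not need to define explicitly, so a lookup that only consults the per-workbook table can come back undef.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A few of Excel's built-in number formats, keyed by format index.
# These are implied by the file format and are typically not stored in the file.
my %builtin_format = (
    0x00 => 'General',
    0x01 => '0',
    0x02 => '0.00',
    0x03 => '#,##0',
    0x04 => '#,##0.00',
    0x09 => '0%',
    0x0A => '0.00%',
);

# Hypothetical helper: prefer a format string defined in the workbook,
# then fall back to the built-in table. Returns undef if neither knows it.
sub format_string_for {
    my ( $format_str, $fmt_idx ) = @_;
    return $format_str->{$fmt_idx} if defined $format_str->{$fmt_idx};
    return $builtin_format{$fmt_idx};
}

# Stand-in for $workbook->{FormatStr}: only a custom format is present.
my %format_str = ( 164 => 'yyyy-mm-dd' );

for my $fmt_idx ( 2, 164, 200 ) {
    my $fmt = format_string_for( \%format_str, $fmt_idx );
    printf "FmtIdx %3d => %s\n", $fmt_idx, defined $fmt ? $fmt : '(undef)';
}
```

With the stand-in data, index 2 resolves through the built-in table even though the workbook-level hash has no entry for it, which is the behaviour a direct read of FormatStr alone would miss.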
Wed Dec 23 14:48:01 2009
jmcnamara [...] cpan.org - Correspondence added
On Tue Dec 22 17:40:53 2009, josh.ritter wrote:
> That is the data I am referring to. I agree that your script does the
> right thing. The more I look at this, the more I am unsure where the
> bug actually lies. I am using Spreadsheet::Read (which uses
> Spreadsheet::ParseExcel)
Hi Josh,
I don't think that this is a Spreadsheet::Read issue either. If I run the xlscat program that
comes with Spreadsheet::Read I see the required 0.00 output:
$ xlscat 4.0.xls
4.0|||||||||||||
Exported 12/16/2009 1:31 PM|||||||Category|Homework|Quizzes||||
Username|Last Name|First Name M.|Student Number|Section/Group|Status|Notes|Assignment|Homework 1|Quizzes 1|Quizzes 2|Quizzes3|Total Score|Class Grade
|||||||Grading scale|Custom|Points|4.0 scale|Percentage|Percentage|Text
|||||||Points possible||12345||||
fournier|FOURNIER|JANICE||group!|Active||||||98.12||
jlaney|LANEY|JAMES W||dog with socks|Active|||B|||98.24||
drickngo|NGO|DERRICK T||derrick|Active|||C|||98.11||
charlon|PALACAY|CHARLON||dog with socks, char|Active|||A|||20.00||
jiranida|PHUWANARTNURAK|JIRANIDA||ammy|Active|||C|||||
dcwalker|WALKER|DANIEL CURTIS||dog with socks|Active|||C|||||
scumby|WASHINGTON|WILLIAM A||group!, dog with socks|Active|||B|||||
|||||||||||||
|||||||Mean||#DIV/0!|#DIV/0!||#DIV/0!|
|||||||Median||#NUM!|#NUM!||#NUM!|
|||||||Mode||#N/A!|#N/A!||#N/A!|
|||||||Min||0.00|0.00||0.00|
|||||||Max||0.00|0.00||0.00|
|||||||Std. Dev.||#DIV/0!|#DIV/0!||#DIV/0!|
14 x 19
Can you provide a test program that demonstrates the issue?
John.
--
Mon Jan 04 13:20:30 2010
josh.ritter [...] gmail.com - Correspondence added
It looks like Spreadsheet::Read does contain both the unformatted and
formatted versions of the number, although the split seems arbitrary to me.
Each sheet hash has a cell entry, which does not always contain the
formatted cell values (although it sometimes does, depending on which
version of Excel the file was saved with). The traditional Excel labels
(e.g. A13) do contain the formatted version of each cell.
I will just use the traditional labels to extract the data I need.
Thanks for your help
Josh
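The workaround above can be illustrated with a minimal sketch. It assumes the layout that the Spreadsheet::Read documentation describes: each sheet hash carries a `cell` array indexed as `[column][row]` (both 1-based) and separate entries keyed by labels such as L6. The `%sheet` hash below is hand-built so the example runs without the module or the attached file, and `cell_label` is a hypothetical stand-in for the `cr2cell` function that Spreadsheet::Read documents for this conversion. The two values are taken from the output for column L earlier in this ticket.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: convert a 1-based (column, row) pair to an
# Excel-style label, e.g. (12, 6) becomes "L6" and (27, 1) becomes "AA1".
sub cell_label {
    my ( $col, $row ) = @_;
    my $letters = '';
    while ( $col > 0 ) {
        my $rem = ( $col - 1 ) % 26;
        $letters = chr( ord('A') + $rem ) . $letters;
        $col     = int( ( $col - 1 ) / 26 );
    }
    return $letters . $row;
}

# Hand-built stand-in for one sheet hash: the cell array holds the raw
# value and the label entry holds the formatted value.
my %sheet;
$sheet{cell}[12][6] = 98.1234;
$sheet{L6}          = '98.12';

my ( $col, $row ) = ( 12, 6 );
my $label = cell_label( $col, $row );

print "$label formatted   = $sheet{$label}\n";
print "$label unformatted = $sheet{cell}[$col][$row]\n";
```

Reading through the label keeps the caller independent of whether the `cell` array happens to hold formatted or raw values for a given file.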
Mon Jan 04 18:09:44 2010
jmcnamara [...] cpan.org - Correspondence added
Hi Josh,
In that case I will mark it as resolved.
If you have any other issues let me know.
John.
--
Mon Jan 04 18:09:45 2010
jmcnamara [...] cpan.org - Status changed from 'open' to 'resolved'