Skip Menu |

Preferred bug tracker

Please visit the preferred bug tracker to report your issue.

This queue is for tickets about the Spreadsheet-ParseExcel CPAN distribution.

Maintainer(s)' notes

If you are reporting a bug in Spreadsheet::ParseExcel here are some pointers

1) State the issues as clearly and as concisely as possible. A simple program or Excel test file (see below) will often explain the issue better than a lot of text.

2) Provide information on your system, version of perl and module versions. The following program will generate everything that is required. Put this information in your bug report.

    #!/usr/bin/perl -w

    print "\n    Perl version   : $]";
    print "\n    OS name        : $^O";
    print "\n    Module versions: (not all are required)\n";

    my @modules = qw(
                      Spreadsheet::ParseExcel
                      Scalar::Util
                      Unicode::Map
                      Spreadsheet::WriteExcel
                      Parse::RecDescent
                      File::Temp
                      OLE::Storage_Lite
                      IO::Stringy
                    );

    for my $module (@modules) {
        my $version;
        eval "require $module";

        if (not $@) {
            $version = $module->VERSION;
            $version = '(unknown)' if not defined $version;
        }
        else {
            $version = '(not installed)';
        }

        printf "%21s%-24s\t%s\n", "", $module, $version;
    }

    __END__

3) Upgrade to the latest version of Spreadsheet::ParseExcel (or at least test on a system with an upgraded version). The issue you are reporting may already have been fixed.

4) Create a small example program that demonstrates your problem. The program should be as small as possible. A few lines of codes are worth tens of lines of text when trying to describe a bug.

5) Supply an Excel file that demonstrates the problem. This is very important. If the file is big, or contains confidential information, try to reduce it down to the smallest Excel file that represents the issue. If you don't wish to post a file here then send it to me directly: jmcnamara@cpan.org

6) Say if the test file was created by Excel, OpenOffice, Gnumeric or something else. Say which version of that application you used.

7) If you are submitting a patch you should check with the maintainer whether the issue has already been patched or if a fix is in the works. Patches should be accompanied by test cases.

Asking a question

If you would like to ask a more general question there is the Spreadsheet::ParseExcel Google Group.

Report information
The Basics
Id: 30677
Status: resolved
Priority: 0/
Queue: Spreadsheet-ParseExcel

People
Owner: Nobody in particular
Requestors: cpan [...] pjedwards.co.uk
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: 0.33
Fixed in: (no value)



Subject: 0xEF PHONETIC in BIFF8
Thanks for working on Spreadsheet::ParseExcel 0xEF is PHONETIC in BIFF8 See: http://sc.openoffice.org/excelfileformat.pdf I can't find when it may have been an EOF, but in sub Parse in ParseExcel.pm this code: #Check EF, EOF if($bOp == 0xEF) { #EF $iEfFlg = $bOp; exists, and it's taken me a while to get here, but it's the reason a file I have from a vendor (that I can't attach here, but could email) does not parse. I am able to open the file correctly in MS Excel, OpenOffice and Gnumeric. My fix is to remove the check for 0xEF. Cheers, Peter (Stig) Edwards
On Mon Nov 12 13:55:23 2007, cpan@pjedwards.co.uk wrote: Show quoted text
> Thanks for working on Spreadsheet::ParseExcel > > 0xEF is PHONETIC in BIFF8 > See: http://sc.openoffice.org/excelfileformat.pdf > I can't find when it may have been an EOF, but in sub Parse in > ParseExcel.pm this code: > > #Check EF, EOF > if($bOp == 0xEF) { #EF > $iEfFlg = $bOp; > > exists, and it's taken me a while to get here, but it's the reason a > file I have from a vendor (that I can't attach here, but could email) > does not parse. I am able to open the file correctly in MS Excel, > OpenOffice and Gnumeric. My fix is to remove the check for 0xEF.
Hi Peter, I am the new maintainer of Spreadsheet::WriteExcel. I know that it has been a while since you reported this but if you could send me the/an Excel file that demonstrates this problem I'll have a look at it. John. --
On Mon Jan 12 08:15:24 2009, JMCNAMARA wrote: Show quoted text
> On Mon Nov 12 13:55:23 2007, cpan@pjedwards.co.uk wrote:
> > Thanks for working on Spreadsheet::ParseExcel > > > > 0xEF is PHONETIC in BIFF8 > > See: http://sc.openoffice.org/excelfileformat.pdf > > I can't find when it may have been an EOF, but in sub Parse in > > ParseExcel.pm this code: > > > > #Check EF, EOF > > if($bOp == 0xEF) { #EF > > $iEfFlg = $bOp; > > > > exists, and it's taken me a while to get here, but it's the reason a > > file I have from a vendor (that I can't attach here, but could email) > > does not parse. I am able to open the file correctly in MS Excel, > > OpenOffice and Gnumeric. My fix is to remove the check for 0xEF.
> > Hi Peter, > > I am the new maintainer of Spreadsheet::WriteExcel.
Or Spreadsheet::rather ParseExcel. Sorry, muscle memory. :-) John.
I managed to find the file, I'll email it to you. I've attached to this update a version of xls2csv.pl used to demonstrate the problem, call it thus: perl xls2csv.pl rt_30677.xls 3-A1:H10 Compare this to the contents of sheet 3 if opened in MS Excel, OpenOffice, Gnumeric. Also with the check for 0xEF commented out in ParseExcel.pm make perl -Mblib xls2csv.pl rt_30677.xls 3-A1:H10 Thanks.
# script to extract a range of colums/rows from an excel spreadsheet # and present it as a csv, there is also the option to rotate the # output, # # (c) kevin Mulholland 2002, kevin@moodfarm.demon.co.uk # this code is released under the Perl Artistic License use strict ; use Spreadsheet::ParseExcel; use Spreadsheet::ParseExcel::FmtUnicode; use Spreadsheet::ParseExcel::Utility qw(xls2csv); if(!(defined $ARGV[0])) { usage( "Bad Args") ; exit; } my $rotate = 0 ; my $filename = $ARGV[0] ; my $coords = $ARGV[1] ; $rotate = 1 if( defined $ARGV[2] && $ARGV[2] eq "-rotate") ; my $codepage = $ARGV[3]; if( !$coords) { usage( "No co-ordinates defined") ; exit ; } if( ! -f $filename) { usage( "File $filename does not exist") ; exit ; } printf xls2csv( $filename, $coords, $rotate, $codepage) ; # ----------------------------------------------------------------------------- # error # writes a message to STDERR # sub error { printf STDERR shift ; } # ----------------------------------------------------------------------------- # usage # this decribes how the program as a whole is to be used # sub usage { my $errmsg = shift ; error( "\n" . $errmsg . "\n") if( $errmsg) ; error( " Usage: $0 filename sheet-colrow:colrow [-rotate] eg: $0 filename.xls 1-A1:B12 $0 filename.xls A1:M1 -rotate\n\n") ; }
Just looking at my local edits to Spreadsheet-ParseExcel- 0.32/lib/Spreadsheet/ParseExcel.pm I also added some entries to %ProcTbl and corresponding subroutines when working on this. (The 3 #s are because at the time I was using Smart::Comments) Here they are, they might be helpful, sorry I've not looked at any more recent versions than 0.32. 0x00 => \&_subDimensions, # Dimensions 0x01 => \&_subBlank, # Blank 0x02 => \&_subInteger, # Integer 0x03 => \&_subNumber, # Number 0x04 => \&_subLabel, # Label 0x05 => \&_subBoolErr, # Bool Err 0x06 => \&_subFormula, # Formula 0x09 => \&_subBOF, # BOF(BIFF2) 0x0A => \&_subEOF, # EOF 0x0B => \&_subIndex, # Index 0x0C => \&_subCalcCount, # Calc Count 0x0D => \&_subCalcMode, # Calc Mode 0x0E => \&_subPrecision, # Precision 0x0F => \&_subRefMode, # Ref Mode 0x10 => \&_subDelta, # Delta 0x11 => \&_subIteration, # Iteration 0x12 => \&_subProtect, # Protect 0x13 => \&_subPassword, # Password 0x16 => \&_subExternCount, # Extern Count 0x17 => \&_subExternSheet, # Extern Sheet 0x19 => \&_subWindowProtect, # Window Protect 0x1C => \&_subNote, # Note 0x1D => \&_subSelection, # Selection 0x1E => \&_subFormat, # Format 0x1F => \&_subBuiltInFmtCount, # Built in Format Count 0x20 => \&_subColumnDefualt, # Column Default 0x21 => \&_subArray, # Array 0x23 => \&_subExternName, # Extern Name 0x24 => \&_subColWidth, # Col Width 0x25 => \&_subDefaultRowHeight, # Default Row Height 0x2F => \&_subFilePass, # File Pass 0x31 => \&_subFont, # Font 0x32 => \&_subFont2, # Font2 0x36 => \&_subTableOp, # TableOp 0x37 => \&_subTableOp2, # TableOp2 0x3D => \&_subWindow1, # Window1 0x3E => \&_subWindow2, # Window2 0x40 => \&_subBackup, # Backup 0x41 => \&_subPane, # Pane 0x42 => \&_subCodepage, # Codepage 0x44 => \&_subIXFE, # IXFE 0x45 => \&_subEFont, # EFont 0x51 => \&_subDCONREF, # DCONREF 0x56 => \&_subBuiltInFmtCount, # Built in Fmt Count 0x59 => \&_subXCT, # XCT 0x5A => \&_subCRN, # CRN 0x5B => \&_subFileSharing, # File Sharing 0x5E => \&_subUnCalced, # UnCalced 0x5F => \&_subSaverECalc, # Saver ECalc 0x63 => \&_subOjectProtect, # Object Protect 0x80 => \&_subGUTS, # GUTS 0x82 => \&_subGridSet, # Grid Set 0x86 => \&_subWriteProt, # Write Prot 0x8C => \&_subCountry, # Country 0x8D => \&_subHideObj, # Hide Object 0x90 => \&_subSort, # Sort 0xA0 => \&_subSCL, # SCL 0xAB => \&_subGCW, # GCW 0xD7 => \&_subDBCell, # DBCell 0xDA => \&_subBookBool, # BookBool 0xDD => \&_subScenProtect, # Scen Protect 0xE9 => \&_subBitMap, # Bit Map 0xEF => \&_subPhonetic, # Phonetic 0xFF => \&_subEXTSST, # EXTSST 0x15F => \&_subLabelRanges, # Label Ranges 0x160 => \&_subUseSelfs, # Use Selfs 0x161 => \&_subDSF, # DSF 0x1AE => \&_subSupBook, # Sup Book 0x1B0 => \&_subCondFmt, # Cond Fmt 0x1B2 => \&_subDVAL, # DVAL 0x1B8 => \&_subHLINK, # HLINK 0x1BE => \&_subDV, # DV 0x200 => \&_subDimensions, # Dimensions 0x206 => \&_subFormula, # Formula 0x209 => \&_subBOF, # BOF(BIFF3) 0x20B => \&_subIndex, # Index 0x218 => \&_subName, # Name 0x223 => \&_subExternName, # Extern Name 0x231 => \&_subFont, # Font 0x236 => \&_subTableOp, # TableOp 0x23E => \&_subWindow2, # Window2 0x243 => \&_subXF, # XF 0x293 => \&_subStyle, # Style 0x409 => \&_subBOF, # BOF(BIFF4) 0x41E => \&_subFormat, # Format 0x443 => \&_subXF, # XF 0x4BC => \&_subShrFmla, # Shared Formula 0x800 => \&_subQuickTip, # Quick Tip 0x862 => \&_subSheetLayout, # Sheet Layout 0x867 => \&_subSheetProtection, # Sheet Protection 0x868 => \&_subRangeProtection, # Range Protection sub _subCodepage { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subCodepage() } sub _subDSF { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subDSF() } sub _subWindowProtect { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subWindowProtect() } sub _subWindow1 { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subWindow1() } sub _subBackup { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subBackup() } sub _subHideObj { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subHideObj() } sub _subBookBool { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subBookBool() } sub _subStyle { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subStyle() } sub _subUseSelfs { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subUseSelfs() } sub _subCountry { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subCountry() } sub _subSupBook { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subSupBook() } sub _subExternSheet { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subExternSheet() } sub _subEXTSST { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subEXTSST() } sub _subIndex { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subIndex() } sub _subSaverECalc { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subSaverECalcb() } sub _subGridSet { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subGridSet() } sub _subGUTS { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subGUTS() } sub _subProtect { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subProtect() } sub _subPassword { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subPassword() } sub _subPrecision { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subPrecision() } sub _subEOF { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subEOF() } sub _subCalcMode { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subCalcMode() } sub _subCalcCount { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subCalcCount() } sub _subRefMode { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subRefMode() } sub _subIteration { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subIteration() } sub _subDelta { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subDelta() } sub _subDimensions { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subDimensions() } sub _subDBCell { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subDBCell() } sub _subWindow2 { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subWindow2() } sub _subSCL { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subSCL() } sub _subSelection { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subSelection() } sub _subSort { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subSort) } sub _subPane { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subPane() } sub _subGCW { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subGCW() } sub _subBitMap { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subBitMap() } sub _subCRN { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subCRN() } sub _subColWidth { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subColWidth() } sub _subColumnDefault { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subColumnDefault() } sub _subCondFmt { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subCondFmt() } sub _subCONREF { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subCONREF() } sub _subDV { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subDV() } sub _subDVAL { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subDVAL() } sub _subEFont { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subEFont() } sub _subExternCount { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subExternCount() } sub _subFilePass { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subFilePass() } sub _subFileSharing { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subFileSharing() } sub _subFont2 { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subFont2() } sub _subHLINK { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subHLINK() } sub _subIXFE { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subIXFE() } sub _subLabelRanges { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subLabelRanges() } sub _subNote { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subNote() } sub _subObjectProtect { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subObjectProtect() } sub _subPhonetic { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subPhonetic() } sub _subQuickTip { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subQuickTip() } sub _subRangeProtection { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subRangeProtection() } sub _subSaverECalcb { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subSaverECalcb() } sub _subScenProtect { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subScenProtect() } sub _subSheetLayout { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subSheetLayout() } sub _subSheetProtection { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subSheetProtection() } sub _subShrFmla { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subShrFmla() } sub _subTableOp2 { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subTableOp2() } sub _subUnCalced { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subUnCalced() } sub _subWriteProt { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subWriteProt() } sub _subXCT { my($oBook, $bOp, $bLen, $sWk) = @_; ### _subXCT() }
Hi Peter, Thanks for that. I tested the file that you sent against the latest version of Spreadsheet::ParseExcel and it seemed to parse correctly. Can you test it and let me know. #!/usr/bin/perl use strict; use warnings; use Spreadsheet::ParseExcel 0.44; my $parser = Spreadsheet::ParseExcel->new(); my $workbook = $parser->Parse('rt_30677.xls'); for my $worksheet ( $workbook->worksheets() ) { print "Sheet = ", $worksheet->{Name}, "\n"; my ( $row_min, $row_max ) = $worksheet->row_range(); my ( $col_min, $col_max ) = $worksheet->col_range(); for my $row ( $row_min .. $row_max ) { for my $col ( $col_min .. $col_max ) { my $cell = $worksheet->get_cell( $row, $col ); next unless $cell; print " Row, Col = ($row, $col), ", $cell->value(), "\n"; print " Value = ", $cell->value(), "\n"; print " Unformatted = ", $cell->unformatted(), "\n"; print "\n"; } } } __END__ John. --
Sorry for the lack of detail. Slightly modified to narrow down output: cat rt_30677.pl #!/usr/bin/perl use strict; use warnings; use Spreadsheet::ParseExcel 0.44; my $parser = Spreadsheet::ParseExcel->new(); my $workbook = $parser->Parse('rt_30677.xls'); for my $worksheet ( $workbook->worksheets() ) { next if (2 != $worksheet); print "Sheet = ", $worksheet->{Name}, "\n"; my ( $row_min, $row_max ) = ( 0, 8 ); my ( $col_min, $col_max ) = ( 0, 5 ); for my $row ( $row_min .. $row_max ) { for my $col ( $col_min .. $col_max ) { my $cell = $worksheet->get_cell( $row, $col ); next unless $cell; print " Row, Col = ($row, $col), ", $cell->value(),"\n"; print " Value = ", $cell->value(), "\n"; print " Unformatted = ", $cell->unformatted(), "\n"; print "\n"; } } } __END__ Using original version of 0.44, the last data from the script above is row 3, col 0. Comment out line 230 of lib/Spreadsheet/ParseExcel.pm #Check EF, EOF if ( $bOp == 0xEF ) { #EF #$iEfFlg = $bOp; # <- line 230 } make make test perl -Mblib rt_30677.pl Now I see data for: Row, Col = (4, 2) Row, Col = (6, 5) Row, Col = (7, 1) Row, Col = (7, 2) Row, Col = (7, 3) Row, Col = (7, 4) Row, Col = (7, 5) Row, Col = (8, 1) Row, Col = (8, 2) Row, Col = (8, 3) Row, Col = (8, 4) Row, Col = (8, 5) Using MS Excel, OpenOffice and Gnumeric all also displayed this data for sheet 2 (0 based counting, sheet 3 if starting with 1). Thanks again.
On Mon Jan 12 11:58:20 2009, cpan@pjedwards.co.uk wrote: Show quoted text
> Sorry for the lack of detail.
Hi Peter, That's fine. I'd initially checked the last worksheet of workbook, which (sod's law) doesn't exhibit the problem so I'd missed it. :-( Anyway, thanks for that. You are also correct that the problem is with the 0xEF handling. I'll fix it in the next release. One last question. Do you know if the file was initially produced by Excel, OpenOffice.org, Gnumeric, programiatically or otherwise? John. --
On Mon Jan 12 18:30:10 2009, JMCNAMARA wrote: Show quoted text
> Do you know if the file was initially produced by > Excel, OpenOffice.org, > Gnumeric, programiatically or otherwise?
I don't know, but when I look at File->Properties in Microsoft Office Excel 2003 I see some info in the General, Summary and Statistics tabs, and when I view the raw binary file I do see "Microsoft Excel", in additon to "Microsoft Photo Editor 3.0 Picture", "MSPhotoEditor" and "VBAProj", I also see 2 empty macros when using OpenOffice, these are listed as Microsoft Excel Objects in Excel 2003. I'm guessing the file is produced programmatically, it is produced at least once a day, possibly using VBA.
Hi, This issue has been fixed in Spreadsheet::ParseExcel version 0.45. I think the original author may have been trying to avoid certain worksheet records that fell between PHONETICINFO record and the worksheet EOF. In older versions of Excel these 2 records were close together. Or else he was trying to avoid some snafu between PHONETICINFO and Unicode formula STRING records. Either way it was badly wrong for the more general case. Thanks for your help, John. --