Bug #37833 for IO-Compress-Zlib: "Out of memory error" unzipping file in IO::Uncompress::Gunzip

Mon Jul 21 21:04:20 2008 Peter.Lamb [...] csiro.au - Ticket created

Subject:	"Out of memory error" unzipping file in IO::Uncompress::Gunzip
Date:	Tue, 22 Jul 2008 11:01:09 +1000
To:	<bug-IO-Compress-Zlib [...] rt.cpan.org>
From:	<Peter.Lamb [...] csiro.au>

Message body is not shown because sender requested not to inline it.

Download linux.bin.gz
application/x-gzip 2.8m

Message body not shown because it is not plain text.

Distribution: IO-Compress-Zlib-2.012 Perl version: This is perl, v5.10.0 built for cygwin-thread-multi-64int (with 6 registered patches, see perl -V for more detail) OS: CYGWIN_NT-5.1 PREGIO-BT 1.5.25(0.156/4/2) 2008-06-12 19:34 i686 Cygwin (running in Win XP 2002 SP 2) When I try to ungzip the attached gzip file with IO::Uncompress::Gunzip using the attached script, I get: Out of memory during "large" request for 536875008 bytes, total sbrk() is 269887488 bytes at /usr/lib/perl5/5.10/i686-cygwin/IO/Uncompress/Adapter/Inflate.pm line 60. I get the same error with similar related files. The attached gzip file uncompresses correctly with "gunzip -c" (gunzip version 1.3.12). The file as uncompressed with "gunzip -c" is 20681384 bytes. Any help would be appreciated. Regards, Peter Lamb

Tue Jul 22 05:02:12 2008 pmqs [...] cpan.org - Correspondence added

On Mon Jul 21 21:04:20 2008, Peter.Lamb@csiro.au wrote: Show quoted text

> Distribution: IO-Compress-Zlib-2.012 > > Perl version: This is perl, v5.10.0 built for cygwin-thread-multi-64int > (with 6 registered patches, see perl -V for more detail) > > OS: CYGWIN_NT-5.1 PREGIO-BT 1.5.25(0.156/4/2) 2008-06-12 19:34 i686 > Cygwin (running in Win XP 2002 SP 2) > > > > When I try to ungzip the attached gzip file with IO::Uncompress::Gunzip > using the attached script, I get: > > > > Out of memory during "large" request for 536875008 bytes, total sbrk() > is 269887488 bytes at > /usr/lib/perl5/5.10/i686-cygwin/IO/Uncompress/Adapter/Inflate.pm line > 60. > > > > I get the same error with similar related files. The attached gzip file > uncompresses correctly with "gunzip -c" (gunzip version 1.3.12). The > file as uncompressed with "gunzip -c" is 20681384 bytes. >

Hi Peter I tried your test script on a Linux box, a Solaris box and a Windows box. All worked fine. I don't have access to a Cygwin box. We need to see if the problem is with the uncompression or the way filehandles work on Cygwin. Can you try running this and report back on what you get please? #!/usr/bin/perl use strict; use warnings; use IO::Uncompress::Gunzip qw(gunzip $GunzipError) ; my $gz = new IO::Uncompress::Gunzip 'linux.bin.gz' or die "Cannot gunzip: $GunzipError\n"; my $buffer; my $out; while ($gz->read($buffer)) { $out += length $buffer; } print "Got $out buyes\n"; cheers Paul

Tue Jul 22 05:02:39 2008 The RT System itself - Status changed from 'new' to 'open'

Tue Jul 22 19:28:14 2008 Peter.Lamb [...] csiro.au - Correspondence added

Subject:	RE: [rt.cpan.org #37833] "Out of memory error" unzipping file in IO::Uncompress::Gunzip
Date:	Wed, 23 Jul 2008 09:27:02 +1000
To:	<bug-IO-Compress-Zlib [...] rt.cpan.org>
From:	<Peter.Lamb [...] csiro.au>

Hi Paul, thanks for getting back to me so quickly. I took the liberty of moving your print statement into the read loop in the code you posted, because otherwise all you see is the "Out of memory" message as with my original post; I also set autoflush on STDOUT. The script as I ran it and the output are attached. I had tried doing much the same myself to see if I could get a handle on the problem, but I got nowhere. The uncompressed file is about 20MB; the ungzip dies about 2.5MB into the uncompressed stream. ./gunzipbug1.pl linux.bin.gz > gunzipbug1.out 2>&1 If you want to try the problem yourself, you can run Cygwin on your Windows box (free download from http://www.cygwin.com/); it's a Unix compatibility shell for Windows, and I'm running it under Win XP 2002 SP2. I don't know why you got two copies of the bug report; I only have a record off a single email message to bug-IO-Compress-Zlib@rt.cpan.org, but I did get two automated responses with the two ticket numbers 37833 and 37834. Sorry for any inconvenience. Anyway, if there are more tests you'd like me to run, please let me know. Cheers, Peter Dr Peter Lamb Project Leader Information Engineering Laboratory CSIRO ICT Centre Innovative ICT transforming Australian industries Post: PO Box 664, Canberra, ACT 2601, Australia Office: Computer Science & Information Technology Building (Building 108) Australian National University, Acton, ACT 2601 T: +61 2 6216 7047, F: +61 2 6216 7111 www.ict.csiro.au Show quoted text

> -----Original Message----- > From: Paul Marquess via RT [mailto:bug-IO-Compress-Zlib@rt.cpan.org] > Sent: Tuesday, 22 July 2008 19:03 > To: Lamb, Peter (ICT Centre, Acton) > Subject: [rt.cpan.org #37833] "Out of memory error" unzipping file in > IO::Uncompress::Gunzip > > <URL: http://rt.cpan.org/Ticket/Display.html?id=37833 > > > On Mon Jul 21 21:04:20 2008, Peter.Lamb@csiro.au wrote:

> > Distribution: IO-Compress-Zlib-2.012 > > > > Perl version: This is perl, v5.10.0 built for

cygwin-thread-multi-64int Show quoted text

> > (with 6 registered patches, see perl -V for more detail) > > > > OS: CYGWIN_NT-5.1 PREGIO-BT 1.5.25(0.156/4/2) 2008-06-12 19:34 i686 > > Cygwin (running in Win XP 2002 SP 2) > > > > > > > > When I try to ungzip the attached gzip file with

IO::Uncompress::Gunzip Show quoted text

> > using the attached script, I get: > > > > > > > > Out of memory during "large" request for 536875008 bytes, total

sbrk() Show quoted text

> > is 269887488 bytes at > > /usr/lib/perl5/5.10/i686-cygwin/IO/Uncompress/Adapter/Inflate.pm

line Show quoted text

> > 60. > > > > > > > > I get the same error with similar related files. The attached gzip

file Show quoted text

> > uncompresses correctly with "gunzip -c" (gunzip version 1.3.12). The > > file as uncompressed with "gunzip -c" is 20681384 bytes. > >

> > Hi Peter > > I tried your test script on a Linux box, a Solaris box and a Windows > box. All worked fine. I don't have access to a Cygwin box. > > We need to see if the problem is with the uncompression or the way > filehandles work on Cygwin. Can you try running this and report back

on Show quoted text

> what you get please? > > #!/usr/bin/perl > > use strict; > use warnings; > > use IO::Uncompress::Gunzip qw(gunzip $GunzipError) ; > > my $gz = new IO::Uncompress::Gunzip 'linux.bin.gz' > or die "Cannot gunzip: $GunzipError\n"; > > > my $buffer; > my $out; > > while ($gz->read($buffer)) > { > $out += length $buffer; > } > > print "Got $out buyes\n"; > > cheers > Paul > > > >

Download gunzipbug1.out
application/octet-stream 1.3k

Message body not shown because it is not plain text.

Message body is not shown because sender requested not to inline it.

Thu Jul 24 04:53:39 2008 pmqs [...] cpan.org - Correspondence added

Hi Peter Show quoted text

> I took the liberty of moving your print statement into the read loop in > the code you posted, because otherwise all you see is the "Out of > memory" message as with my original post; I also set autoflush on > STDOUT. The script as I ran it and the output are attached. I had tried > doing much the same myself to see if I could get a handle on the > problem, but I got nowhere. The uncompressed file is about 20MB; the > ungzip dies about 2.5MB into the uncompressed stream. > > ./gunzipbug1.pl linux.bin.gz > gunzipbug1.out 2>&1 > > If you want to try the problem yourself, you can run Cygwin on your > Windows box (free download from http://www.cygwin.com/); it's a Unix > compatibility shell for Windows, and I'm running it under Win XP 2002 > SP2.

I've used cygwin before, and if I had enough space on my work laptop I'd install it. :-) Show quoted text

> I don't know why you got two copies of the bug report; I only have a > record off a single email message to bug-IO-Compress-Zlib@rt.cpan.org, > but I did get two automated responses with the two ticket numbers 37833 > and 37834. Sorry for any inconvenience.

No problem. Show quoted text

> Anyway, if there are more tests you'd like me to run, please let me > know.

The output you've sent me has given me enough info to spot the problem. ... Got 2554049 buyes Out of memory during "large" request for 536875008 bytes... If I look at the output from running a slight variant on the same script (I output the number of bytes uncompressed in each call to read) on a Linux box this is what I see ... Got 52941 bytes -> total 2554049 bytes Got 10499407 bytes -> total 13053456 bytes Notice the size of the uncompressed data in the call after the last successful call you got. It looks like cygwin is failing when it is trying to create a 10 meg output buffer. Not sure why cygwin reports a request for 500meg though. So here is where I think the problem lies - at the moment my code reads the compressed data in 4k chunks. It will carry out uncompression on that input buffer until it is exhausted, regardless of how much uncompressed data that will generate - the output buffer will be grown if needed. So basically the output buffer size is unbounded. The obvious fix for this is for me to make the output buffer size bounded. Unfortunately that will take a bit of work on my part. In the interim if you just want to uncompress a file you can use this system "gunzip -c $inputfile >$outputfile"; Or if you need to process the contents, this open F, "gunzip -c $inputfile|"; while (<F>) { # do something } Are those workarounds good enough for your purposes? Paul

Thu Jul 24 20:46:00 2008 Peter.Lamb [...] csiro.au - Correspondence added

Subject:	RE: [rt.cpan.org #37833] "Out of memory error" unzipping file in IO::Uncompress::Gunzip
Date:	Fri, 25 Jul 2008 10:45:24 +1000
To:	<bug-IO-Compress-Zlib [...] rt.cpan.org>
From:	<Peter.Lamb [...] csiro.au>

Hi Paul. Thanks for the analysis of the problem. I think I may be able to help you create files that will let you experiment with the problem more if you want. The linux.bin.gz file is the uClinux kernel for a Beyonwiz PVR, and it has the PVR's root file system embedded in it as a ROMFS. The "real" code that I'm running extracts and unpacks the ROMFS (and also lets me pack a new ROMFS into the space where the old one sat). The root ROMFS contains a 10MB file, /bank0, containing all null bytes. My understanding is that it's mmapped to provide an allocation arena for malloc. I think that it's the unpacking of this large contiguous chunk of null bytes that's forcing the large buffer size in Gunzip. I don't know why the malloc of that in perl is choking. I suspect that it may be possible to create a file that contains a large null-ed out block to trigger the same problem. Unfortunately, a file made just by compressing a single 10MB chunk of zeros unpacks just fine: dd if=/dev/zero bs=1k count=10240 | gzip -c > null.gz Unfortunately, your workaround won't do the job I need. I want the code to be able to run on a Windows machine that doesn't have g[un]zip. Fortunately, the workaround in the attached script does work. However, there doesn't seem to be any way to pass the read() buffer length through to Gunzip::gunzip(). The workaround also works in my "real" code. Thanks for your help. Peter Dr Peter Lamb Project Leader Information Engineering Laboratory CSIRO ICT Centre Innovative ICT transforming Australian industries Post: PO Box 664, Canberra, ACT 2601, Australia Office: Computer Science & Information Technology Building (Building 108) Australian National University, Acton, ACT 2601 T: +61 2 6216 7047, F: +61 2 6216 7111 www.ict.csiro.au Show quoted text

> -----Original Message----- > From: Paul Marquess via RT [mailto:bug-IO-Compress-Zlib@rt.cpan.org] > Sent: Thursday, 24 July 2008 18:54 > To: Lamb, Peter (ICT Centre, Acton) > Subject: [rt.cpan.org #37833] "Out of memory error" unzipping file in > IO::Uncompress::Gunzip > > <URL: http://rt.cpan.org/Ticket/Display.html?id=37833 > > > Hi Peter >

> > I took the liberty of moving your print statement into the read loop

in Show quoted text

> > the code you posted, because otherwise all you see is the "Out of > > memory" message as with my original post; I also set autoflush on > > STDOUT. The script as I ran it and the output are attached. I had

tried Show quoted text

> > doing much the same myself to see if I could get a handle on the > > problem, but I got nowhere. The uncompressed file is about 20MB; the > > ungzip dies about 2.5MB into the uncompressed stream. > > > > ./gunzipbug1.pl linux.bin.gz > gunzipbug1.out 2>&1 > > > > If you want to try the problem yourself, you can run Cygwin on your > > Windows box (free download from http://www.cygwin.com/); it's a Unix > > compatibility shell for Windows, and I'm running it under Win XP

2002 Show quoted text

> > SP2.

> > I've used cygwin before, and if I had enough space on my work laptop

I'd Show quoted text

> install it. :-) >

> > I don't know why you got two copies of the bug report; I only have a > > record off a single email message to

bug-IO-Compress-Zlib@rt.cpan.org, Show quoted text

> > but I did get two automated responses with the two ticket numbers

37833 Show quoted text

> > and 37834. Sorry for any inconvenience.

> > No problem. >

> > Anyway, if there are more tests you'd like me to run, please let me > > know.

> > The output you've sent me has given me enough info to spot the

problem. Show quoted text

> > ... > Got 2554049 buyes > Out of memory during "large" request for 536875008 bytes... > > If I look at the output from running a slight variant on the same

script Show quoted text

> (I output the number of bytes uncompressed in each call to read) on a > Linux box this is what I see > > ... > Got 52941 bytes -> total 2554049 bytes > Got 10499407 bytes -> total 13053456 bytes > > Notice the size of the uncompressed data in the call after the last > successful call you got. It looks like cygwin is failing when it is > trying to create a 10 meg output buffer. Not sure why cygwin reports a > request for 500meg though. > > So here is where I think the problem lies - at the moment my code

reads Show quoted text

> the compressed data in 4k chunks. It will carry out uncompression on > that input buffer until it is exhausted, regardless of how much > uncompressed data that will generate - the output buffer will be grown > if needed. So basically the output buffer size is unbounded. > > The obvious fix for this is for me to make the output buffer size > bounded. Unfortunately that will take a bit of work on my part. > > In the interim if you just want to uncompress a file you can use this > > system "gunzip -c $inputfile >$outputfile"; > > Or if you need to process the contents, this > > open F, "gunzip -c $inputfile|"; > while (<F>) > { > # do something > } > > Are those workarounds good enough for your purposes? > > Paul >

Message body is not shown because sender requested not to inline it.

Mon Apr 06 08:09:38 2009 pmqs [...] cpan.org - Correspondence added

Hi Peter, just in case you are still interested in this issue, I've uploaded a new version of IO::Uncompress::Gunzip (version 2.017) that fixes the problem. Note - IO::Uncompress::Gunzip now lives in the IO-Compress distribution. I'm going to close this issue, so if you do find a problem, please drop me a line or create a new issue. cheers Paul

Mon Apr 06 08:25:49 2009 pmqs [...] cpan.org - Status changed from 'open' to 'resolved'

Mon Apr 27 21:25:40 2009 Peter.Lamb [...] csiro.au - Correspondence added

Subject:	RE: [rt.cpan.org #37833] "Out of memory error" unzipping file in IO::Uncompress::Gunzip
Date:	Tue, 28 Apr 2009 11:25:17 +1000
To:	<bug-IO-Compress-Zlib [...] rt.cpan.org>
From:	<Peter.Lamb [...] csiro.au>

Thanks, Paul. I'll try updating my version of IO::Uncompress::Gunzip and see if I can remove my workarounds for this problem. I've also noticed that the Perl version of IO::Uncompress::Gzip I was using doesn't compress as well as the gzip program does for the same compression level settings. I've had to work around that by using "system 'gzip', ..." because the application is in compressing firmware for a flash device where space is tight, and I can't afford having the compressed file grow by more than a few kB. It currently is several tens of kB bigger (total compressed size about 8MB). Any ideas? Regards, Peter Dr Peter Lamb Project Leader Information Engineering Laboratory CSIRO ICT Centre Innovative ICT transforming Australian industries Post: PO Box 664, Canberra, ACT 2601, Australia Office: Computer Science & Information Technology Building (Building 108) Australian National University, Acton, ACT 2601 T: +61 2 6216 7047, F: +61 2 6216 7111 www.ict.csiro.au Show quoted text

-----Original Message----- From: Paul Marquess via RT [mailto:bug-IO-Compress-Zlib@rt.cpan.org] Sent: Monday, 6 April 2009 22:10 To: Lamb, Peter (ICT Centre, Acton) Subject: [rt.cpan.org #37833] "Out of memory error" unzipping file in IO::Uncompress::Gunzip <URL: https://rt.cpan.org/Ticket/Display.html?id=37833 > Hi Peter, just in case you are still interested in this issue, I've uploaded a new version of IO::Uncompress::Gunzip (version 2.017) that fixes the problem. Note - IO::Uncompress::Gunzip now lives in the IO-Compress distribution. I'm going to close this issue, so if you do find a problem, please drop me a line or create a new issue. cheers Paul

Mon Apr 27 21:25:42 2009 The RT System itself - Status changed from 'resolved' to 'open'

Tue Apr 28 03:12:35 2009 pmqs [...] cpan.org - Correspondence added

On Mon Apr 27 21:25:40 2009, Peter.Lamb@csiro.au wrote: Show quoted text

> Thanks, Paul. I'll try updating my version of IO::Uncompress::Gunzip > and see if I can remove my workarounds for this problem.

Thanks - if you have any issues with the changes, please get back to me. Show quoted text

> I've also noticed that the Perl version of IO::Uncompress::Gzip I was > using doesn't compress as well as the gzip program does for the > same compression level settings. I've had to work around that by > using "system 'gzip', ..." because the application is in > compressing firmware for a flash device where space is tight, and I > can't afford having the compressed file grow by more than a few kB. > It currently is several tens of kB bigger (total compressed size > about 8MB). > > Any ideas?

Have you set the "Level" option when creating the gzip object? my $gz = new IO::Uncompress::Gunzip 'linux.bin.gz', Level => Z_BEST_COMPRESSION or die "Cannot gunzip: $GunzipError\n";

Tue Apr 28 03:16:46 2009 pmqs [...] cpan.org - Correspondence added

Damn - accidentally pressed send before I had finished the reply. I assume you set the "Level" option when creating the gzip object? my $gz = new IO::Compress::Gzip 'linux.bin.gz', Level => Z_BEST_COMPRESSION or die "Cannot gunzip: $GzipError\n"; Paul

Tue Apr 28 03:24:56 2009 Peter.Lamb [...] csiro.au - Correspondence added

Subject:	RE: [rt.cpan.org #37833] "Out of memory error" unzipping file in IO::Uncompress::Gunzip
Date:	Tue, 28 Apr 2009 17:17:29 +1000
To:	<bug-IO-Compress-Zlib [...] rt.cpan.org>
From:	<Peter.Lamb [...] csiro.au>

Hi, Paul. Thanks for getting back. Yes, I have both tried leaving the level unset, and compared it with gzip with the level unset, and tried setting it to the maximum (I used the numeric 9 rather than the symbolic value) and compared it to gzip with the level set. It's been a while since I tried testing this. I'll give it another go and give you some proper examples. I assume you meant IO::Uncompress::Gzip rather than IO::Uncompress::Gunzip in your example. Peter Lamb Show quoted text

-----Original Message----- From: Paul Marquess via RT [mailto:bug-IO-Compress-Zlib@rt.cpan.org] Sent: Tuesday, 28 April 2009 17:13 To: Lamb, Peter (ICT Centre, Acton) Subject: [rt.cpan.org #37833] "Out of memory error" unzipping file in IO::Uncompress::Gunzip <URL: https://rt.cpan.org/Ticket/Display.html?id=37833 > On Mon Apr 27 21:25:40 2009, Peter.Lamb@csiro.au wrote:

> Thanks, Paul. I'll try updating my version of IO::Uncompress::Gunzip > and see if I can remove my workarounds for this problem.

Thanks - if you have any issues with the changes, please get back to me.

> I've also noticed that the Perl version of IO::Uncompress::Gzip I was > using doesn't compress as well as the gzip program does for the > same compression level settings. I've had to work around that by > using "system 'gzip', ..." because the application is in > compressing firmware for a flash device where space is tight, and I > can't afford having the compressed file grow by more than a few kB. > It currently is several tens of kB bigger (total compressed size > about 8MB). > > Any ideas?

Have you set the "Level" option when creating the gzip object? my $gz = new IO::Uncompress::Gunzip 'linux.bin.gz', Level => Z_BEST_COMPRESSION or die "Cannot gunzip: $GunzipError\n";

Tue Apr 28 03:49:42 2009 Peter.Lamb [...] csiro.au - Correspondence added

Subject:	RE: [rt.cpan.org #37833] "Out of memory error" unzipping file in IO::Uncompress::Gunzip
Date:	Tue, 28 Apr 2009 17:49:21 +1000
To:	<bug-IO-Compress-Zlib [...] rt.cpan.org>
From:	<Peter.Lamb [...] csiro.au>

Show quoted text

> I assume you set the "Level" option when creating the gzip object?

:oops: Yes. I'll send you the information about the compression differences I see from my home email (peter_lamb2001@yahoo.com.au) rather than this one, my work email. Thanks, Peter Show quoted text

-----Original Message----- From: Paul Marquess via RT [mailto:bug-IO-Compress-Zlib@rt.cpan.org] Sent: Tuesday, 28 April 2009 17:17 To: Lamb, Peter (ICT Centre, Acton) Subject: [rt.cpan.org #37833] "Out of memory error" unzipping file in IO::Uncompress::Gunzip <URL: https://rt.cpan.org/Ticket/Display.html?id=37833 > Damn - accidentally pressed send before I had finished the reply. I assume you set the "Level" option when creating the gzip object? my $gz = new IO::Compress::Gzip 'linux.bin.gz', Level => Z_BEST_COMPRESSION or die "Cannot gunzip: $GzipError\n"; Paul

Wed Apr 29 17:12:23 2009 pmqs [...] cpan.org - Correspondence added

Hi Peter, the compression size difference between my module (which uses the zlib library) and the command-line gzip program is going to be down to the different implementations of the deflate algorithm. The gzip program was written well before the zlib library, and although the same set of people were involved in the both gzip & zlib, the implementations are not identical. Paul

Sun Jul 05 12:36:21 2009 pmqs [...] cpan.org - Status changed from 'open' to 'resolved'