
This queue is for tickets about the IO-Compress CPAN distribution.

Report information
The Basics
Id: 119184
Status: resolved
Priority: 0/
Queue: IO-Compress

People
Owner: Nobody in particular
Requestors: bottomsc [...] missouri.edu
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



CC: "Givan, Scott A." <givans [...] missouri.edu>, "Spollen, William G." <spollenw [...] missouri.edu>
Subject: Failure to read second "original" gzipped file inside a "concatenated" gzipped file
Date: Thu, 8 Dec 2016 18:08:21 +0000
To: "bug-IO-Compress [...] rt.cpan.org" <bug-IO-Compress [...] rt.cpan.org>
From: "Bottoms, Christopher A" <BottomsC [...] missouri.edu>
(This is mostly copied from my post on Stack Overflow <http://stackoverflow.com/questions/41045834>.)

In bash, you can concatenate gzipped files and the result is a valid gzipped file. As far as I recall, I have always been able to treat these "concatenated" gzipped files as normal gzipped files:

    echo 'Hello world!' > hello.txt
    echo 'Howdy world!' > howdy.txt
    gzip hello.txt
    gzip howdy.txt
    cat hello.txt.gz howdy.txt.gz > greetings.txt.gz
    gunzip greetings.txt.gz
    cat greetings.txt

Which outputs

    Hello world!
    Howdy world!

However, when trying to read this same file using Perl's core IO::Uncompress::Gunzip module <https://metacpan.org/pod/IO::Uncompress::Gunzip>, it doesn't get past the first original file. Here is the result:

    ./my_zcat greetings.txt.gz
    Hello world!

Here is the code for my_zcat:

    #!/bin/env perl
    use strict;
    use warnings;
    use v5.10;
    use IO::Uncompress::Gunzip qw($GunzipError);

    my $file_name = shift;
    my $fh = IO::Uncompress::Gunzip->new($file_name) or die $GunzipError;

    while (defined(my $line = readline $fh)) {
        print $line;
    }

If I totally decompress the files before creating a new gzipped file, I don't have this problem:

    zcat hello.txt.gz howdy.txt.gz | gzip > greetings_via_zcat.txt.gz
    ./my_zcat greetings_via_zcat.txt.gz
    Hello world!
    Howdy world!

So, what is the difference between greetings.txt.gz and greetings_via_zcat.txt.gz, and why does IO::Uncompress::Gunzip work correctly with greetings_via_zcat.txt.gz but not with greetings.txt.gz?

I'm guessing that IO::Uncompress::Gunzip messes up because of the metadata between the files. But, since greetings.txt.gz is a valid Gzip file, I would expect IO::Uncompress::Gunzip to work.

My workaround for now will be piping from zcat (which of course doesn't help Windows users much):

    #!/bin/env perl
    use strict;
    use warnings;
    use v5.10;

    my $file_name = shift;
    open(my $fh, '-|', "zcat $file_name");

    while (defined(my $line = readline $fh)) {
        print $line;
    }
As I already mentioned on Stack Overflow, this is covered in the IO::Compress FAQ, in the section "Dealing with concatenated gzip files". Marking this as resolved.
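
For readers who land on this ticket, here is a minimal sketch of what that FAQ section points to: the MultiStream option of IO::Uncompress::Gunzip. By default the module stops at the end of the first gzip member. With MultiStream => 1 it should carry on into any members that follow. The helper name read_all_members is hypothetical and used only for illustration; it is not part of the module.

```perl
use strict;
use warnings;
use IO::Uncompress::Gunzip qw($GunzipError);

# Hypothetical helper: return every line of a gzip file, including lines
# from extra gzip members appended with `cat a.gz b.gz > c.gz`.
sub read_all_members {
    my ($file_name) = @_;

    # MultiStream => 1 asks the module to continue into the next gzip
    # member instead of reporting EOF at the end of the first one.
    my $fh = IO::Uncompress::Gunzip->new($file_name, MultiStream => 1)
        or die "Cannot open $file_name: $GunzipError\n";

    my @lines;
    while (defined(my $line = $fh->getline)) {
        push @lines, $line;
    }
    $fh->close;
    return @lines;
}

# Behave like the my_zcat script when a file name is given on the command line.
print read_all_members($ARGV[0]) if @ARGV;
```

Saved as my_zcat and run against the greetings.txt.gz file from the report, this version is expected to print both greetings. Unlike the zcat pipe workaround, it does not depend on an external program.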